Bug 138503 - Missing bold formatting of non-Latin Unicode text (MS bug?)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-26 06:42 UTC by Johannes Wülk
Modified: 2021-08-20 13:39 UTC
5 users

See Also:
Crash report or crash signature:
Regression By:


Attachments
Sample text, first two lines should show in bold. (22.50 KB, application/msword)
2020-11-26 06:49 UTC, Johannes Wülk
Comparison LibreOffice 7.2 master and MSO 2010 (108.94 KB, image/png)
2020-12-11 18:59 UTC, Xisco Faulí
Problem persists, also on Windows (14.29 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-12-13 16:27 UTC, Johannes Wülk

Description Johannes Wülk 2020-11-26 06:42:35 UTC
Description:
LibreOffice Writer shows wrong formatting for non-Latin Unicode characters (Unicode range U+1200–U+137F, the Ethiopic block, used for languages like Tigrinya or Amharic). For example, bold text is shown as regular/non-bold text. This problem doesn't seem to depend on a specific file type, as I tested it on doc, docx and odt files.
Other office suites like Google Docs, MSO or FreeOffice show the formatting correctly. As I like to use LibreOffice as my main and only office suite, I hope this problem can be easily fixed. Thank you in advance.

Actual Results:
Bold text is shown regular/non-bold.

Expected Results:
Correct Formatting overall.


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 7.0.3.1
Build ID: 00(Build:1)
CPU threads: 4; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: de-DE (de_DE.UTF-8); UI: en-US
7.0.3-2
Calc: threaded
Comment 1 Johannes Wülk 2020-11-26 06:49:13 UTC
Created attachment 167582
Sample text, first two lines should show in bold.
Comment 2 Xisco Faulí 2020-12-11 18:59:22 UTC
Created attachment 168075
Comparison LibreOffice 7.2 master and MSO 2010

At least in MSO 2010, the first line is bold but not the second; in LibreOffice neither is bold.
Comment 3 Xisco Faulí 2020-12-11 18:59:41 UTC
Reproduced in

Version: 7.2.0.0.alpha0+
Build ID: 84af20ef3ea72190784e9e7be820684c2558ba8c
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 4 Xisco Faulí 2020-12-11 19:01:03 UTC
Also reproduced in

Version: 4.3.0.0.alpha1+
Build ID: c15927f20d4727c3b8de68497b6949e72f9e6e9e

Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)
Comment 5 Xisco Faulí 2020-12-11 19:01:35 UTC
@Justin, I thought you might be interested in this issue
Comment 6 Justin L 2020-12-11 19:45:22 UTC
Can you provide a sample in DOCX format? Was this document created in LO?

(I could probably just convert your DOC to DOCX, but I'd prefer to have a copy from you that you state is bad.) [DOCX lets me more easily inspect whether the bold attribute really is specified.]
Also, can we get a confirmation that this is or is not a Linux-only problem?

I can confirm that Word 2003 indicates that the first paragraph of the DOC file is bold. When I round-trip it (in Word 2003) to DOCX, then the second line becomes bold, including in LibreOffice. If I set it as bold in LibreOffice and round-trip it, the bold stays.
Comment 7 Ming Hua 2020-12-11 20:19:13 UTC
(In reply to Justin L from comment #6)
> Also, can we get a confirmation that this is or is not a Linux only problem?
FWIW, I can reproduce the "first line not shown as bold" on Windows:
Version: 6.4.7.2 (x64)
Build ID: 639b8ac485750d5696d7590a72ef1b496725cfb5
CPU threads: 2; OS: Windows 10.0 Build 18363; UI render: default; VCL: win; 
Locale: zh-CN (zh_CN); UI-Language: en-US
Calc: threaded

...not sure if that answers your question, though.

Let me know if more round-trip testing is needed in LO. However, I don't have MS Office.
Comment 8 Johannes Wülk 2020-12-13 16:27:29 UTC
Created attachment 168130
Problem persists, also on Windows

@Justin L: I attached a sample docx file, Doc1_createdbyMSO.docx, which was created in MS Office (regular Word document, not Open XML). This is not only a Linux problem: I tested it on Windows 7 Ultimate with LibreOffice Portable, and the first two lines do NOT show as bold there either. In MS Office they show as bold, and also in other office suites, as mentioned before. When saved in LO, it's fine and stays bold after reopening. This is a crucial problem for me, as I use this professionally. Maybe there are some issues when reading the docx file in LO Writer? Thank you for your time.
Comment 9 Justin L 2020-12-14 07:33:38 UTC
OK - this is a bit interesting.
The paragraph does have bold specified:
<w:rPr>
  <w:rFonts w:ascii="Nyala" w:eastAsia="Nyala" w:hAnsi="Nyala" w:cs="Nyala"/>
  <w:b/>
</w:rPr>
<w:t>እዚ</w:t>

And bold IS being set on the character properties for the Western and Asian fonts.
But the CTL font is not being set to bold, because there is a specific w:bCs for that, and that is not set anywhere.

17.3.2.1 b (Bold)
This element specifies whether the bold property shall be applied to all non-complex script characters in the contents of this run when displayed in a document.

So perhaps MS Word doesn't consider these Unicode characters to be a CTL language? (It seems to me like it should be considered complex, since even the default style sets Bottom to Top - Left to Right: <w:textDirection w:val="btLr"/>.)
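
The observation in comment 9 (w:b present, w:bCs absent) can be checked mechanically. The following is a minimal sketch in Python using only the standard library; the helper names toggle_is_on and bold_flags are hypothetical, the run XML is the fragment quoted in comment 9 wrapped in a w:r element, and the check only looks at direct run formatting, not at anything inherited from styles.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by DOCX files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# The run properties quoted in comment 9, wrapped in a w:r for parsing.
RUN_XML = f"""
<w:r xmlns:w="{W_NS}">
  <w:rPr>
    <w:rFonts w:ascii="Nyala" w:eastAsia="Nyala" w:hAnsi="Nyala" w:cs="Nyala"/>
    <w:b/>
  </w:rPr>
  <w:t>እዚ</w:t>
</w:r>
""".strip()


def toggle_is_on(rpr, local_name):
    """Return True if the toggle element (e.g. b, bCs) is set directly and not switched off."""
    el = rpr.find(f"{{{W_NS}}}{local_name}")
    if el is None:
        return False  # not set directly on this run (style inheritance is ignored here)
    val = el.get(f"{{{W_NS}}}val")
    # A bare <w:b/> means "on"; an explicit val of 0/false/off means "off".
    return val is None or val.lower() not in ("0", "false", "off")


def bold_flags(run_xml):
    """Report the direct w:b and w:bCs settings of a single run."""
    run = ET.fromstring(run_xml)
    rpr = run.find(f"{{{W_NS}}}rPr")
    if rpr is None:
        return {"b": False, "bCs": False}
    return {"b": toggle_is_on(rpr, "b"), "bCs": toggle_is_on(rpr, "bCs")}


if __name__ == "__main__":
    print(bold_flags(RUN_XML))
```

For the fragment above this reports b as set and bCs as not set, which matches what comment 9 describes.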
Comment 10 Justin L 2020-12-14 09:00:27 UTC
My guess (and that's all it is from me since I really know nothing about the topic of Complex Text Layout/Complex Script) is that this really is a Microsoft bug.
Comment 11 Johannes Wülk 2021-01-20 02:51:55 UTC
I hear you. But then why do the other above-mentioned Office Suites display the formatting correctly?
Comment 12 Justin L 2021-01-20 04:59:56 UTC
(In reply to Johannes Wülk from comment #11)
> I hear you. But then why do the other above-mentioned Office Suites display
> the formatting correctly?

That's easy. If they have no idea what complex script is, then they will not realize that the bold that is specified is not supposed to be applied to these characters.
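
A rough sketch of that reasoning, assuming a consumer uses w:bCs for runs it classifies as complex script and w:b otherwise. The function name effective_bold and the treated_as_complex flag are hypothetical, and how any particular office suite actually classifies Ethiopic text is not established in this report.

```python
def effective_bold(b, b_cs, treated_as_complex):
    """Pick which bold flag applies to a run.

    b                  -- w:b is set (bold for non-complex script characters)
    b_cs               -- w:bCs is set (bold for complex script characters)
    treated_as_complex -- ASSUMPTION: whether this consumer classifies the
                          run's characters as complex script
    """
    return b_cs if treated_as_complex else b


# The run from comment 9: w:b is present, w:bCs is absent.
b, b_cs = True, False

# A consumer that treats the text as complex script ignores w:b.
print(effective_bold(b, b_cs, treated_as_complex=True))

# A consumer that does not (or has no notion of complex script) applies w:b.
print(effective_bold(b, b_cs, treated_as_complex=False))
```

Under these assumptions the same file renders as non-bold in the first case and bold in the second, which is the difference between suites discussed in comments 9 to 12.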