Bug 96457 - Unicode ExtB+ chars are badly handled in Writer
Summary: Unicode ExtB+ chars are badly handled in Writer
Status: RESOLVED INVALID
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.3.0.4 release
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: CJK Fonts
Reported: 2015-12-13 15:22 UTC by Danny Lin
Modified: 2017-04-27 16:02 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Fig 1. Badly displayed at the first paste (86.85 KB, image/jpeg)
2015-12-13 15:22 UTC, Danny Lin
Fig 2. Badly displayed when exported to PDF (54.57 KB, image/jpeg)
2015-12-13 15:31 UTC, Danny Lin
Fig 3. Badly displayed when saved as .docx (79.38 KB, image/jpeg)
2015-12-13 15:37 UTC, Danny Lin
Fig 4. Badly displayed when saved as .doc (77.41 KB, image/jpeg)
2015-12-13 15:38 UTC, Danny Lin
The demo .odt (7.58 KB, application/vnd.oasis.opendocument.text)
2015-12-13 16:31 UTC, Danny Lin
extB characters in different fonts (47.03 KB, image/png)
2015-12-22 16:23 UTC, Hiunn-hué
extB characters saved as docx (43.96 KB, image/png)
2015-12-22 16:27 UTC, Hiunn-hué
Test file open with LODev 5.3 (91.29 KB, image/png)
2016-11-08 16:02 UTC, Volga
Fig 4. Badly displayed when making them vertically (108.74 KB, image/png)
2016-12-24 19:02 UTC, Volga
2nd demo .odt (8.99 KB, application/vnd.oasis.opendocument.text)
2016-12-24 19:03 UTC, Volga
3rd demo .odt (12.42 KB, application/vnd.oasis.opendocument.text)
2017-04-27 11:19 UTC, Volga
Fig 1. Well displayed with HanaMinA/HanaMinB (163.33 KB, image/png)
2017-04-27 11:22 UTC, Volga
Fig 2. Badly displayed with MingLiU (166.38 KB, image/png)
2017-04-27 11:24 UTC, Volga
Fig 3. Badly displayed with SimSun (164.91 KB, image/png)
2017-04-27 11:25 UTC, Volga
Fig 4. Test for printing (243.67 KB, application/x-zip-compressed)
2017-04-27 11:43 UTC, Volga

Description Danny Lin 2015-12-13 15:22:14 UTC
Created attachment 121258 [details]
Fig 1. Badly displayed at the first paste

Unicode ExtB+ chars are not handled well, especially when exported to PDF.
Comment 1 Danny Lin 2015-12-13 15:25:13 UTC
Comment on attachment 121258 [details]
Fig 1. Badly displayed at the first paste

Unicode ExtB+ chars are not displayed correctly after being pasted into Writer (the original text of this figure is "大黃䗪蟲丸狐𧌒䒜𧀬").

They are displayed correctly after saving the file and then reloading it.
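
For readers unfamiliar with the term, "ExtB+" refers to characters outside the Basic Multilingual Plane (code points above U+FFFF), which need two UTF-16 code units each. The following is only a minimal sketch to show which characters of the sample string fall into that category; the helper name `supplementary_chars` is made up for illustration and has nothing to do with LibreOffice's own code.

```python
# The sample string quoted in the comment above.
SAMPLE = "大黃䗪蟲丸狐𧌒䒜𧀬"


def supplementary_chars(text):
    """Return (char, code point label, UTF-16 code unit count) for every
    character beyond U+FFFF, i.e. outside the Basic Multilingual Plane."""
    result = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # Each such character is encoded as a surrogate pair in UTF-16.
            units = len(ch.encode("utf-16-le")) // 2
            result.append((ch, f"U+{cp:04X}", units))
    return result


if __name__ == "__main__":
    for ch, label, units in supplementary_chars(SAMPLE):
        print(ch, label, f"{units} UTF-16 code units")
```

Characters reported by this helper are the ones a program can mishandle if it treats each UTF-16 code unit as a separate character; whether that is what happens in Writer is not established in this report.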
Comment 2 Danny Lin 2015-12-13 15:28:56 UTC Comment hidden (obsolete)
Comment 3 Danny Lin 2015-12-13 15:31:44 UTC
Created attachment 121259 [details]
Fig 2. Badly displayed when exported to PDF

Unicode ExtB+ chars are badly displayed when they are exported to a PDF file.
Comment 4 Danny Lin 2015-12-13 15:37:08 UTC
Created attachment 121261 [details]
Fig 3. Badly displayed when saved as .docx

Unicode ExtB+ chars are badly displayed when they are exported to a docx file.
Comment 5 Danny Lin 2015-12-13 15:38:05 UTC
Created attachment 121262 [details]
Fig 4. Badly displayed when saved as .doc

Unicode ExtB+ chars are badly displayed when they are exported to a .doc file.
Comment 6 Danny Lin 2015-12-13 16:31:53 UTC
Created attachment 121267 [details]
The demo .odt
Comment 7 Buovjaga 2015-12-16 11:13:29 UTC
I can reproduce the paste error, but for me they still don't display correctly after save + reload.

Win 7 Pro 64-bit, Version: 5.0.3.2 (x64)
Build ID: e5f16313668ac592c1bfb310f4390624e3dbfb75
Locale: fi-FI (fi_FI)

Version: 5.2.0.0.alpha0+
Build ID: 014633f83e44ae8ba33087b6f38e8e253e281969
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2015-12-15_06:21:44
Locale: fi-FI (fi_FI)

4.3.0.1
Comment 8 Hiunn-hué 2015-12-22 16:23:06 UTC
Created attachment 121498 [details]
extB characters in different fonts

I tried different fonts, saved as an ODF file, and reloaded. The result was the same.
Comment 9 Hiunn-hué 2015-12-22 16:27:33 UTC
Created attachment 121499 [details]
extB characters saved as docx
Comment 10 Volga 2016-11-08 16:02:36 UTC
Created attachment 128578 [details]
Test file open with LODev 5.3

Confirmed with LODev 5.3, even if I use SimSun instead.

Version: 5.3.0.0.alpha1+
Build ID: 05d2a66955f8a6552a79696474386ca9f45f9ef2
CPU Threads: 4; OS Version: Windows 6.2; UI Render: default; Layout Engine: new; 
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2016-11-07_23:34:48
Locale: zh-CN (zh_CN); Calc: group
Comment 11 Volga 2016-12-14 03:42:08 UTC
Confirmed with LODev 5.3.0 beta2, even if I try to use Hanazono Mincho.

Version: 5.3.0.0.beta2
Build ID: a7e30712ad6d8bc9286007b37aa581983e0caba3
CPU Threads: 4; OS Version: Windows 6.2; UI Render: default; Layout Engine: new; 
Locale: zh-CN (zh_CN); Calc: group
Comment 12 Volga 2016-12-24 19:02:12 UTC
Created attachment 129925 [details]
Fig 4. Badly displayed when making them vertically

This problem also appears in vertical layout, and several characters displayed with fallback fonts look sideways instead of upright.
Comment 13 Volga 2016-12-24 19:03:12 UTC
Created attachment 129926 [details]
2nd demo .odt
Comment 14 Volga 2017-04-01 17:49:56 UTC
Tested again on LO 5.3.2.1: with attachment 121267 [details], ExtB chars are still badly placed; with attachment 129926 [details], they do not look upright.

Version: 5.3.2.1 (x64)
Build ID: 7f6693c08cc110b9721245fc4bd4f1712e0c086c
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; Layout Engine: new; 
Locale: zh-CN (zh_CN); Calc: CL
Comment 15 Mark Hung 2017-04-02 00:46:06 UTC
Although all the cases are related to Ext B and Writer, each of them has a different symptom (FILESAVE to different formats, FILEOPEN, exporting to PDF, vertical formatting). They even have different test files. So I suggest creating separate issues, and maybe making this a meta issue, to make triaging and fixing easier.
Comment 16 Buovjaga 2017-04-17 11:44:07 UTC
Volga: can you create the separate reports per Mark's suggestion?
Comment 17 Volga 2017-04-27 11:19:58 UTC
Created attachment 132891 [details]
3rd demo .odt

I made this file for further testing. In this case I found that characters in the CJK Ext B and Ext D blocks are badly rendered when displayed with MingLiU and SimSun, but they look fine with the Hanazono fonts. This looks like a font issue, but both MingLiU and SimSun are system fonts on Windows, and these characters are handled well in Notepad and WordPad, so this can be considered a LibreOffice bug.

Version: 5.3.3.1 (x64)
Build ID: 46360c72c4823cefeaa85af537fba22bd568da7e
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; Layout Engine: new; 
Locale: zh-CN (zh_CN); Calc: group

Hanazono fonts are available here: http://fonts.jp/hanazono/
For non-Chinese versions of MS Windows, you can get MingLiU and SimSun by following these instructions:
https://answers.microsoft.com/en-us/windows/forum/windows_10-start/some-fonts-are-missing-after-upgrade/95839dfa-0df2-4bc0-875a-fd6b57e61fe4?page=1&auth=1
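
When preparing test files like the one above, it can help to check which CJK block each character belongs to. The sketch below assumes the block ranges published in the Unicode standard for the blocks named in this comment; the table `CJK_BLOCKS` and the function `cjk_block` are hypothetical names chosen for illustration, and the list is deliberately not exhaustive.

```python
# Block ranges taken from the Unicode standard; only a few CJK blocks are listed.
CJK_BLOCKS = [
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF),
    ("CJK Ext A", 0x3400, 0x4DBF),
    ("CJK Ext B", 0x20000, 0x2A6DF),
    ("CJK Ext C", 0x2A700, 0x2B73F),
    ("CJK Ext D", 0x2B740, 0x2B81F),
]


def cjk_block(ch):
    """Return the name of the listed CJK block containing ch, or None."""
    cp = ord(ch)
    for name, first, last in CJK_BLOCKS:
        if first <= cp <= last:
            return name
    return None


if __name__ == "__main__":
    # Print the block of every character in the sample string from comment 1.
    for ch in "大黃䗪蟲丸狐𧌒䒜𧀬":
        print(ch, f"U+{ord(ch):04X}", cjk_block(ch))
```

Grouping the characters of a test document this way makes it easier to say which blocks render badly with a given font.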
Comment 18 Volga 2017-04-27 11:22:55 UTC
Created attachment 132892 [details]
Fig 1. Well displayed with HanaMinA/HanaMinB

This is a snapshot taken with attachment #132891 [details]
Comment 19 Volga 2017-04-27 11:24:21 UTC
Created attachment 132893 [details]
Fig 2. Badly displayed with MingLiU

This is a snapshot taken with attachment #132891 [details]
Comment 20 Volga 2017-04-27 11:25:48 UTC
Created attachment 132894 [details]
Fig 3. Badly displayed with SimSun

This is a snapshot taken with attachment #132891 [details]
Comment 21 Volga 2017-04-27 11:43:53 UTC
Created attachment 132896 [details]
Fig 4. Test for printing

This test file was made with PDFCreator, a virtual printer on Windows.
Comment 22 Volga 2017-04-27 11:46:37 UTC
(In reply to Buovjaga from comment #16)
> Volga: can you create the separate reports per Mark's suggestion?

Not sure. This bug seems to be caused by bad font loading that affects specific fonts, which in turn affects FILESAVE and FILEOPEN.
Comment 23 Mark Hung 2017-04-27 12:05:03 UTC
Once information starts to accumulate, we no longer know whether we are talking about the original issue or a new one. Then it will not be possible to close the bug. Even worse, developers might neglect the bug outright because it is hard to read.

Based on Danny Lin's comments, it can be split into at least:

1. Writer PDF Exporting issue for ExtB+ characters.
2. Writer Paste issue of ExtB+ characters.
3. FILESAVE: Export issue for docx format.
4. FILESAVE: Export issue for doc format.

The steps and the source file needed to reproduce each one have to be clarified. I am closing this issue as Invalid. Each issue can be reported independently if someone cares.
Comment 24 V Stuart Foote 2017-04-27 12:19:59 UTC
Volga, how would this be an a11y issue in any sense?

Bit of a stretch to lump Font rendering to screen as affecting Assistive Technology tools. If supported, screen readers will sound the glyphs as recorded by their Unicode points.
Comment 25 Volga 2017-04-27 13:49:31 UTC
(In reply to V Stuart Foote from comment #24)
> Volga, how would this be an a11y issue in any sense?
> 
> Bit of a stretch to lump Font rendering to screen as affecting Assistive
> Technology tools. If supported, screen readers will sound the glyphs as
> recorded by their Unicode points.

I made a mistake. I think this should be a font loading issue.
Comment 26 Danny Lin 2017-04-27 16:02:53 UTC
Reported 2 sub-issues in separate threads.

https://bugs.documentfoundation.org/show_bug.cgi?id=107487

https://bugs.documentfoundation.org/show_bug.cgi?id=107488