Bug 119211 - Lines are too long and characters overlap in this Chinese text file
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
(earliest affected) release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Keywords: bibisectRequest, regression
Depends on:
Blocks: CJK
Reported: 2018-08-11 02:55 UTC by Aron Budea
Modified: 2018-08-11 16:52 UTC


Screenshot (90.05 KB, image/png)
2018-08-11 02:55 UTC, Aron Budea

Description Aron Budea 2018-08-11 02:55:31 UTC
Created attachment 144100

Open attachment 143967 from bug 119096 by selecting the format 'Text - Choose Encoding' and choosing the character set 'Chinese simplified (GB18030)'.
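
As a quick check outside LibreOffice, the GB18030 decoding step can be sketched in Python. This is only an illustration under assumptions: the sample bytes stand in for the attachment (which is not reproduced here), and `decode_gb18030` is a hypothetical helper name. It shows that decoding and line splitting can be verified independently of fonts and layout.

```python
def decode_gb18030(raw: bytes) -> list[str]:
    """Decode raw bytes as GB18030 and return the text split into lines."""
    # errors="strict" raises UnicodeDecodeError if the bytes are not valid GB18030.
    text = raw.decode("gb18030", errors="strict")
    # splitlines() treats "\r\n" as a single line break.
    return text.splitlines()


# Stand-in sample, NOT the real attachment: two text lines separated by an empty one.
sample = "中文文本\r\n\r\nsecond line 第二行\r\n".encode("gb18030")
lines = decode_gb18030(sample)

for number, line in enumerate(lines, start=1):
    # Print the line number, its length in characters, and its content.
    print(number, len(line), repr(line))
```

If the decoded lines have the expected lengths here, overly long lines on screen would point at layout or font handling rather than at the character set conversion.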

=> Some lines are too long, and characters overlap (see screenshot), both with OpenGL enabled and disabled.

Observed using LO & / Windows 7.
Looks fine in
=> regression
Comment 1 Aron Budea 2018-08-11 10:59:36 UTC
Interestingly in the bibisect repo this was related to a Noto font update in 6.1:

However, in the dialog I didn't change the language, only the character set. Setting the language to Chinese (simplified) as well produces a correct-looking document. Should that be needed, though? (I admit that not setting both doesn't make a lot of sense.)
Comment 2 V Stuart Foote 2018-08-11 16:52:26 UTC

Your screen clip shows issues with line formatting, but more so with font fallback. Fallback seems to be the real issue here, which makes sense as Liberation Mono has limited coverage of CJK. Font fallback also got a major rework from 5.2 to 5.3, with more use of DirectWrite together with HarfBuzz.

I believe the "Text" import defaults to the "Preformatted Text" style from the default template, which is assigned Liberation Mono.

When I open this and choose the GB-18030 encoding together with a CJK-friendly font other than Liberation Mono (which has limited coverage of the glyphs needed for correct layout), e.g. NSimSun, there are no issues with fallback or line formatting.

On import, I believe empty paragraphs/CRLF (those with no GB-18030 glyphs) are picked up in the UI locale as set. So in an en-US locale, I get a mix of Asian - Chinese (simplified) paragraphs and Western - Default English (USA) paragraphs.

IMHO this seems correct, but a CJK user would need to confirm that the system locale is picked up as the default for empty paragraphs.
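
To illustrate the mix of paragraph languages described above, here is a rough Python sketch. It is not LibreOffice's import logic: the function names, the `zh-CN` tag, and the rule "a paragraph with any CJK ideograph is Chinese, everything else falls back to the UI locale" are all assumptions made for the example.

```python
def has_cjk(line: str) -> bool:
    """Return True if the line contains at least one CJK ideograph."""
    for ch in line:
        code = ord(ch)
        # CJK Unified Ideographs and Extension A (a simplification for this sketch).
        if 0x4E00 <= code <= 0x9FFF or 0x3400 <= code <= 0x4DBF:
            return True
    return False


def classify_paragraphs(lines: list[str], ui_locale: str = "en-US") -> list[str]:
    """Assumed rule: CJK paragraphs get zh-CN, all others get the UI locale."""
    return ["zh-CN" if has_cjk(line) else ui_locale for line in lines]


# Hypothetical paragraphs: Chinese text, an empty paragraph, and plain ASCII.
paragraphs = ["中文文本", "", "plain ASCII"]
print(classify_paragraphs(paragraphs))
```

With an `en-US` UI locale, the empty and ASCII-only paragraphs end up tagged as Western, which matches the mixed result reported above.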

Perhaps retest and adjust the font for the import?

Version: (x64)
Build ID: 0a1a4ffb4f87adff7fbbbc60202b6a0e42fedd0c
CPU threads: 4; OS: Windows 10.0; UI render: default; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-08-08_23:17:46
Locale: en-US (en_US); Calc: group threaded