Bug 104212 - Misplaced Unicode combining characters with font fallback
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: graphics stack
Version (earliest affected): 5.3.0.0.alpha1+
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on: DirectWrite
Blocks: Font-Rendering
Reported: 2016-11-28 06:21 UTC by Aron Budea
Modified: 2018-11-13 14:14 UTC (History)
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
combining glyph behavior is font dependent (73.24 KB, image/png)
2016-11-28 14:41 UTC, V Stuart Foote
LODev 5.4.0 20161128 w/OpenGL rendering (149.45 KB, image/png)
2016-11-28 20:08 UTC, V Stuart Foote
LODev 5.4.0 20161128 w/Default GDI rendering (133.09 KB, image/png)
2016-11-28 20:12 UTC, V Stuart Foote
test document with combining examples in Liberation Serif, Symbola, Code2000 (10.22 KB, application/vnd.oasis.opendocument.text)
2016-11-28 20:53 UTC, V Stuart Foote
clip from Windows 10 showing bad handling of combining glyphs on fallback to Segoe UI Symbol (25.14 KB, image/png)
2017-11-12 17:48 UTC, V Stuart Foote
Current status of bug in Liberation Serif, Symbola, and Junicode (30.29 KB, image/png)
2018-11-13 04:33 UTC, Michael von Preußen

Description Aron Budea 2016-11-28 06:21:01 UTC
Below is a Windows-only issue with the new layout engine, originally mentioned in [1] (using a 5.3 daily build).

Open attachment 128978 (from bug 101599) in Writer.
Note that it renders like the bottom part of the screenshot in attachment 128979.
(The top part of that screenshot shows a different issue, reported in bug 101599, and is not relevant here.)

As Khaled identified in [2]:
"The second line is:
Everything is fine, until I combin \u20dd\u0364.

U+20DD is a combining circle, and U+0364 is a combining superscript e. So seeing a circle there is expected, but it looks misplaced in your screenshot."

[1] https://bugs.documentfoundation.org/show_bug.cgi?id=101599#c7
[2] https://bugs.documentfoundation.org/show_bug.cgi?id=101599#c13
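
For reference, the properties of the three code points in the quoted sequence can be checked with Python's standard unicodedata module. This is only an illustrative sketch; the helper name describe is made up for this example and has nothing to do with LibreOffice or HarfBuzz code.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return basic Unicode properties for a single character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        # General category: "Ll" lowercase letter, "Mn" nonspacing mark,
        # "Me" enclosing mark.
        "category": unicodedata.category(ch),
        # Canonical combining class (0 for base letters and enclosing marks).
        "combining_class": unicodedata.combining(ch),
    }


# The tail of the test sentence: 'n' + combining circle + combining small e.
SAMPLE = "n\u20dd\u0364"

if __name__ == "__main__":
    for ch in SAMPLE:
        print(describe(ch))
```

Both U+20DD and U+0364 are marks that attach to the preceding 'n', so the three code points should be laid out as one cluster.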
Comment 1 Aron Budea 2016-11-28 06:24:17 UTC
Confirming based on comments in bug 101599.
I'm not sure whether this should be categorized as a HarfBuzz regression. Khaled, please adjust as you see fit.
Comment 2 V Stuart Foote 2016-11-28 13:29:09 UTC
Please note that Liberation Serif does not have a U+20DD COMBINING ENCLOSING CIRCLE glyph, so font fallback must occur, and that seems to be where the problem lies.

If you assign a font that provides coverage of both the combining glyphs you get very different results!

Retest with Code2000 or Symbola, for example.

@Khaled, interestingly, the monospace Code2000 does about the best job of positioning each glyph when composing the combined character. So in addition to the fall-back result losing its placement, the actual composition into the combining character may not be using font metrics: combined glyphs don't assemble as expected.
Comment 3 V Stuart Foote 2016-11-28 14:41:48 UTC
Created attachment 129077
combining glyph behavior is font dependent

(In reply to V Stuart Foote from comment #2)
Attached is a screen clip showing the issue.
It has a couple of examples showing the metrics for combining glyphs in Code2000 and in Symbola (George Douros' Ancient Scripts).

Place the cursor at the end of each line and use Alt+X to convert each code:

U+035cU+20e0U+20e1U+036f

xU+035cU+20e0U+20e1U+0364

nU+20ddU+0364
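
The U+XXXX notation above can also be turned into the actual strings for pasting into a test document. A minimal sketch, assuming every code is written as U+ followed by 4 to 6 hex digits and is not directly followed by another hex-looking character; decode_u_notation is a hypothetical helper name.

```python
import re

# "U+" followed by 4 to 6 hex digits, e.g. U+20dd.
_TOKEN = re.compile(r"U\+([0-9A-Fa-f]{4,6})")


def decode_u_notation(text: str) -> str:
    """Replace each U+XXXX token with its character; keep other text as-is."""
    return _TOKEN.sub(lambda m: chr(int(m.group(1), 16)), text)


# The three test lines from this comment.
LINES = [
    "U+035cU+20e0U+20e1U+036f",
    "xU+035cU+20e0U+20e1U+0364",
    "nU+20ddU+0364",
]

if __name__ == "__main__":
    for line in LINES:
        decoded = decode_u_notation(line)
        # Show the resulting code points rather than relying on the terminal font.
        print(line, "->", [f"U+{ord(c):04X}" for c in decoded])
```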
Comment 4 V Stuart Foote 2016-11-28 15:02:45 UTC
Also, the clip in attachment 129077 was taken with OpenGL enabled and HarfBuzz. With the default GDI-based rendering, Code2000 (design credit to James Kass) and Symbola render the same.

With Arial and Segoe, the positioning of the "double struck" letters and other elements shifts a bit between the OpenGL and the default GDI rendering. The rendering engine is doing different things when fall-back is involved.
Comment 5 Khaled Hosny 2016-11-28 17:31:47 UTC
Seeing a small shift in the position of combining marks when they come from different fonts is rather expected, since correct positioning requires the glyphs to come from the same font. Even when they come from the same font, the positioning can be bad, since not all fonts handle every combining mark correctly.

What is shown in attachment 128979 is not expected, though: the mark is shifted all the way to the beginning of the line, which should not happen. Can someone else confirm this, and maybe add a screenshot here at a higher zoom?
Comment 6 V Stuart Foote 2016-11-28 20:08:07 UTC
Created attachment 129093
LODev 5.4.0 20161128 w/OpenGL rendering
Comment 7 V Stuart Foote 2016-11-28 20:12:53 UTC
Created attachment 129095
LODev 5.4.0 20161128 w/Default GDI rendering

@Khaled, greater zoom as requested, on current 5.4.0 20161128 TB42 master: here with Default GDI rendering, and with OpenGL rendering in attachment 129093.

Again, the misplacement to the beginning of the string appears to occur when font fall-back must be processed, as with Liberation Serif. There are still some differences between OpenGL and GDI rendering, though.
Comment 8 Khaled Hosny 2016-11-28 20:31:13 UTC
Can you also attach the document?
Comment 9 V Stuart Foote 2016-11-28 20:53:42 UTC
Created attachment 129100
test document with combining examples in Liberation Serif, Symbola, Code2000

Attached...

The original "Everything is fine until I combinU+20ddU+0364"

And each of these Alt+X-converted combining sequences, per font:

U+035cU+20e0U+20e1U+036f

xU+035cU+20e0U+20e1U+0364

nU+20ddU+0364
Comment 10 Khaled Hosny 2017-11-11 15:32:58 UTC
Not a HarfBuzz or layout issue, as it happens only on Mac; looks like a rendering or font fallback issue.
Comment 11 Khaled Hosny 2017-11-12 11:41:47 UTC
(In reply to Khaled Hosny from comment #10)
> Not a HarfBuzz or layout issue as it happens only Mac, looks like a
> rendering or font fallback issue.

s/Mac/Windows/
Comment 12 V Stuart Foote 2017-11-12 17:46:22 UTC
Attaching a clip from current 2017-11-11 master, Windows 10 system with Symbola and Code2000 fonts installed.

In attachment 129100 on Windows 8/8.1/10 builds, font fall-back for Liberation Serif picks up Segoe UI Symbol for the 'n' (U+006e) inside a combining circle (U+20dd), with combining superscript e (U+0364), but then loses its placement and the actual glyph to stamp.

I don't know whether this is a DirectWrite implementation issue, but is the result of font fallback, which is needed to combine the glyphs, getting lost between D2DWriteTextOutRenderer and ExTextOutRenderer?
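
To illustrate why the base 'n' ends up in the fallback font together with its marks, here is a toy model of cluster-level font fallback. It is a sketch under stated assumptions, not LibreOffice's or DirectWrite's actual algorithm: the font names "primary" and "fallback" and their coverage rules are hypothetical, chosen only so that the primary font lacks U+20DD, as reported for Liberation Serif above.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    out = []
    for ch in text:
        # Categories Mn, Mc and Me all start with "M" (marks).
        if out and unicodedata.category(ch).startswith("M"):
            out[-1] += ch
        else:
            out.append(ch)
    return out


def assign_fonts(text, fonts):
    """Split text into (font, run) pairs.

    fonts is an ordered list of (name, covers) pairs, where covers(ch)
    says whether that font has a glyph for ch. A cluster goes to the first
    font covering ALL of its characters, so one missing mark moves the
    whole cluster, base letter included, to a fallback font.
    """
    runs = []
    for cl in clusters(text):
        name = next((n for n, covers in fonts if all(covers(c) for c in cl)), None)
        if runs and runs[-1][0] == name:
            runs[-1] = (name, runs[-1][1] + cl)
        else:
            runs.append((name, cl))
    return runs


# Hypothetical coverage: "primary" has everything except U+20DD.
FONTS = [
    ("primary", lambda ch: ch != "\u20dd"),
    ("fallback", lambda ch: True),
]

if __name__ == "__main__":
    for font, run in assign_fonts("combin\u20dd\u0364.", FONTS):
        print(font, [f"U+{ord(c):04X}" for c in run])
```

In this model the line is cut into three runs, and the middle run must be placed at the pen position where the first run ended. A renderer that drops that starting offset for the fallback run would draw it at the beginning of the line, which matches the symptom described in this report.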
Comment 13 V Stuart Foote 2017-11-12 17:48:24 UTC
Created attachment 137699
clip from Windows 10 showing bad handling of combining glyphs on fallback to Segoe UI Symbol

I don't think the issue is the Segoe UI Symbol font, but rather our mishandling of the combining glyphs on fallback.
Comment 14 QA Administrators 2018-11-13 03:41:37 UTC Comment hidden (obsolete)
Comment 15 Michael von Preußen 2018-11-13 04:33:43 UTC
Created attachment 146582
Current status of bug in Liberation Serif, Symbola, and Junicode

Per comment #14, I have confirmed that this bug is still present in the latest version of LibreOffice:

> Version: 6.0.6.2 (x64)
> Build ID: 0c292870b25a325b5ed35f6b45599d2ea4458e77
> CPU threads: 4; OS: Windows 6.3; UI render: default; 
> Locale: en-US (en_US); Calc: group

As the alignment of the characters differed somewhat for me from their appearance in attachment #137699, I have attached a screenshot showing the current status of the bug in a document edited from attachment #129100, using Junicode in place of Code2000, as I don't have the latter installed.
Comment 16 V Stuart Foote 2018-11-13 14:14:57 UTC
Yes, confirming: when combining-character fall-back occurs in Liberation Serif paragraphs, the Segoe UI glyph is misplaced to the beginning of the line.

With Symbola and Code2000, both of which contain the combining glyphs, the marks are correctly placed.

Windows 10 Home 64-bit (1803) en-US with Intel HD Graphics 620 and
Version: 6.2.0.0.alpha1+ (x64)
Build ID: afbfe42e63cdba1a18c292d7eb4875009b0f19c0
CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-11-10_23:59:43
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded