Bug 109268 - Combining Cyrillic characters overlap when placed above each other e.g. лⷣ҇
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
(earliest affected) release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Depends on:
Blocks: Font-Rendering HarfBuzz-regressions
Reported: 2017-07-22 10:22 UTC by Francis Butler
Modified: 2017-10-30 07:28 UTC (History)

See Also:
Crash report or crash signature:

Sample of Cyrillic combining character overlap (8.32 KB, application/vnd.oasis.opendocument.text)
2017-07-22 10:27 UTC, Francis Butler
Screenshot with Ponomar Unicode (20.71 KB, image/png)
2017-08-08 08:49 UTC, Volga
Screenshot with BukyVede and BabelPad (14.70 KB, image/png)
2017-08-11 10:50 UTC, Volga
How the text looks with GPOS table removed from the font (15.89 KB, image/png)
2017-10-30 07:28 UTC, Khaled Hosny

Description Francis Butler 2017-07-22 10:22:15 UTC
In earlier versions (through 5.2?), when I typed multiple combining Cyrillic characters over regular Cyrillic characters in an Old East Slavic text that I am editing, the characters would be visible as a stack (the regular character below, the first combining character above it, and the second combining character above the first). Now when I type the same way the combining characters overlap with each other. (I initially switched to LibreOffice because in Microsoft Word combining characters overlapped each other as they now do in LibreOffice).

Steps to Reproduce:
1. Enter any standard Cyrillic character (I use the font Bukyvede, which has many rare characters.)
2. Enter one combining Cyrillic character above the first one. (I typically use a small combining letter, e.g. ⷮ . I use PopChar to enter letters with no keyboard equivalent.)
3. Enter a second combining Cyrillic character (typically ҇ ) above the first one.

Actual Results:  
Combining characters overlap each other above standard character

Expected Results:
First combining character appears above standard character; second combining character appears above first combining character. (I do not mind if the higher character interferes with the line of print above it; the solution is to make a greater space between lines.)

Reproducible: Always

User Profile Reset: No

Additional Info:
While this problem may affect only a small number of users, it is a major problem for anyone attempting to reproduce an Early Slavic text.

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 (KHTML, like Gecko) Version/10.1.1 Safari/603.2.4
Comment 1 Francis Butler 2017-07-22 10:27:09 UTC
Created attachment 134779 [details]
Sample of Cyrillic combining character overlap
Comment 2 V Stuart Foote 2017-07-22 15:59:42 UTC
I am not sure how the script is supposed to handle them, but with Segoe UI, although the marks do combine, multiple combining glyphs appear misaligned.

Combining characters in these Unicode blocks:

рⷱ҇е прⷪ҇ркъ лⷮ҇ from attachment 134779 [details]

U+0440U+2df1U+0487U+0435  -- рⷱ҇е
U+043fU+0440U+2deaU+0487U+0440U+043aU+044a --  прⷪ҇ркъ
U+043bU+2deeU+0487 -- лⷮ҇

U+a69bU+a67cU+a674  -- ꚛ꙼ꙴ
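
The sequences above can be checked programmatically. The following is a minimal Python sketch that uses only the standard `unicodedata` module; the helper names `code_points` and `describe` are hypothetical and exist only for this illustration. It lists each code point in a string and reports whether it is a combining mark, taken here to mean a non-zero canonical combining class.

```python
import unicodedata


def code_points(text):
    """Return the code points of text formatted as U+XXXX strings."""
    return ["U+%04X" % ord(ch) for ch in text]


def describe(text):
    """Return (code point, is_combining_mark) pairs for each character."""
    return [("U+%04X" % ord(ch), unicodedata.combining(ch) != 0) for ch in text]


# The third sample from the attachment, written with escapes so the
# combining marks stay visible in the source: base letter + two marks.
sample = "\u043b\u2dee\u0487"

for cp, is_mark in describe(sample):
    print(cp, "combining mark" if is_mark else "base character")
```

Writing the sample with `\u` escapes avoids depending on how the editor or terminal renders the stacked marks.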
Comment 3 V Stuart Foote 2017-07-22 16:35:44 UTC
In BabelPad, a Unicode 9 compliant editor, multiple Cyrillic combining glyphs stack as the OP requires.

This Unicode.org technical note L2/15-002 seems germane, and has a few more examples:
Comment 4 Volga 2017-08-08 08:49:46 UTC
Created attachment 135263 [details]
Screenshot with Ponomar Unicode

With PonomarUnicode.otf (NOT .ttf; available at http://sci.ponomar.net/fonts.html) I found that all of them except лⷮ҇ have such a problem, and that the shape of ⷮ (U+2DEE) is not designed to combine with ҇ (U+0487). With the BukyVede font I found that all combining Cyrillic letters overlap with U+0487, even though U+2DEE looks just like a small т, so I conclude that this is a font issue, and I suggest contacting the designers of the BukyVede font about it.
Comment 5 Francis Butler 2017-08-08 10:15:44 UTC
Volga suggests that the problem is with the font Bukyvede. However, if I understand Volga correctly, ҇ does not combine with ⷮ above a third letter in the font Ponomar or, presumably, in any other Slavic font that contains the combining characters. This would mean that the bug is shared by all such Slavic fonts. The problem with this response is that the characters combined perfectly well in LibreOffice 5, and even in old versions of Microsoft Word. It appears that some "improvement" in LibreOffice has, as a side effect, prevented the combination from working in any font. I would guess that the "improvement" was intended to prevent large stacks of letters from overlapping with the line above the stacks. Perhaps the "improvement" is even recommended in some technical UNIX document. This does not mean that the result is not a bug, since it ruins the functionality of a large group of fonts that have been functional (with updates) for decades.
Comment 6 Francis Butler 2017-08-08 12:08:19 UTC
I note that, as Volga remarks, the characters do combine properly in ponomar ttf but not ponomar otf . However, in light of my previous remarks, it still seems that the bug is in LibreOffice, and not in the many fonts that have suddenly become problematic.
Comment 7 Volga 2017-08-11 10:50:51 UTC
Created attachment 135457 [details]
Screenshot with BukyVede and BabelPad

I tested in both BabelPad and LibreOffice; the fonts behaved the same in both, so I am sure this font is a bit buggy.
Comment 8 Khaled Hosny 2017-10-30 07:24:02 UTC
This is a font bug. LibreOffice does not interfere with the mark positioning provided by the font; however, this font does not provide any mark positioning at all and relies solely on heuristics applied by the text layout system.

We use HarfBuzz, which has such heuristics, but it applies them only when the font does not have a GPOS table (glyph positioning table). This font has one, but it does kerning only and no mark positioning.

It is possible that the system text layout libraries that we used before applied these heuristics more liberally. But font designers should not be depending on such heuristics since they are neither standardized nor well documented.
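
The situation described above can be checked for a given font. The following is a rough Python sketch, assuming the fontTools library is installed; the function names `gpos_feature_tags` and `diagnose` and the three result strings are hypothetical, and the decision logic only mirrors the behaviour described in this comment (heuristics applied only when there is no GPOS table), not HarfBuzz's actual implementation. It treats the standard OpenType `mark` and `mkmk` features as the sign that a font positions its own marks.

```python
# OpenType features for mark-to-base and mark-to-mark positioning.
MARK_FEATURES = {"mark", "mkmk"}

NO_GPOS = "no GPOS table: layout engine heuristics may position marks"
HAS_MARKS = "GPOS with mark positioning: the font positions marks itself"
GPOS_ONLY = "GPOS without mark positioning: stacked marks may overlap"


def gpos_feature_tags(font_path):
    """Return the set of GPOS feature tags, or None if there is no GPOS table.

    fontTools is imported lazily so the rest of the sketch runs without it.
    """
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    if "GPOS" not in font:
        return None
    feature_list = font["GPOS"].table.FeatureList
    if feature_list is None:
        return set()
    return {record.FeatureTag for record in feature_list.FeatureRecord}


def diagnose(feature_tags):
    """Classify a font by its GPOS feature tags (None means no GPOS table)."""
    if feature_tags is None:
        return NO_GPOS
    if MARK_FEATURES & set(feature_tags):
        return HAS_MARKS
    return GPOS_ONLY
```

For example, `diagnose(gpos_feature_tags("BukyVede-Regular.ttf"))` would be expected to report the third case if the font's GPOS table contains only a `kern` feature, as described here.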
Comment 9 Khaled Hosny 2017-10-30 07:28:37 UTC
Created attachment 137364 [details]
How the text looks with GPOS table removed from the font

Here is how the text looks when I remove the GPOS table using the FontTools Python library:

$ ttx -x GPOS BukyVede-Regular.ttf
$ ttx BukyVede-Regular.ttx

and then use the resulting BukyVede-Regular#1.ttf instead of the old file.
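
The same removal can probably also be done from Python rather than through the `ttx` round trip. This is a minimal sketch, assuming fontTools is installed; the helper names `output_path` and `strip_gpos` and the `-noGPOS` file name suffix are hypothetical (the `ttx` commands above produce `BukyVede-Regular#1.ttf` instead).

```python
from pathlib import Path


def output_path(font_path, suffix="-noGPOS"):
    """Build the output file name, e.g. Font.ttf -> Font-noGPOS.ttf."""
    p = Path(font_path)
    return p.with_name(p.stem + suffix + p.suffix)


def strip_gpos(font_path):
    """Save a copy of the font without its GPOS table and return the new path.

    fontTools is imported lazily so output_path() works without it.
    """
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    if "GPOS" in font:
        del font["GPOS"]  # drop kerning and any other glyph positioning
    out = output_path(font_path)
    font.save(str(out))
    return out
```

Calling `strip_gpos("BukyVede-Regular.ttf")` would write the modified copy next to the original and leave the original file untouched. Note that removing the table also removes the font's kerning, so this is useful for testing rather than as a fix.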