Bug 68313 - Diacritics problem with Graphite fonts
Summary: Diacritics problem with Graphite fonts
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: graphics stack
Version (earliest affected): 4.0.4.2 release
Hardware: All (OS: All)
Importance: medium normal
Assignee: László Németh
URL:
Whiteboard: BSA target:4.2.0 target:4.1.2 target:...
Keywords: regression
Depends on:
Blocks: mab4.0
 
Reported: 2013-08-20 08:13 UTC by EricP
Modified: 2016-02-24 18:46 UTC

See Also:
Crash report or crash signature:


Attachments
One line example with missing word between brackets (16.76 KB, application/vnd.oasis.opendocument.text)
2013-08-20 08:13 UTC, EricP
PDF showing correct display on v3.6.7.2 (14.09 KB, application/pdf)
2013-08-20 08:16 UTC, EricP
PDF showing wrong display on v4.0.4.2 (13.67 KB, application/pdf)
2013-08-20 08:17 UTC, EricP
PDF showing correct display on v3.6.7.2 (14.09 KB, application/pdf)
2013-08-20 08:18 UTC, EricP
LibreOffice 4.1.1 disappearing chars due to multiple combining characters (11.00 KB, application/msword)
2013-09-03 16:03 UTC, Justin L
PDF of what I see (missing characters) in LO 4.1.1 (73.08 KB, application/pdf)
2013-09-03 16:05 UTC, Justin L

Description EricP 2013-08-20 08:13:58 UTC
Created attachment 84314 [details]
One line example with missing word between brackets

Problem description: 

The attached .ODT file contains one line, including a table. 

In v3.6.7.2 everything shows up OK. 
In v4.1.0.4 the word between the brackets is missing. 
In v4.0.4.2 part of the word is missing. 

But if I turn on the display of non-printing characters, then I see a jumbled mess between the brackets. 

Or if I press the space bar several times after the second bracket, the text appears.

The text displays correctly if I copy/paste the text from the table, as below. But I have seen the same problem when typing text outside of a table.

The problem is identical on screen and in the PDF output.

This is an intermittent problem. Ninety-nine percent of the time things show up correctly, but some words don't show up at all, and I'm not sure why yet. This occurs with either IPA or Vietnamese characters.


Operating System: Windows 7
Version: 4.0.4.2 release
Last worked in: 3.6.7.2 rc
Comment 1 EricP 2013-08-20 08:16:40 UTC
Created attachment 84315 [details]
PDF showing correct display on v3.6.7.2
Comment 2 EricP 2013-08-20 08:17:39 UTC
Created attachment 84316 [details]
PDF showing wrong display on v4.0.4.2
Comment 3 EricP 2013-08-20 08:18:27 UTC
Created attachment 84317 [details]
PDF showing correct display on v3.6.7.2
Comment 4 Justin L 2013-09-03 16:03:58 UTC
Created attachment 85123 [details]
LibreOffice 4.1.1 disappearing chars due to multiple combining characters
Comment 5 Justin L 2013-09-03 16:05:57 UTC
Created attachment 85124 [details]
PDF of what I see (missing characters) in LO 4.1.1

This affects many languages in Sudan/South Sudan.   Multiple combining characters are common in tonal languages.
Comment 6 Justin L 2013-09-03 16:08:46 UTC
I'm marking this as critical importance because it seriously hampers our ability to migrate to LibreOffice. We (in IT) are trying to push open source (Linux/LO), but this kind of regression will be disastrous for getting others to support it.
Comment 7 Rik Shaw 2013-09-03 18:22:10 UTC
I can confirm this bug.

In order to understand the problem from jluth, I copied the "ara" word (with the top and bottom diacritics) from his document, and pasted into a new document.

If the diacritics don't both show, then the SIL Andika Font can be downloaded here:

http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=Andika_download

(On Ubuntu, the fonts-sil-andika package provides the font.)

Then if you *bold* the "r" in the middle of the word, the trailing "a" (again, with diacritics) is hidden.  Turning on "non printing characters" shows this letter super-imposed on the carriage return character.  Likewise, if you instead bold the first "a", a similar problem will occur.  If, however, you bold the *entire* word "ara", then no characters disappear.

This is a very strange problem to me, but it indeed is a significant blocker for many languages that have multiple diacritics on characters.

It is confirmed that the same problem exists in Windows and Linux.  It is also confirmed that the problem does *not* exist in LO 3.6.7.

It is also confirmed that 4.1.1 has the same problem.
Comment 8 Rik Shaw 2013-09-03 18:34:17 UTC
Added to MAB 4.0 (Bug 54157), also marked for all platforms (not just Windows).

Update: I have just seen that it started *after* 4.0.0.3 (a colleague was able to confirm that Windows 4.0.0.3 does *not* have this problem, but 4.1 exhibits the problem on both Linux and Windows).
Comment 9 Urmas 2013-09-03 19:15:10 UTC
Only a few fonts, such as Linux Biolinum or Gentium, are affected.
Comment 10 Urmas 2013-09-03 19:26:23 UTC
The characters that are not displayed apparently lie within a certain number of characters of the end of the paragraph; that number depends on the number of non-spacing diacritics in the text.
Comment 11 Michael Stahl (CIB) 2013-09-03 21:20:00 UTC
on Linux the only difference i see is that in the 4th attachment
the "Gentium font" example is missing the last "a"
in libreoffice-4-0 branch (also 4-1 and master) vs. LO 3.6.7.2.

Windows master is the same, "Gentium font" missing last "a" otherwise good.

cannot see any problem with the 1st attachment on the tested versions.

the "Gentium" thing appears to work in LO 4.0.3.3.

... and this is the commit that lets the "a" disappear:

commit 7d1e6cb0564a1eb886fd8f95adbcc7d8b9aa028f
Author:     László Németh <nemeth@numbertext.org>
AuthorDate: Wed May 22 09:11:13 2013 +0200

    fdo#52540 fix hyphenation of Graphite ligatures
Comment 12 László Németh 2013-09-03 23:11:46 UTC
Assigned to me. It seems I will be able to fix both problems (the already fixed hyphenation problem and the new one) by checking for Unicode combining diacritic characters at line breaking. Thanks for your report!

A possible workaround is to use a single Unicode combining diacritical mark, e.g. Ó + U+323 instead of O + U+301 + U+323.
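
A minimal Python sketch of why this workaround is only a change of spelling, using the standard `unicodedata` module. It shows that the two sequences are canonically equivalent and differ only in how many combining marks follow the base letter. The helper name `count_combining` is made up for illustration; none of this is LibreOffice code.

```python
import unicodedata


def count_combining(text: str) -> int:
    """Count code points with a non-zero canonical combining class."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


# O + COMBINING ACUTE ACCENT (U+0301) + COMBINING DOT BELOW (U+0323)
decomposed = "O\u0301\u0323"
# Precomposed Ó (U+00D3) + COMBINING DOT BELOW (U+0323)
workaround = "\u00d3\u0323"

# Both spellings reduce to the same sequence under canonical decomposition.
same = unicodedata.normalize("NFD", decomposed) == unicodedata.normalize("NFD", workaround)

print(count_combining(decomposed), count_combining(workaround), same)
```

Because the two spellings are canonically equivalent, a tool that normalizes the text may turn one form back into the other, so the workaround presumably holds only while the document keeps the precomposed spelling.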
Comment 13 Commit Notification 2013-09-06 09:19:07 UTC
Laszlo Nemeth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8fae91c67d3abed8158ada9ce1b0f79f3c10e165

fdo#68313 fix combining diacritics problem with Graphite fonts



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 14 martin_hosken 2013-09-06 10:14:36 UTC
> Laszlo Nemeth committed a patch related to this issue.
> It has been pushed to "master":
> 
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=8fae91c67d3abed8158ada9ce1b0f79f3c10e165

Any time I see hard-coded Unicode code points in code at this level I start to shudder violently. Especially when it seems to be a poor man's cluster identification algorithm.

If I understood libo better I might be able to advise on a more generic (and therefore better) solution. My thinking is that a better way is to work out how much context is really required (no context after hyphens, otherwise at least a cluster's worth, hence the +64 already in the code) and then limit how much of the output segment we actually take, so as not to repeat stuff.
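
To illustrate the difference between hard-coded code points and property-based cluster identification, here is a rough Python sketch. It is an assumption-laden simplification: it treats a cluster as a base character plus any following characters whose Unicode general category is a mark (Mn, Mc, Me), which is much cruder than full Unicode grapheme cluster segmentation and is not what the LibreOffice patch does. The function names `split_clusters` and `safe_break` are hypothetical.

```python
import unicodedata


def is_mark(ch: str) -> bool:
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def split_clusters(text: str) -> list[str]:
    """Group each base character with the combining marks that follow it."""
    clusters: list[str] = []
    for ch in text:
        if is_mark(ch) and clusters:
            clusters[-1] += ch  # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


def safe_break(text: str, pos: int) -> int:
    """Move a proposed break position back so it does not split a cluster."""
    while 0 < pos < len(text) and is_mark(text[pos]):
        pos -= 1
    return pos


word = "a\u0301\u0323ra\u0301"  # a + acute + dot below, r, a + acute
print(split_clusters(word))
print(safe_break(word, 2))
```

The point of the sketch is only that the mark test comes from the character database rather than from a list of code points, so newly relevant diacritics need no code change.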
Comment 15 Commit Notification 2013-09-06 14:01:54 UTC
Laszlo Nemeth committed a patch related to this issue.
It has been pushed to "libreoffice-4-1-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b2cec9e9313e7bf7912e4e43a80f843019924255&h=libreoffice-4-1-2

fdo#68313 fix combining diacritics problem with Graphite fonts


It will be available already in LibreOffice 4.1.2.

Comment 16 László Németh 2013-09-06 14:52:36 UTC
(In reply to comment #14)
Martin, my bad patches have been replaced by this one:
http://cgit.freedesktop.org/libreoffice/core/commit/?id=58e1112a6a974b96bb8595e3ee9d08e915d4fd14
It seems this works well with both f ligatures and combining diacritical marks, but special scripts will likely depend on Unicode/ICU line breaking. A possible fix is to add optional context based on the language of the line. Unfortunately, I don't know of any examples of context dependency. I tried to test the patch with SIL Padauk, but my build doesn't recognize the installed font (so I added a "FIXME" to the patch). I will close this issue for Unicode combining diacritical marks, but I will open a new one for other context-dependent Graphite font problems in LibreOffice, if they exist. Thanks for your help!
Comment 17 Commit Notification 2013-09-06 15:41:11 UTC
Laszlo Nemeth committed a patch related to this issue.
It has been pushed to "libreoffice-4-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bb09c7c14bb1caf7b08b39944bda61382b158c64&h=libreoffice-4-1

fdo#68313 fix combining diacritics problem with Graphite fonts


It will be available in LibreOffice 4.1.3.

Comment 18 Commit Notification 2013-09-06 15:41:35 UTC
Laszlo Nemeth committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ac8df424f1ef09d78ae76f98fdbbf58c0dae24bd&h=libreoffice-4-0

fdo#68313 fix combining diacritics problem with Graphite fonts


It will be available in LibreOffice 4.0.6.

Comment 19 Justin L 2013-10-07 09:23:59 UTC
Thanks.  Confirming that this is fixed in the released 4.1.2.3
Comment 20 László Németh 2013-10-07 10:50:46 UTC
(In reply to comment #19)
> Thanks.  Confirming that this is fixed in the released 4.1.2.3

Thanks for your feedback as well. Sorry for the inconvenience!