Bug 95060 - Combining Diacritics will not stack when exported to PDF.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: graphics stack
Version (earliest affected): 4.5.0.0.alpha0+ Master
Hardware: Other macOS (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:5.3.0
Keywords: filter:pdf, regression
Depends on: HarfBuzz
Blocks: PDF-Export
Reported: 2015-10-14 16:02 UTC by rand_burgett
Modified: 2016-11-06 18:35 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
odc file showing the diacritic problem (72.34 KB, application/vnd.oasis.opendocument.text)
2015-10-14 16:02 UTC, rand_burgett
test kit with ODT and several PDFs (156.62 KB, application/zip)
2015-10-14 23:38 UTC, V Stuart Foote
LO bundled Gentium font seems to combine diacritics correctly (49.74 KB, image/png)
2015-10-15 21:06 UTC, V Stuart Foote
testkit for Chris (292.41 KB, application/zip)
2016-02-23 15:12 UTC, V Stuart Foote
pdf file exported from 5.2.2.2 with diacritics messed up.(lines 3-7) (52.46 KB, application/pdf)
2016-10-10 18:34 UTC, rand_burgett

Description rand_burgett 2015-10-14 16:02:36 UTC
Created attachment 119611
odc file showing the diacritic problem

LibreOffice positions and stacks combining diacritics perfectly within Writer and Draw. However, when the page is exported to PDF, any letter that has more than one diacritic gets all of its diacritics printed on top of each other rather than stacked as they should be.  The result is that the words are totally unreadable in all languages that use that type of letter.

I have attached an .odt file that, when opened in Writer, shows the diacritics displayed correctly using the Arial font.  The document also has an image of what it looks like after being exported to PDF. In that image, the diacritics are messed up wherever a letter has more than one diacritic.

Diacritics use anchor points in the font to position themselves correctly. The PDF export is ignoring these anchor points when exporting so they end up all piled up on top of each other.
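
A minimal sketch of the anchor-point idea described above, for readers unfamiliar with it. All glyph names and coordinates here are hypothetical, and this is only a rough model: real fonts store anchors in the OpenType GPOS table (mark-to-base and mark-to-mark lookups) and a real shaper does considerably more. The second function only imitates the reported symptom; it is not a claim about what the export code actually did.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Glyph:
    name: str
    # Point where the next mark above should attach (hypothetical font units).
    attach_top: Tuple[int, int]
    # Point on this glyph that is placed onto the attach point below it.
    mark_anchor: Tuple[int, int] = (0, 0)


def stack_marks(base, marks):
    """Return (name, dx, dy) for each mark, stacking each on the previous one."""
    placed = []
    ax, ay = base.attach_top
    for mark in marks:
        mx, my = mark.mark_anchor
        # Shift the mark so its own anchor lands on the current attach point.
        dx, dy = ax - mx, ay - my
        placed.append((mark.name, dx, dy))
        # The next mark attaches to this mark's top anchor (mark-to-mark).
        tx, ty = mark.attach_top
        ax, ay = dx + tx, dy + ty
    return placed


def pile_marks(base, marks):
    """Imitate the symptom: every mark is attached to the base anchor only."""
    ax, ay = base.attach_top
    return [(m.name, ax - m.mark_anchor[0], ay - m.mark_anchor[1]) for m in marks]


base = Glyph("a", attach_top=(250, 500))
acute = Glyph("acutecomb", attach_top=(0, 200))
tilde = Glyph("tildecomb", attach_top=(0, 180))

stacked = stack_marks(base, [acute, tilde])
piled = pile_marks(base, [acute, tilde])
print(stacked)  # the tilde sits above the acute
print(piled)    # both marks share one position
```

With mark-to-mark attachment the second diacritic is offset by the height of the first; without it, both land on the same spot, which matches the overlap visible in the attached image.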

Combining diacritics are marks that sit above or below letters and help that letter to form a new letter. They are used very extensively in different languages throughout the world, and PDF export cannot be used for any of those languages unless diacritics are positioned correctly.
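
For illustration, a small sketch of how a letter with two combining diacritics is stored as text. The sample string is hypothetical and is not taken from the attached document; it uses only Python's standard `unicodedata` module.

```python
import unicodedata


def describe(text):
    """Return (code point, name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]


# Base letter "a" followed by two combining marks that both sit above it.
sample = "a\u0301\u0303"

for row in describe(sample):
    print(row)

# A non-zero combining class identifies a combining mark.
mark_count = sum(1 for ch in sample if unicodedata.combining(ch) != 0)
print(mark_count)
```

The text itself only records the order of the marks; where each mark is drawn is decided later by the font and the layout code, which is where this bug lives.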
Comment 1 V Stuart Foote 2015-10-14 23:38:56 UTC
Created attachment 119628
test kit with ODT and several PDFs
Comment 2 V Stuart Foote 2015-10-14 23:45:55 UTC
Confirming on Windows 8.1 Enterprise 64-bit en-US with
Version: 5.1.0.0.alpha1+
Build ID: ec66ad595393312525937b628297cb3494776e1f
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2015-10-13_13:10:33
Locale: en-US (en_US)

As in the attached test kit...

Actually, when printing or exporting the original test document to PDF, the Arial font produces reasonable output of the Combining Diacritical glyph subset.

But the same combined diacritics in Linux Biolinum G, Liberation Sans, and Linux Libertine G are not well spaced on the document canvas, nor when printed, gs printed to PDF, or exported to PDF.

So there is an issue in composing text with combined glyphs. Setting to NEW and pointing to the graphics stack, though I guess printing and PDF would be equally applicable.
Comment 3 rand_burgett 2015-10-15 19:31:37 UTC
If Arial works well on 5.1.0.0.alpha+ then you may not have a problem here.  The reason I specifically used Arial in my example is because Arial has anchor points to lock the diacritics in place if the software uses them.  However, many fonts do not have anchor points in them, so if you add a diacritic to a font without anchor points, then they may float anywhere, because there is no anchor point to lock the diacritic in place.  If you are going to use diacritics you have to use a font that has anchor points.
Comment 4 V Stuart Foote 2015-10-15 19:55:45 UTC
(In reply to rand_burgett from comment #3)
> If Arial works well on 5.1.0.0.alpha+ then you may not have a problem here. 
> The reason I specifically used Arial in my example is because Arial has
> anchor points to lock the diacritics in place if the software uses them. 

Yes, I was using Arial on Windows (8.1 & 10) with current builds of master, and there the composition was good both on canvas and when printing.

> However, many fonts do not have anchor points in them, so if you add a
> diacritic to a font without anchor points, then they may float anywhere,
> because there is no anchor point to lock the diacritic in place.  If you are
> going to use diacritics you have to use a font that has anchor points.

Does anyone know which if any of our bundled fonts have the "anchor points" indicated?  I'd have expected our Linux Biolinum G and Linux Libertine G Graphite fonts would have been well structured for this level of typography composition.
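
One rough way to approach the question above is to look for the OpenType `mark` and `mkmk` features in a font's GPOS table. The sketch below is an assumption-laden illustration, not a LibreOffice tool: the function names are hypothetical, it handles only single-font sfnt files (no TTC collections or WOFF), and it only reports that a feature is declared, not that the anchors are good. Graphite fonts keep their positioning rules in Graphite-specific tables, so this check would not see them.

```python
import struct


def gpos_feature_tags(font_bytes):
    """Return the set of GPOS feature tags in a single-font sfnt (TTF/OTF)."""
    # sfnt header: 4-byte version, then numTables as a big-endian uint16.
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    gpos_offset = None
    for i in range(num_tables):
        # Table record: tag, checksum, offset, length (16 bytes each).
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", font_bytes, 12 + 16 * i
        )
        if tag == b"GPOS":
            gpos_offset = offset
            break
    if gpos_offset is None:
        return set()
    # GPOS header: major, minor, scriptList, featureList, lookupList (uint16 each).
    feature_list_rel = struct.unpack_from(">H", font_bytes, gpos_offset + 6)[0]
    feature_list = gpos_offset + feature_list_rel
    feature_count = struct.unpack_from(">H", font_bytes, feature_list)[0]
    tags = set()
    for i in range(feature_count):
        # Feature record: 4-byte tag followed by a 16-bit offset.
        tag = struct.unpack_from(">4s", font_bytes, feature_list + 2 + 6 * i)[0]
        tags.add(tag.decode("ascii", "replace"))
    return tags


def has_mark_positioning(font_bytes):
    """Return (has_mark, has_mkmk); 'mkmk' is what stacking marks relies on."""
    tags = gpos_feature_tags(font_bytes)
    return ("mark" in tags, "mkmk" in tags)


# Usage (the path is a placeholder):
# with open("SomeFont.ttf", "rb") as f:
#     print(has_mark_positioning(f.read()))
```

A font that declares `mark` but not `mkmk` could place a single diacritic correctly and still fail to stack a second one, so both flags are worth checking.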
Comment 5 V Stuart Foote 2015-10-15 21:06:38 UTC
Created attachment 119653
LO bundled Gentium font seems to combine diacritics correctly

(In reply to V Stuart Foote from comment #4)
> Does anyone know which if any of our bundled fonts have the "anchor points"
> indicated?  I'd have expected our Linux Biolinum G and Linux Libertine G
> Graphite fonts would have been well structured for this level of typography
> composition.

While we don't bundle the Graphite version, the Gentium Basic and Gentium Book Basic fonts both compose the diacritics pretty well--on screen canvas and when printed or exported. Attached clip is from Export to PDF.
Comment 6 raal 2015-10-21 16:40:59 UTC
Regression; works correctly in LO 4.4.2.2 on Linux.
Comment 7 raal 2015-10-22 08:31:55 UTC
bibisect-win32-5.0: the oldest version contains the bug too.
git checkout oldest: Version: 4.5.0.0.alpha0+
Build ID: 57d6b92b69a31260dea0d84fcd1fc5866ada7adb
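
For context, a sketch of the binary search that a bibisect run performs over an ordered list of builds. The function and its arguments are hypothetical, and it assumes that once a build is bad, every later build is bad too. It also shows why the result above is a dead end: if the oldest build in the range is already bad, the range contains no good build to search from and an older repository is needed.

```python
def first_bad(builds, is_bad):
    """Return the index of the first bad build in an ordered list.

    Returns 0 if even the oldest build is bad (no good build in range)
    and None if the newest build is still good (nothing to find).
    """
    if is_bad(builds[0]):
        return 0
    if not is_bad(builds[-1]):
        return None
    good, bad = 0, len(builds) - 1
    # Invariant: builds[good] is good and builds[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid
        else:
            good = mid
    return bad


# Hypothetical example: ten builds, the regression arrives in build 6.
builds = list(range(10))
print(first_bad(builds, lambda b: b >= 6))
```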
Comment 8 Chris Sherlock 2016-02-20 06:26:03 UTC
I'll try to bibisect on the 4.4 series. Unfortunately there are problems with an international cable to Australia causing severe slowdowns in downloading bibisect-44max.tar.xz
Comment 9 Chris Sherlock 2016-02-21 22:09:31 UTC
I've bibisected through the 4.4.x series, and I can't ever get it to export the document correctly!
Comment 10 Chris Sherlock 2016-02-21 22:45:16 UTC
No, I take that back. It seems that the PDF viewer on Ubuntu is the problem. Drat.
Comment 11 Chris Sherlock 2016-02-22 02:45:05 UTC
OK, this is an odd one. I've tested this on master: the diacritic doesn't stack, but instead appears to the right of the lower diacritic. When I export to PDF and use xdg-open on the PDF, same deal. When I open that same PDF on OS X, the diacritics are stacked correctly. 

Can you export that odt file as a PDF and attach the PDF again on the latest version of LO?
Comment 12 V Stuart Foote 2016-02-23 15:12:14 UTC
Created attachment 122908
testkit for Chris

@Chris, *

Not sure if you meant me, or @raal--anyhow attached an ODT (added a couple fonts beyond the odt in attachment 119628) and PDF exports -- 5.1.1.1 and a master from 2016-02-22 (with and without OpenGL)

Stuart
Comment 13 Xisco Faulí 2016-09-11 21:56:23 UTC Comment hidden (obsolete)
Comment 14 Xisco Faulí 2016-10-10 11:13:41 UTC Comment hidden (obsolete)
Comment 15 rand_burgett 2016-10-10 18:24:49 UTC
I just got a notice that this was resolved.  It is not resolved; it is still doing the same thing it did when I first created the bug.

I am running LibreOffice version 5.2.2.2 on OS X El Capitan version 10.11.6.

Steps to reproduce:
1. Load the attached file into Writer:  "odc file showing diacritic problem"
2. Read the file so you know and can see what is happening.
3. Notice all the diacritics look good in LibreOffice in the top list. (lines 3-7)
4. In the Writer toolbar click the "Export as PDF" button.
5. Open the newly created pdf file in any pdf reader.
6. Notice that the diacritics are now on top of each other and all messed up in the top list. (lines 3-7)

From Writer, if you print and then create a PDF from the print dialog, it works perfectly without messing up the diacritics.  It is only the "Export as PDF" that messes them up.
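
The export step above can also be scripted, which may help when comparing builds. This is a sketch under assumptions: it uses LibreOffice's `--headless --convert-to pdf --outdir` command-line options, the helper names and file names are hypothetical, and it is assumed (not verified here) that headless conversion goes through the same PDF export path as the toolbar button.

```python
import shutil
import subprocess
from pathlib import Path


def export_command(document, outdir, soffice="soffice"):
    """Build a headless LibreOffice command that converts a document to PDF."""
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(document),
    ]


def export_pdf(document, outdir, soffice="soffice"):
    """Run the conversion if soffice is on PATH; return the PDF path or None."""
    if shutil.which(soffice) is None:
        return None  # LibreOffice binary not found
    subprocess.run(export_command(document, outdir, soffice), check=True)
    return Path(outdir) / (Path(document).stem + ".pdf")


# Placeholder file names; nothing is executed here beyond building the list.
print(export_command("diacritics.odt", "out"))
```

On macOS the binary usually lives inside the application bundle rather than on PATH, so the `soffice` argument would need to point at it explicitly.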
Comment 16 rand_burgett 2016-10-10 18:34:58 UTC
Created attachment 127930
pdf file exported from 5.2.2.2 with diacritics messed up.(lines 3-7)

Comment 17 Buovjaga 2016-10-24 11:53:57 UTC
Correcting status to NEW.
Comment 18 Xisco Faulí 2016-10-24 14:52:38 UTC
I can't reproduce it in

Version: 5.2.0.0.alpha1+
Build ID: 5b168b3fa568e48e795234dc5fa454bf24c9805e
CPU Threads: 4; OS Version: Linux 4.2; UI Render: default; 
Locale: ca-ES (ca_ES.UTF-8)

or

Version: 5.3.0.0.alpha0+
Build ID: 8974b0fafb18f9dd3f2c0e175a3255b80e4c249e
CPU Threads: 4; OS Version: Linux 4.2; UI Render: default; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

has it been fixed in master?
Comment 19 Dennis Roczek 2016-10-24 15:27:53 UTC
Version: 5.2.2.2
Build ID: 8f96e87c890bf8fa77463cd4b640a2312823f3ad
CPU Threads: 4; OS Version: Windows 6.2; UI Render: default; 
Locale: de-DE (de_DE); Calc: group

same here: NOREPRO

As the bug was already independently confirmed, it seems to have been fixed in the meantime.

@rand: next time, please indicate more clearly that the box is a screenshot. I was confused at first, as I was the only one who could still confirm it. ;-)
Comment 20 V Stuart Foote 2016-10-24 15:40:20 UTC
See the ODT document in attachment 122908

On Windows 10 Pro 64-bit (1607) en-US with
Version: 5.2.2.2 (x64)
Build ID: 8f96e87c890bf8fa77463cd4b640a2312823f3ad
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; 
Locale: en-US (en_US); Calc: group

The composition on Export to PDF matches composition on the Writer canvas.

On Windows 10--Arial and SegoeUI both compose the combining diacritics correctly.  The on canvas composition is affected by use of CPU, default GPU, or OpenGL rendering--but the export to PDF is correct.

Fonts other than Arial and SegoeUI do not combine/position the diacritics correctly on document canvas or on export to PDF--they overlap.

So, the composition is correct--but as in comment 3 we have issues with the font metrics.

@Khaled, thoughts? Should this be OS X only? Also, with the 5.3.0/master SAL_USE_COMMON_LAYOUT active the shaping gets clobbered.
Comment 21 Alex Thurgood 2016-10-24 15:44:07 UTC
Confirming PDF export problem with test document on

Version: 5.2.1.2
Build ID: 31dd62db80d4e60af04904455ec9c9219178d620
CPU Threads: 8; OS Version: Mac OS X 10.12; UI Render: default; 
Locale: fr-FR (fr_FR.UTF-8); Calc: group
Comment 22 Alex Thurgood 2016-10-24 15:45:23 UTC
I see what Rand sees in his PDF export: the stacked diacritics are overlaid on each other.
Comment 23 ⁨خالد حسني⁩ 2016-10-24 23:36:29 UTC
I can confirm the original issue on macOS, and also confirm that it is fixed with bug 89870.

I’m marking this macOS only; if someone has a similar bug on other platforms it should be reported separately, since the code path (before bug 89870) is likely different.