Bug 51957 - When using .otf (OpenType PostScript) fonts, em-dashes become en-dashes in exported .pdf files
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 3.4.0 release (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Caolán McNamara
Whiteboard: target:3.7.0 target:3.6.1
Reported: 2012-07-10 18:36 UTC by Roman Eisele
Modified: 2012-07-26 17:27 UTC

Attachments
Test kit for bug 51957 (.odt file, screenshot, .pdf files) (664.50 KB, application/zip)
2012-07-10 18:41 UTC, Roman Eisele
Another test kit, confirming that the bug is reproducible with Writer on Windows (233.97 KB, application/zip)
2012-07-11 14:19 UTC, Roman Eisele
Calc test kit (495.11 KB, application/zip)
2012-07-11 15:36 UTC, Roman Eisele

Description Roman Eisele 2012-07-10 18:36:23 UTC
At least on MacOS X 10.6.8 (Intel), German UI,
all LibreOffice versions I can test:
* LibreOffice 3.4.0, OOO340m1 (Build:12)
* LibreOffice 3.4.6, OOO340m1 (Build:602)
* LibreOffice 3.5.5.3 (Build-ID: 7122e39-92ed229-498d286-15e43b4-d70da21)
* LibreOffice 3.6.0.0.beta3 (Build ID: 3e2b862),
and also
* Apache OpenOffice 3.4.0, AOO340m1 (Build: 9590) - Rev. 1327774
suffer from the following problem.

Steps to reproduce:
1) Create a simple Writer (.odt) document
2) Type some text containing some em-dashes ('—', Unicode: U+2014),
   and, for comparison, some en-dashes ('–', Unicode: U+2013).
3) Set the font of the text to a font in .otf font format
   (an OpenType font with PostScript curves). Freely available .otf fonts
   to test with include (I list only some high-quality fonts!):
   -- Alegreya and Alegreya SC
   -- Asana Math
   -- EB Garamond
   -- Linux Biolinum O
   -- Linux Libertine O
   -- Latin Modern Roman
   -- Latin Modern Sans
   -- Old Standard
4) Save the document in .odt format (optional?!).
5) Export the document to a .pdf file.
6) Open the .pdf file. You will notice:
   * that every em-dash has been replaced with an en-dash
   * that there is a bigger space after every em-dash which 'compensates'
     for the 'missing length' (in proportional fonts, an em-dash
     is always longer than an en-dash);
   * this shows that the metrics (width) have been taken from the real em-dash,
     but the glyph is just an en-dash.
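
Before testing, it can help to confirm that the sample text really contains U+2014 and U+2013 and not, say, two hyphens. The following is a minimal sketch using only Python's standard library; the function name describe_dashes and the sample string are made up for illustration and are not part of the test kit.

```python
import unicodedata

def describe_dashes(text):
    """Return (index, code point, Unicode name) for every en or em dash in text."""
    dashes = {"\u2013", "\u2014"}  # EN DASH and EM DASH
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in dashes
    ]

# Hypothetical sample line containing one en-dash and one em-dash.
sample = "pages 3\u20135 \u2014 see above"
for entry in describe_dashes(sample):
    print(entry)
```

This only checks the source text; whether the exported PDF shows the right glyph still has to be verified by eye, as in step 6.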

I have done some more testing which makes clear that the language settings of the document and of the individual lines of text don't have any influence on this issue. The same is true for the paragraph alignment (left or justified makes no difference).

To show that *all* .otf fonts (OpenType fonts with PostScript curves) are affected, but *only* .otf fonts, I have created a simple sample document which I will attach to this bug report. While it looks fine in all LibreOffice versions, and shows that all fonts I have used in this sample include a valid em-dash glyph, on PDF export all em-dashes in lines that use .otf fonts get converted to en-dashes, while em-dashes in fonts of other formats (.ttf, .ttc, even Apple's .dfont) are exported correctly as em-dashes.

The most obvious lines in my sample are the lines using Linux Biolinum and Linux Libertine: the em-dashes which use LibreOffice's internal "Linux Biolinum G" and "Linux Libertine G" (Graphite) fonts are exported correctly, but in the lines which use the .otf versions of those fonts, "Linux Biolinum O" and "Linux Libertine O", the em-dashes are converted to en-dashes in the .pdf file. In LibreOffice, however, these lines look absolutely identical, as they should.

Why is this an important bug? Because .otf fonts are used especially in professional typesetting and by professional users; most commercial type foundries, such as Adobe, distribute their professional fonts in .otf format, and it is bad that these expensive fonts are handled so poorly by LibreOffice that even the em-dashes don't work on PDF export. And even if we don't care about professional and expensive fonts, there is an increasing number of high-quality free fonts (see my list above) that are available in .otf format.
Comment 1 Roman Eisele 2012-07-10 18:41:26 UTC
Created attachment 64080
Test kit for bug 51957 (.odt file, screenshot, .pdf files)

A simple test kit for this bug, created on MacOS X 10.6.8, including a Writer (.odt) file which includes em-dashes in different fonts, a screenshot which shows that all fonts look fine in Writer, and some .pdf files exported with different LibreOffice versions.
Comment 2 Roman Eisele 2012-07-11 14:19:10 UTC
Created attachment 64111
Another test kit, confirming that the bug is reproducible with Writer on Windows

This bug is not limited to MacOS X; it is reproducible in the same way with LibreOffice 3.5.4.2, German UI, on Windows XP.

The attachment again contains:
* an .odt document with lines of text including em-dashes in different fonts,
* a Windows screenshot which shows that all em-dashes look OK in LibreOffice,
* and a PDF file created on Windows XP showing again that every em-dash
  which uses a font in .otf format is converted to an en-dash in the PDF file.

NB: to view the .odt files in both of my test kits, you may need to install (some of) the fonts which I have used. This should be no problem, because many of them are freely available on the web (see the list in my original description above); just be sure to install the .otf version of these fonts (some of them are also available in other formats). If you have any problems getting the fonts to test with, just contact me. But even without installing the fonts, everybody can compare the PDF files with the screenshots ;-)
Comment 3 Roman Eisele 2012-07-11 15:36:14 UTC
Created attachment 64116
Calc test kit

And this bug is not limited to PDF export from Writer -- it is also present in Calc. Attached you will find a ZIP file containing:

* a simple Calc spreadsheet (.ods) containing samples of different fonts;
* a screenshot made with LibO 3.5.5.3 on MacOS X 10.6.8,
  showing that LibO displays all fonts correctly, including the em-dashes;
* PDF files created from the Calc file with LibO 3.5.5.3 and 3.6 beta 3,
  which both show that all em-dashes with fonts in .otf format
  become en-dashes + following empty space in the PDF export.

Again, the most impressive lines are the ones comparing the output from Linux Biolinum G to Linux Biolinum O and Linux Libertine G to Linux Libertine O: while the lines in the G fonts (.ttf + Graphite) contain correct em-dashes, the lines in the O fonts (.otf fonts) are almost identical, except that the em-dashes become en-dashes.
Comment 4 Not Assigned 2012-07-17 15:35:21 UTC
Caolan McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a9a91490680d2778397b6b08583149e39022e692

Resolves: fdo#51957 typo, endash entered twice, 2nd should be emdash
Comment 5 Caolán McNamara 2012-07-17 15:36:17 UTC
yup, simply that endash was entered twice, second should be emdash
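
To illustrate how a typo of this kind can produce the reported symptom (en-dash glyph, em-dash width), here is a purely hypothetical sketch. It is NOT the LibreOffice source: the table names, indices, and widths below are all assumptions chosen for illustration. The idea is that if a glyph-name table lists "endash" in the slot that belongs to "emdash", the exported name is wrong while the width, looked up separately, is still the em-dash's.

```python
# Hypothetical glyph-name tables, for illustration only.
# These are NOT taken from the LibreOffice source; names and layout are assumptions.
GLYPH_NAMES_BUGGY = ["hyphen", "endash", "endash"]  # typo: last entry should be "emdash"
GLYPH_NAMES_FIXED = ["hyphen", "endash", "emdash"]

# Hypothetical advance widths, indexed the same way as the name tables.
GLYPH_WIDTHS = [333, 500, 1000]

def export_glyph(index, names):
    """Return the (name, width) pair an exporter would emit for a glyph index."""
    return names[index], GLYPH_WIDTHS[index]

def duplicated_names(names):
    """Return names that occur more than once, in order of first repeat."""
    seen, dups = set(), []
    for name in names:
        if name in seen and name not in dups:
            dups.append(name)
        seen.add(name)
    return dups

EM_DASH_INDEX = 2  # hypothetical slot of the em-dash in the tables above
print(export_glyph(EM_DASH_INDEX, GLYPH_NAMES_BUGGY))  # wrong name, em-dash width
print(export_glyph(EM_DASH_INDEX, GLYPH_NAMES_FIXED))  # correct name and width
print(duplicated_names(GLYPH_NAMES_BUGGY))
```

A duplicate check like duplicated_names is one cheap way to catch this class of typo in a table where every name is expected to be unique.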
Comment 6 Not Assigned 2012-07-17 16:02:30 UTC
Caolan McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=733d07a79a46a551e1eba443a81f3cc851d4cb25&g=libreoffice-3-6

Resolves: fdo#51957 typo, endash entered twice, 2nd should be emdash


It will be available in LibreOffice 3.6.1.
Comment 7 Roman Eisele 2012-07-18 09:08:22 UTC
(In reply to comment #6)
> Caolan McNamara committed a patch related to this issue.

Wow! This was fast! Thank you very much for fixing this long-standing issue!
Comment 8 Roman Eisele 2012-07-26 17:27:47 UTC
VERIFIED FIXED with LOdev 3.7.0.0.alpha0+ (Build ID: c549e1e, installation file: master~2012-07-25_02.21.07_LibO-Dev_3.7.0.0.alpha0_MacOS_x86_install_en-US.dmg) on MacOS X 10.6.8 (Intel).

Using the same sample files as before (see the attachments of this bug report), I have verified that this bug is fixed in the current master build -- all em-dashes are now em-dashes even in the exported PDF files. Thank you again!