Bug 118405 - Export as PDF shows gaps in Greek words when using TexGyre fonts
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 6.0.3.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:pdf
Depends on:
Blocks: PDF-Export
Reported: 2018-06-27 06:45 UTC by JesseSteele
Modified: 2019-12-10 04:00 UTC

See Also:
Crash report or crash signature:


Description JesseSteele 2018-06-27 06:45:19 UTC
Description:
Greek text set in some fonts (here, TexGyre Pagella) doesn't render correctly in the exported PDF.

I created a repo with a fully-reproduced problem, example files, and in-depth description here:
https://github.com/JesseSteele/pdf-bug

Steps to Reproduce:
1. Open "Problem example.odt" in Writer
2. Click the icon "Export as PDF"
3. Open the exported PDF and look at the Greek words on pages 3-4: they contain unexpected spaces

Actual Results:
Greek words have unexpected spaces when using the TexGyre Pagella font, but not with the standard Roman font. The unexpected spaces even push letters OUTSIDE the margins, when the words should at least wrap. Seems like double trouble to me.

Expected Results:
Greek words should not have unexpected spaces.


Reproducible: Always


User Profile Reset: No



Additional Info:
Exporting this to PDF in Calligra Words (using 'custom size' paper) does not have this problem. Perhaps Calligra is on to something and "doing it correctly". (This is how I solved the problem and published on Amazon print on demand.)

But, as the repo explains, Calligra messes up pages when exporting from .doc. LibreOffice produces identical results in Export to PDF from both .odt and .doc. Thanks for the consistency, guys, really!

I also had the same problem using lowriter in the terminal.
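
For anyone who wants to repeat the terminal export, here is a minimal sketch in Python. The report only says that lowriter was used in the terminal, so the exact flags below (`--headless`, `--convert-to pdf`, `--outdir`) are an assumption about the invocation, and the helper names `build_export_command` and `export_pdf` are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_export_command(document, outdir="."):
    """Return the headless command line that converts `document` to PDF.

    Assumption: the export was done with LibreOffice's standard
    --convert-to option; the report does not give the exact command.
    """
    return ["lowriter", "--headless", "--convert-to", "pdf",
            "--outdir", outdir, document]


def export_pdf(document, outdir="."):
    """Run the export and return the path where the PDF is expected."""
    if shutil.which("lowriter") is None:
        raise FileNotFoundError("lowriter not found on PATH")
    subprocess.run(build_export_command(document, outdir), check=True)
    return Path(outdir) / (Path(document).stem + ".pdf")


# Usage (needs LibreOffice installed and the file from the repo):
#   export_pdf("Problem example.odt")
```

Passing the command as a list avoids shell quoting problems with the space in "Problem example.odt".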
Comment 1 Julien Nabet 2018-06-27 09:30:50 UTC
Which environment are you on?

Could you give the latest stable LO version, 6.0.5, a try?

If it doesn't work, it would be interesting, just as a test, to know whether the bug is still reproducible on a daily build from the master branch (see https://dev-builds.libreoffice.org/daily/master/Win-x86_64@42/current/).
Comment 2 JesseSteele 2018-06-27 16:26:23 UTC
Yo, Julien,

I'm cool, so I'm using Ubuntu 18.04.

I added your repo and tried 6.0.5. Same problem.

I hacked your W!nd@w$ link and the .deb package didn't install.

But just clone or download the repo and try it yourself with either the .doc or .odt file, and see if you get the spaces after the Greek letters that look like "a" or "w"...

WRONG:
πολλῶ ν

ὑδά των
OR
ὑ δά των

RIGHT:
πολλῶν

ὑδάτων

git clone http://github.com/jessesteele/pdf-bug
Comment 3 JesseSteele 2018-06-27 16:27:56 UTC
You may need TexGyre installed to use real fonts for real publishing...

sudo apt install tex-gyre
Comment 4 Julien Nabet 2018-06-27 16:44:58 UTC
On a Debian x86-64 PC with master sources updated today, I git cloned your repo.

I noticed that the LO PDF showed pairs of words in columns; in the Calligra PDF that's not the case.
Then I opened the odt and saw these same columns.
I exported the file to PDF: same columns too.

I didn't see:
πολλῶ ν
ὑδά των
OR
ὑ δά των

I suppose I missed something but don't know what.
Comment 5 JesseSteele 2018-06-27 17:27:48 UTC
The "columns" are a product of "justify text". In Pagella, each line ends with a different word.

Look on PDF page 9 (labeled page 1)

You will see the little "v" thing by itself. It's not supposed to be. It's been pushed out into the margin where it shouldn't be able to be.

The u da twv should be all one word.

Looking down at PDF pages 10-11 (labeled 2-3) the Greek letters are grouped as all one word as they should be.

You can see this in the file I put in the GitHub repo.
Comment 6 Julien Nabet 2018-06-27 17:58:16 UTC
On "Problem example DOC - via LibreOffice.pdf", 9th page of pdf, I see:
hudatoen/ ὑ δάάτων

whereas in "Problem example DOC - via Calligra.pdf", I see:
hudatoen/ὑδάτων

Ok so Calligra seems ok, not LO.

Strangely, I've opened "Problem example.doc" with 6.0.5.2 LO Debian (testing) package and exported it, I got:
hudatoen/ ὑ δάτων (from copy paste)
but I see:
hudatoen/ὑδάτων
as if the copy paste would add some spaces.

I noticed on Evince that when highlighting "ὑδάτων", "ά" was replaced by a square only on LO PDF export.

In brief, I don't reproduce exactly what you describe, but I see a problem too.

Miklos: thought you might be interested in this one since it concerns PDF export unless it's more about fonts rendering, in that case, Khaled may help here?
Comment 7 JesseSteele 2018-06-27 18:13:16 UTC
Yeah! Julien, you are seeing it, and the other problems.

I can explain. I studied Greek in college and I get the font...

Those accent marks above and under the letters are rendered by fonts similarly to how "ff" might be a single character. I don't know font encoding at the base level, but Unicode might regard ὑ or ά as actually two separate characters (a base letter plus a combining accent), like combo letters in creating fonts.

Dealing with that "correctly" is probably where the problem begins.
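
To make the "one character or two" point concrete, here is a small sketch using Python's standard `unicodedata` module. It only shows that the same Greek word can be stored either with precomposed letters (NFC) or as base letters plus combining marks (NFD). Whether this is what actually trips up the PDF export is a guess, not something confirmed in this report. The helper name `describe` is made up for illustration.

```python
import unicodedata

# "ὑδάτων" written with precomposed letters. The code points are spelled out
# so the example does not depend on how this page was encoded.
word = "\u1f51\u03b4\u03ac\u03c4\u03c9\u03bd"

nfc = unicodedata.normalize("NFC", word)  # precomposed letters
nfd = unicodedata.normalize("NFD", word)  # base letters + combining marks


def describe(text):
    """List each code point in `text` with its official Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Count the code points that are combining marks in the decomposed form.
combining_marks = sum(1 for ch in nfd if unicodedata.combining(ch))

print(len(nfc), len(nfd), combining_marks)  # 6 8 2
for line in describe(nfd):
    print(line)
```

Both forms should look identical on screen, but software that positions glyphs one code point at a time has to attach the two combining marks to their base letters, which is one place where spacing could go wrong.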

Summary:

TexGyre might not do their fonts "correctly", but that Roman default font does.

Calligra got TexGyre to render correctly... BUT, Amazon's print on demand says the Calligra .pdf file is broken...

(Calligra rant: in Calligra, .doc > .pdf messed up the page numbers; .odt > .pdf was great, but Amazon said the file was broken or something. So, not even Calligra is perfect in this.)

That's what I know.

Cheers and kudos all!
Comment 8 JesseSteele 2018-07-05 18:05:03 UTC
For what it's worth, I've had problems with ghostscript processing .pdf files to get CMYK working correctly. Scribus won't import the same files it could before. Command line gs makes a .pdf blank. These don't work:

http://wiki.inkscape.org/wiki/index.php/ExportPDFCMYK
http://zeroset.mnim.org/2014/07/14/save-a-pdf-to-cmyk-with-inkscape/

...Just in case you're using ghostscript and it's giving everyone problems. :-)
Comment 9 Buovjaga 2018-07-15 16:25:59 UTC
NEW as Julien's confirmation was confirmed.
Comment 10 QA Administrators 2019-12-10 04:00:09 UTC
Dear JesseSteele,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug