Bug 85174 - PDF import of special characters (Algebra ) broken
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: Inherited From OOo (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
Whiteboard: target:5.0.0
Reported: 2014-10-18 13:28 UTC by Jouni Järvinen
Modified: 2015-10-25 22:08 UTC

Attachments
2 PNGs showing the same cheat sheet correctly and incorrectly, plus both PDFs (514.09 KB, application/x-xz)
2014-10-18 13:28 UTC, Jouni Järvinen

Description Jouni Järvinen 2014-10-18 13:28:25 UTC
Created attachment 108028
2 PNGs showing the same cheat sheet correctly and incorrectly, plus both PDFs

While trying to answer http://ask.libreoffice.org/en/question/41291/how-to-delete-pdf-pages-using-writer/, I found that LO's PDF import and export support is incomplete, and the problem is always reproducible. The imported PDF looks too messed up to be useful, and when exported again it stays just as messed up as it was after import, as confirmed with Foxit Reader.

The attachment is an XZ-compressed (LZMA2) TAR file. On Windows you need 7-Zip >= 9.22 Beta; on Linux you need xz-utils >= 5.0.0 (older versions might work, but I can't test that).
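
For readers without a recent 7-Zip or xz-utils, the archive can probably also be unpacked with Python's standard library. This is a minimal sketch, assuming Python 3.3 or later (where `tarfile` supports xz); the helper name `extract_tar_xz` and any file names passed to it are hypothetical.

```python
import tarfile
from pathlib import Path


def extract_tar_xz(archive, dest):
    """Unpack an xz-compressed tar archive into dest and return the member names.

    Hypothetical helper for illustration; only use it on archives you trust.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # Mode "r:xz" tells tarfile to decompress the xz (LZMA2) stream itself.
    with tarfile.open(archive, mode="r:xz") as tar:
        names = tar.getnames()
        tar.extractall(dest_path)
    return names
```

Calling `extract_tar_xz("attachment.tar.xz", "out")` would place the PNGs and PDFs under `out/`, where `attachment.tar.xz` stands for wherever the downloaded file was saved.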
Comment 1 Cor Nouws 2014-10-18 20:18:38 UTC
Hi Jouni,

Thanks for your report.
I think the export is incomplete because the import is incomplete, yes?
I see the sheet has lots of Algebra characters.
I guess it has been like this since OOo.

Set to NEW.
Regards,
Cor
Comment 2 Jouni Järvinen 2014-10-18 21:10:02 UTC
Both import and export, at least in the case of that algebra, are incomplete/broken.
Comment 3 Jouni Järvinen 2014-10-18 23:56:36 UTC
Yes, you're right that it's an issue with algebra, because the PDFs I created with Foxit Reader's print-to-PDF plugin all show up correctly.

However, http://www.tldp.org/LDP/intro-linux/intro-linux.pdf doesn't show up correctly, one reason being that Draw uses a different font from the one used by the file.

Btw, the attachment is showing up as 'text/plain', but it's a .tar.xz, no clue about the MIME type.
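
A MIME label such as 'text/plain' is metadata stored with the attachment and need not match the file's contents. The following is a minimal sketch of checking the contents directly, relying only on the fact that an .xz stream begins with the six bytes FD 37 7A 58 5A 00. The helper name `looks_like_xz` is hypothetical, and this is not a description of how Bugzilla assigns MIME types.

```python
import lzma

# Every .xz stream starts with these six magic bytes.
XZ_MAGIC = b"\xfd7zXZ\x00"


def looks_like_xz(data):
    """Return True if the given bytes begin with the xz magic number."""
    return data[:6] == XZ_MAGIC


if __name__ == "__main__":
    sample = lzma.compress(b"example payload")  # default container is .xz
    print(looks_like_xz(sample))
    print(looks_like_xz(b"plain text, not an archive"))
```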
Comment 4 Cor Nouws 2014-10-19 19:09:58 UTC
I think this belongs under a different component.
Comment 5 Cor Nouws 2014-10-19 19:13:09 UTC
It may also be related to bug 84584, where special font rendering is also poor.
Comment 6 vvort 2015-01-20 08:19:18 UTC
Displacement of characters fixed here:
https://gerrit.libreoffice.org/14029

Empty rectangles instead of brackets are a separate issue: problems with the Symbol font.
Comment 7 Commit Notification 2015-01-20 13:56:28 UTC
Vort committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7818c73445ad8e08e7ee51e2cc3a4b4d5a798ac4

fdo#85174 PDF Import: fix character positions

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 8 Caolán McNamara 2015-01-20 15:21:56 UTC
Should the poppler patch also go upstream to poppler? (Especially as distros typically build against the system poppler.)
Comment 9 Philipp Weissenbacher 2015-01-20 23:03:38 UTC
Comment on attachment 108028
2 PNGs showing the same cheat sheet correctly and incorrectly, plus both PDFs

Sorry for the noise. MIME type was wrong.
Comment 10 vvort 2015-01-21 04:33:34 UTC
This change in poppler is specific to the LibreOffice PDF import plugin.
The plugin needs different input data than default poppler provides.
Comment 11 vvort 2015-01-21 05:23:01 UTC
I understand the problem now.
I will try to revert the +getCharSpace change without hacking poppler.
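
The thread does not show the patch itself, so the following is only an illustration of why character spacing matters for glyph placement. It is not the LibreOffice or poppler code. It assumes the text-positioning rule from the PDF specification in simplified form (advance = (glyph width × font size + character spacing) × horizontal scaling, ignoring word spacing and TJ adjustments); `glyph_x_offsets` and its parameters are hypothetical names.

```python
def glyph_x_offsets(widths, font_size, char_space=0.0, h_scale=1.0):
    """Return the x offset of each glyph origin along the text line.

    widths     -- glyph widths in 1/1000 of the font size (as stored in PDF fonts)
    font_size  -- font size in text space units
    char_space -- extra spacing added after every glyph (the PDF Tc parameter)
    h_scale    -- horizontal scaling factor (1.0 means 100%)
    """
    offsets = []
    x = 0.0
    for width in widths:
        offsets.append(x)
        # Each glyph advances the pen by its scaled width plus the character spacing.
        x += (width / 1000.0 * font_size + char_space) * h_scale
    return offsets


if __name__ == "__main__":
    print(glyph_x_offsets([500, 500, 500], font_size=10))
    print(glyph_x_offsets([500, 500, 500], font_size=10, char_space=1.0))
```

An importer that applied the spacing twice, or not at all, would shift each glyph after the first by an error that accumulates along the line.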
Comment 12 vvort 2015-01-21 06:57:32 UTC
Second try:
https://gerrit.libreoffice.org/14066
Comment 13 Commit Notification 2015-01-21 15:33:03 UTC
Vort committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=df54862ec61c81a39b7ccfadc292b5bf859f45fa

fdo#85174 PDF Import: fix character positions without modifying poppler

It will be available in 4.5.0.

Comment 14 Jouni Järvinen 2015-10-25 22:08:35 UTC
My bad, this report slipped from my fingers.

Tested: fixed in 4.4.5, the last 4.x portable version. 5.0.2 RC1 never loads intro-linux.pdf, but that's another issue.