Bug 49431 - Extensioname: PDF import has incompatibility problems with PDF format
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Extensions
Version (earliest affected): 3.5.2 release
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA target:4.3.0
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-03 07:49 UTC by Frank
Modified: 2015-05-30 12:33 UTC

See Also:
Crash report or crash signature:


Attachments
One of many PDF documents that gets screwed up. (117.76 KB, application/pdf)
2012-05-03 07:49 UTC, Frank

Description Frank 2012-05-03 07:49:35 UTC
Created attachment 60973
One of many PDF documents that gets screwed up.

Problem description: 
I import a PDF document into Draw and the document gets screwed up (as expected, since OO has many problems, and correctly importing a PDF across platforms is a difficult and easily underestimated task that is very likely beyond the capabilities of those who claim 100% compatibility).

Steps to reproduce:
1. Start LibreOffice
2. Open the PDF document attached

Current behavior:
Several lines of text are stacked on top of each other.

Expected behavior:
The imported document should look like the original PDF.

Platform (if different from the browser): 
Ubuntu 12.04, LibreOffice 3.5.2.2.
It also happens on Windows 7 with OpenOffice 3.3.0.
Browser: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0
Comment 1 Jean-Baptiste Faure 2012-05-06 09:19:39 UTC
Reproducible with current master (Build ID: 32af02b) and LO 3.5.3.2 under Ubuntu 11.10 x86_64.

Do you have the possibility to choose the version of the PDF format in which the file is exported? For example PDF/A-1a?

Best regards. JBF
Comment 2 Frank 2012-05-06 09:31:43 UTC
(In reply to comment #1)
> Do you have the possibility to choose the version of the PDF format in which
> the file is exported? For example PDF/A-1a?

I am not sure if I understand your question.
I just found this PDF on the web and wanted to edit it.
I neither created nor exported it.
Do you think converting the PDF format to PDF/A-1a could help?
Comment 3 Jean-Baptiste Faure 2012-05-06 09:58:16 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Do you have the possibility to choose the version of the PDF format in which
> > the file is exported? For example PDF/A-1a?
> 
> I am not sure if I understand your question.
> I just found this PDF on the web and wanted to edit it.
> I neither created nor exported it.

Ah, OK. I had assumed that you produced this file yourself with MS Word.

> Do you think converting the PDF format to PDF/A-1a could help?
I do not know if it is possible to convert this PDF 1.6 file to PDF/A-1a. My question was motivated by the fact that PDF 1.6 is, AFAIK, not an ISO-standardized version of the PDF format.
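
For anyone who wants to check which PDF version a file declares, here is a minimal Python sketch. It is not part of LibreOffice or the pdfimport extension, and the function name pdf_header_version is made up for illustration. It only reads the "%PDF-x.y" header comment; a /Version entry in the document catalog can override the header, and this sketch does not look at that.

```python
import re

def pdf_header_version(data: bytes):
    """Return the version string declared in a PDF header, or None."""
    # The "%PDF-x.y" comment should be at the very start of the file.
    # Scanning the first 1024 bytes is a lenient tolerance (an assumption
    # of this sketch, chosen to cope with files that have leading junk).
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None

# Example with an in-memory header similar to the one quoted in this report.
print(pdf_header_version(b"%PDF-1.4\n%\xe4\xfc\xf6\xdf\n2 0 obj\n"))  # prints 1.4
```

To check a real file, pass the result of open(path, "rb").read(1024) to the function.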

Best regards. JBF
Comment 4 Frank 2012-05-07 01:57:16 UTC
I also reported it here: https://issues.apache.org/ooo/show_bug.cgi?id=119312
Comment 5 Rodger 2012-05-14 13:07:34 UTC
Hi!
I have the same problem in LibreOffice 3.5.3.2 under Ubuntu 10.04.

I cannot open any PDF files in LibreOffice, even in Draw. It only shows strange symbols like:
%PDF-1.4
%äüöß
2 0 obj
<</Length 3 0 R/Filter/FlateDecode>>
stream
[compressed binary stream data, displayed as unreadable characters]
endstream

In the File menu, I tried Export as PDF with "Embed OpenDocument file" checked, but it didn't work. I still cannot open any .pdf files.

I looked for the LibreOffice PDF Import extension, but it does not show up under Tools > Extension Manager. I also searched for LibreOffice PDF Import in Ubuntu Software Center, but it is not available there.

I tried to install the OpenOffice PDF import extension, but it didn't work.

I'll try removing LibreOffice and reinstalling the standard version.

I posted this at http://en.libreofficeforum.org/node/3065#comment-16730
Comment 6 Jean-Baptiste Faure 2012-05-14 21:41:52 UTC
Please do not modify the version number: it indicates the earliest version in which the issue was found.

@Rodger: your problem is different. It seems that, in your case, the pdfimport extension is not installed. In this bug report, pdfimport opens the file but the result is messed up.

Best regards. JBF
Comment 7 vvort 2014-03-31 05:08:52 UTC
The text positioning problem is fixed here:
https://gerrit.libreoffice.org/8800
Comment 8 Commit Notification 2014-03-31 15:19:44 UTC
Vort committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=2498e4725e4a8c3a8193e61618d314837d8db180

fdo#49431 PDF Import: Improve line and space detection algorithm



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
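
To illustrate what "line and space detection" means for PDF import in general, here is a minimal Python sketch. It is not the code from the patch above and does not claim to match its algorithm. The Fragment structure, the function name, and both thresholds are assumptions made up for this example. It assumes y grows downward on the page, groups text fragments whose baselines nearly coincide into one line, and inserts a space wherever the horizontal gap between neighbours is wide enough.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    """A run of text with its position on the page (hypothetical structure)."""
    text: str
    x: float       # left edge
    y: float       # baseline position, assumed to grow downward
    width: float   # horizontal extent
    height: float  # font height

def group_into_lines(fragments, y_tolerance=0.3, gap_factor=0.15):
    """Group fragments into text lines and insert spaces at wide gaps."""
    lines = []  # each entry is a list of fragments sharing a baseline
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        last = lines[-1] if lines else None
        # Same line if the baseline differs by less than a fraction of the height.
        if last and abs(frag.y - last[0].y) <= y_tolerance * frag.height:
            last.append(frag)
        else:
            lines.append([frag])

    result = []
    for line in lines:
        line.sort(key=lambda f: f.x)
        text = line[0].text
        for prev, cur in zip(line, line[1:]):
            gap = cur.x - (prev.x + prev.width)
            # A gap wider than a fraction of the font height counts as a space.
            if gap > gap_factor * cur.height:
                text += " "
            text += cur.text
        result.append(text)
    return result

sample = [
    Fragment("world", 33, 100.5, 30, 10),
    Fragment("Hello", 0, 100, 30, 10),
    Fragment("Next", 0, 112, 24, 10),
]
print(group_into_lines(sample))  # prints ['Hello world', 'Next']
```

If the baseline tolerance is too strict, fragments of one visual line end up as separate lines. If it is too loose, distinct lines merge. Either kind of misjudgement can produce badly positioned text after import.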
Comment 9 Samuel Mehrbrodt (CIB) 2014-04-01 09:46:19 UTC
Thanks for your great work, Vort!