Bug 63054 - DRAW FILEOPEN PDF produced by AutoCAD fails with General Error message
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: Inherited From OOo (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: target:5.0.0
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-03 04:18 UTC by Mike Kaganski
Modified: 2015-04-29 07:44 UTC

See Also:
Crash report or crash signature:


Attachments
Testcase (4.01 KB, application/x-zip-compressed)
2013-04-03 04:18 UTC, Mike Kaganski

Description Mike Kaganski 2013-04-03 04:18:08 UTC
Created attachment 77352
Testcase

PDFs created by AutoCAD 2012 cannot be imported into Draw.

Steps to reproduce:
1. Try to import "layers.pdf" or "nolayers.pdf" from the attachment.

Expected result:
The file, which contains a circle, is imported correctly.

Actual result:
"General I/O Error". The file is not imported.

This error does not happen if this file has been processed by a PDF authoring tool, such as PDFill. The attachment contains "layers_pdfill.pdf" and "nolayers_pdfill.pdf" that have been resaved using this tool, and they are imported correctly.
Comment 1 Rainer Bielefeld Retired 2013-04-03 05:11:25 UTC
Already [Reproducible] with a server installation of "LibreOffice 3.3.3 German UI/Locale [OOO330m19 (Build:301) tag libreoffice-3.3.3.1]" on German WIN7 Home Premium (64bit), and with OOo with PDF Import Extension version 1.0.4.

So this PDF import problem is inherited from OOo.

As expected, it is also impossible to insert those documents as OLE objects.

PDF-SAM and GS have no problems with the documents.
Comment 2 Alexandr 2014-08-13 12:21:39 UTC
Reproducible with LibreOffice 4.2.5 and 4.3.1.1 on Debian.
Comment 3 Carlo Strata 2014-08-28 12:10:14 UTC
I confirm this bug also in LibreOffice 4.3.1.2 (final 4.3.1) win32 on 64-bit Windows 8.1, both with Italian UI.

I also tried opening the native AutoCAD PDF files (PDF application: "AutoCAD 2012 - Russian 2012 (18.2s (LMS Tech))", PDF author: "pdfplot10.hdi 10.2.205.0") in Adobe Acrobat Reader (AAR) 11.0.08 win32 and then simply re-saving them with "File -> Save As...". This may be an easy workaround for the moment.

The files saved from AAR seem to import perfectly into the above LibreOffice instance. The layers.pdf file, for example, grows from 1.34 KB (AutoCAD original) to 5.42 KB (saved from AAR), compared to 1.49 KB for the PDFill version.
Comment 4 Mike Kaganski 2015-04-28 15:38:40 UTC
The problem is in pdf_string_parser::operator(), at pdfparse.cxx line 119 (http://opengrok.libreoffice.org/xref/core/sdext/source/pdfimport/pdfparse/pdfparse.cxx#119).
This line is used to skip escaped parentheses inside a string, like

(\))

It pre-increments the scanner to swallow the backslash; after that, the scanner is incremented again (as for any other character) on line 124, which consumes the escaped character.

The operator++ for boost spirit classic scanner does two things:
1. Advances the scanner;
2. Skips whitespace.

So, if the parsed string has this form:

(\ )

i.e. <left parenthesis><backslash><space><right parenthesis>
then the first increment (line 119, condition = backslash) skips TWO characters at once (the backslash and the space), and the next increment skips the closing parenthesis, which was a normal, unescaped one. The string is therefore not terminated where it should be, and parsing continues past it. This gives an "incorrect" PDF structure, i.e. the check in pdfparse.cxx line 575 gives false, and the failure of the whole PDF load propagates up to sfxbasemodel.cxx line 1929 (http://opengrok.libreoffice.org/xref/core/sfx2/source/doc/sfxbasemodel.cxx#1929), where the error is displayed.
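
To make the double skip easier to follow, here is a minimal standalone sketch of the behaviour described above. It is not the code from pdfparse.cxx and does not use boost spirit; Cursor and findClosingParen are hypothetical names, and the toy cursor only imitates an increment that advances and then skips whitespace. The non-skipping mode is included purely for comparison and is not meant to describe what the actual patch does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Toy cursor (hypothetical, NOT the boost spirit classic scanner).
// In skipping mode, step() advances one character and then jumps over
// any whitespace that follows, imitating the increment described above.
class Cursor {
public:
    Cursor(const std::string& text, bool skipSpaceOnStep)
        : text_(text), skip_(skipSpaceOnStep) {}

    bool atEnd() const { return pos_ >= text_.size(); }
    char current() const { return text_[pos_]; }
    std::size_t position() const { return pos_; }

    void step() {
        if (atEnd())
            return;
        ++pos_;
        if (!skip_)
            return;
        while (!atEnd() && std::isspace(static_cast<unsigned char>(text_[pos_])))
            ++pos_;
    }

private:
    const std::string& text_;
    bool skip_;
    std::size_t pos_ = 0;
};

// `body` is the text right after the opening '(' of a PDF literal string.
// Returns the index of the ')' that closes the string, or -1 if the end of
// the input is reached without finding it.
std::ptrdiff_t findClosingParen(const std::string& body, bool skipSpaceOnStep) {
    Cursor cur(body, skipSpaceOnStep);
    int depth = 0;
    while (!cur.atEnd()) {
        const char c = cur.current();
        if (c == ')') {
            if (--depth < 0)
                return static_cast<std::ptrdiff_t>(cur.position());
        } else if (c == '(') {
            ++depth;
        } else if (c == '\\') {
            cur.step();      // swallow the backslash
            if (cur.atEnd())
                break;
        }
        cur.step();          // consume the current (possibly escaped) character
    }
    return -1;
}
```

With the body `\ )` (backslash, space, right parenthesis), findClosingParen("\\ )", true) returns -1, because the step after the backslash also jumps over the space and the following step consumes the real closing parenthesis. In non-skipping mode the same input returns 2. For `\))` both modes return 2, which matches the escaped-parenthesis case the original line was written for.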

I'll try to prepare a patch for that case.
Comment 5 Mike Kaganski 2015-04-28 16:39:02 UTC
A patch is submitted to gerrit: https://gerrit.libreoffice.org/15562
Comment 6 Commit Notification 2015-04-29 07:43:56 UTC
Mike committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=fa4071bb522d7aad069ca24bafedb597455e95b0

tdf#63054: pdf_string_parser incorrectly handles escapes

It will be available in 5.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.