Created attachment 77352
Testcase

PDFs created by AutoCAD 2012 cannot be imported into Draw.

Steps to reproduce:
1. Try to import "layers.pdf" or "nolayers.pdf" from the attachment.

Expected result: The file, which contains a circle, is imported correctly.
Actual result: "General I/O Error". The file is not imported.

The error does not occur if the file has first been processed by a PDF authoring tool such as PDFill. The attachment also contains "layers_pdfill.pdf" and "nolayers_pdfill.pdf", which were resaved with that tool, and they are imported correctly.
Already [Reproducible] with a server installation of "LibreOffice 3.3.3 German UI/Locale [OOO330m19 (Build:301) tag libreoffice-3.3.3.1]" on German WIN7 Home Premium (64bit), and with OOo with PDF Import Extension Version 1.0.4. So this PDF import problem is inherited from OOo. As expected, it is also impossible to insert those documents as OLE objects. PDF-SAM and GS have no problems with the documents.
Reproducible with LibreOffice 4.2.5 and 4.3.1.1 on Debian.
I confirm this bug in LibreOffice 4.3.1.2 (final 4.3.1) win32 on Windows 8.1, 64 bit, both with an Italian GUI. I also tried opening the native AutoCAD PDF files (PDF application: "AutoCAD 2012 - Russian 2012 (18.2s (LMS Tech))", PDF author: "pdfplot10.hdi 10.2.205.0") in Adobe Acrobat Reader (AAR) 11.0.08 win32 and then simply used "File -> Save As..." on them. This may be an easy workaround for the moment: the files re-saved this way seem to import perfectly in the LibreOffice instance above. The layers.pdf file, for example, grows from 1.34 kbyte (AutoCAD original) to 5.42 kbyte (saved from AAR), compared to 1.49 kbyte for the PDFill version.
The problem is in pdf_string_parser::operator(): pdfparse.cxx line 119 (http://opengrok.libreoffice.org/xref/core/sdext/source/pdfimport/pdfparse/pdfparse.cxx#119)

This line is used to skip escaped braces, like (\)). It pre-increments the scanner to swallow the backslash, and after that the scanner is incremented again (as for any other character) on line 124.

The operator++ of the boost spirit classic scanner does two things:
1. Advances the scanner;
2. Skips whitespace.

So, if the parsed string has this form:
(\ )
i.e. <left parenthesis><backslash><space><right parenthesis>
then the first increment (line 119, condition = backslash) skips TWO characters at once (the backslash and the space), and the next increment skips what is actually the ordinary closing parenthesis. The string is therefore not terminated there, and parsing continues past its real end.

This yields an "incorrect" PDF structure, i.e. the check in pdfparse.cxx line 575 gives false, so the whole PDF load fails all the way up to sfxbasemodel.cxx line 1929 (http://opengrok.libreoffice.org/xref/core/sfx2/source/doc/sfxbasemodel.cxx#1929), where the error is displayed.

I'll try to prepare a patch for that case.
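The failure mode described above can be reproduced without Boost. The following is only a minimal sketch, not the code from pdfparse.cxx: ToyScanner, scanStringBody and checkExamples are hypothetical names, and the toy scanner merely imitates the "advance, then skip whitespace" behaviour attributed to the spirit classic scanner. Running the same loop with whitespace skipping turned off is one possible remedy shown for comparison; it is an assumption and not necessarily what the submitted patch does.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a scanner whose increment may also skip
// whitespace. It is NOT the boost spirit classic scanner.
class ToyScanner {
public:
    ToyScanner(std::string text, bool skipWhitespace)
        : text_(std::move(text)), skip_(skipWhitespace) {}

    bool atEnd() const { return pos_ >= text_.size(); }
    char operator*() const { return text_[pos_]; }

    // Advance one character; in skipping mode also swallow any whitespace
    // that follows.
    ToyScanner& operator++() {
        ++pos_;
        while (skip_ && !atEnd() &&
               std::isspace(static_cast<unsigned char>(text_[pos_])))
            ++pos_;
        return *this;
    }

private:
    std::string text_;
    bool skip_;
    std::size_t pos_ = 0;
};

// Scans the body of a PDF literal string, i.e. the text after the opening
// '('. Returns the number of scanner steps taken before the matching ')',
// or -1 if the input ends before the string is closed.
std::ptrdiff_t scanStringBody(const std::string& body, bool skipWhitespace) {
    ToyScanner scan(body, skipWhitespace);
    std::ptrdiff_t len = 0;
    int depth = 0;
    while (!scan.atEnd()) {
        const char c = *scan;
        if (c == ')') {
            if (--depth < 0)
                break;              // matching close parenthesis found
        } else if (c == '(') {
            ++depth;
        } else if (c == '\\') {     // escape: step over the backslash
            ++len;
            ++scan;                 // skipping mode may also eat a space here
            if (scan.atEnd())
                break;
        }
        ++len;
        ++scan;                     // steps over the escaped character
    }
    return scan.atEnd() ? -1 : len;
}

// Small self-check of the two behaviours described in the comment above.
void checkExamples() {
    // Body of "(\ ) Tj": backslash, space, ')', then trailing content.
    assert(scanStringBody("\\ ) Tj", true) == -1);  // ')' swallowed as escaped
    assert(scanStringBody("\\ ) Tj", false) == 2);  // string ends at the ')'
}
```

With skipping enabled, the increment after the backslash lands on the closing parenthesis instead of the space, so the parenthesis is treated as the escaped character and the string never terminates. An escape that is not followed by whitespace, such as the body `a\)b)`, scans identically in both modes.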
A patch is submitted to gerrit: https://gerrit.libreoffice.org/15562
Mike committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=fa4071bb522d7aad069ca24bafedb597455e95b0 tdf#63054: pdf_string_parser incorrectly handles escapes It will be available in 5.0.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.