Description:
Attempting to load the indicated PDF results in a pop-up saying "General Error. General input/output error."
Steps to Reproduce:
1. wget -Otest.pdf http://www.firsttuesday.us/course/Downloads/315.pdf
2. libreoffice test.pdf
Actual Results:
General input-output error
Expected Results:
A Draw document should be opened containing content from the PDF.
Reproducible: Always
User Profile Reset: No
Additional Info:
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:51.0) Gecko/20100101 Firefox/51.0
Confirmed in
- Version: 5.4.0.0.alpha0+ Build ID: 880033edde516fc30225005245253293a6a58ba4 CPU Threads: 4; OS Version: Linux 4.8; UI Render: default; VCL: gtk3; Locale: ca-ES (ca_ES.UTF-8); Calc: group
- Version: 4.4.0.0.alpha0+ Build ID: a5e137eb1d37361c60175e8fba780fc46b377a23
- LibreOffice 3.3.0 OOO330m19 (Build:6) tag libreoffice-3.3.0.4
Created attachment 131304: document
** Please read this message in its entirety before responding **
To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.
There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.
If you have time, please do the following:
Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/
If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.
Please DO NOT:
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)
If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install the oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword
Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa
Thank you for helping us make LibreOffice even better for everyone!
Warm Regards,
QA Team
MassPing-UntouchedBug
still repro in Version: 6.3.0.0.alpha0+ (x64) Build ID: 13a260f59e421f3e67845f8f2eb22b8f0f8fcaf0 CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win; TinderBox: Win-x86_64@42, Branch:master, Time: 2019-03-11_02:46:09 Locale: ru-RU (ru_RU); UI-Language: en-US Calc: threaded
Still reproducible with: Version: 6.3.4.2 Build ID: 1:6.3.4-0ubuntu0.19.10.1 CPU threads: 4; OS: Linux 5.3; UI render: default; VCL: gtk3; Locale: en-CA (en_CA.UTF-8); UI-Language: en-US
This also fails on export.
With libreoffice-7.0.0.0.beta2-lp152.937.1.x86_64, the export works.
With libreoffice-7.0.0.1-941.4.x86_64, the export fails:
// export to epub works
libreoffice --writer --convert-to epub /home/test.docx --outdir /home/
convert /home/test.docx -> /home/test.epub using filter : EPUB
Overwriting: /home/test.epub
// export msword to pdf fails with the following error
libreoffice --writer --convert-to pdf /home/test.docx --outdir /home/
convert /home/test.docx -> /home/test.pdf using filter : writer_pdf_Export
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///home/test.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:3153 /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:1735)
// it appeared the input file was in the wrong place, but switching the order didn't help
libreoffice --writer --convert-to pdf --outdir /home/ /home/test.docx
convert /home/test.docx -> /home/test.pdf using filter : writer_pdf_Export
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///home/test.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:3153 /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:1735)
// export msexcel to pdf fails with the following error
libreoffice --writer --convert-to pdf /home/test.xlsx --outdir /home/
convert /home/test.xlsx -> /home/test.pdf using filter : writer_pdf_Export
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///home/test.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:3153 /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:1735)
// export txt file to pdf fails with the following error
libreoffice --writer --convert-to pdf /home/test.txt --outdir /home/
convert /home/test.txt -> /home/test.pdf using filter : writer_pdf_Export
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///home/test.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:3153 /home/abuild/rpmbuild/BUILD/libreoffice-7.0.0.1/sfx2/source/doc/sfxbasemodel.cxx:1735)
I was able to open the test.pdf file originally reported (using LibreOffice 7.2.0.4 on Mac). There is an error message "This document has an invalid signature." and Show Signatures shows four signatures that can't be found. I am here because I got the same message with one of my own PDF files. I don't want to upload it here because it is 44MB and I need to redact it before sharing. I am less concerned that LibreOffice can't open it than I am that the message is "General Error. General input/output error." and I cannot figure out how to get more details. Nothing is written to stderr/stdout. There is nothing in the system log.
The FILEOPEN Input/Output error is still reproducible on master. Terminal output:
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:681: error got 2 stack objects in parse
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:684: N8pdfparse7PDFFileE
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:689: (type N8pdfparse7PDFFileE)
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:684: N8pdfparse10PDFTrailerE
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:689: (type N8pdfparse10PDFTrailerE)
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:681: error got 2 stack objects in parse
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:684: N8pdfparse7PDFFileE
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:689: (type N8pdfparse7PDFFileE)
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:684: N8pdfparse10PDFTrailerE
warn:sdext.pdfimport.pdfparse:61348:61348:sdext/source/pdfimport/pdfparse/pdfparse.cxx:689: (type N8pdfparse10PDFTrailerE)
./instdir/program/xpdfimport has generated valid output.
(In reply to sfbarbee@gmail.com from comment #6)
The export issue should be reported in a separate bug report, and you should provide a test document for the export.
Can this information be of any use? Unlike my comments on other bug reports, I don't have a clue about this one.
https://opengrok.libreoffice.org/xref/core/sdext/source/pdfimport/pdfparse/pdfparse.cxx?r=776a1b9b#575
==================
parseinfo: stop = xref 0 229 (snip) ýýýý (buff=%PDF-1.6 %âãÏÓ 226 0 obj (snip) , offset = 138357), hit = true, full = false, length = 124498
can I have two trailers in a PDF?
There can be many xref sections in a PDF file. The question is why the sdext.pdfimport.pdfparse code is called at all. We use Poppler to parse the PDF, through the xpdfimport executable, which then generates the tokens used to assemble a Flat ODF document for rendering. Poppler may parse this PDF perfectly well. Why do we parse the PDF structure on our own?
Well, see my observations below (without a solution at this moment):
sdext/source/pdfimport/pdfiadaptor.cxx PDFIRawAdaptor::importer (in line 291) calls:
sdext/source/pdfimport/pdfiadaptor.cxx PDFIRawAdaptor::parse (in line 217), which calls (in line 231):
sdext/source/pdfimport/wrapper/wrapper.cxx xpdf_ImportFromStream (in line 1182)
xpdf_ImportFromStream copies the PDF content to a temp file, because the caller has passed in a file stream and thus xInput.is() is true. I don't think it is necessary to make such a temp file. Why not pass the URL of the original PDF file and use xpdf_ImportFromFile directly? This is a separate issue.
xpdf_ImportFromStream then calls:
sdext/source/pdfimport/wrapper/wrapper.cxx xpdf_ImportFromFile (in line 998), which uses the temp file as the data source.
xpdf_ImportFromFile then calls (in line 1020):
sdext/source/pdfimport/wrapper/wrapper.cxx checkEncryption (in line 891)
(Poppler has encryption checking functionality, so why do we use our own encryption check here? I think it is because we need to show a dialog asking for the password if the file is encrypted. But how about we ask Poppler to check encryption, and if Poppler says the file is encrypted, we provide the password through stdin?)
checkEncryption then calls (in line 901):
sdext/source/pdfimport/pdfparse/pdfparse.cxx pdfparse::PDFReader::read
(There are two such functions, one for win32 and another for the "else" case. I am confused by the #ifdef _WIN32 stuff, but for me on Linux it is in line 608. Interestingly, there is another #ifdef _WIN32 in this block, and my program jumps to line 637 directly.)
Take note of aGrammar:
PDFGrammar< file_iterator<> > aGrammar( file_start );
pdfparse::PDFReader::read calls boost::spirit::classic::parse, which took several seconds (maybe a performance issue here?). But no exception is thrown here:
boost::spirit::classic::parse( file_start, file_end, aGrammar, boost::spirit::classic::space_p );
Then, finally, in line 672 we get nEntries:
unsigned int nEntries = aGrammar.m_aObjectStack.size();
Its value is 2 for this PDF. As a result pRet is not set in the block at line 679, so xpdf_ImportFromFile returns false.
I am not familiar with the boost::spirit::classic::parse stuff, so I am not sure why aGrammar.m_aObjectStack.size() is 2. I don't think I can fix this. I am providing this just FYI.
Below is the portion of this PDF related to the trailer:
  <contents above omitted>
  endstream
  endobj
  xref
  0 4644
  0000000004 65535 f
  0000056752 00000 n
  <multiple xref entries omitted>
  0000004642 65535 f
  trailer
  <</Size 4644/Root 1 0 R>>
  xref
  0 0
  trailer
  <</Size 4644/Prev 4950910/XRefStm 55777/Root 1 0 R/Info 373 0 R/ID[<23394E591A08E64B8237236C314F97F2><64F67A6686A94D0B8F325336D36A35E8>]>>
  startxref
  5043836
  %%EOF
So yes, there are two trailers in this PDF. The first xref section and the first trailer should have been added before the file was "incrementally updated". The second one (i.e. the last one) is the one which should be used for PDF parsing. I note that there is a problem here: the first trailer is not terminated by its own end-of-file (%%EOF) marker, see the citation below.
-----------
Citing the Adobe PDF Reference (third edition):
3.4.5 Incremental Updates
In an incremental update, any new or changed objects are appended to the file, a cross-reference section is added, and a new trailer is inserted. ... The cross-reference section added when a file is updated contains entries only for objects that have been changed, replaced, or deleted, plus the entry for object 0. Deleted objects are left unchanged in the file, but are marked as deleted via their cross-reference entries. The added trailer contains all the entries (perhaps modified) from the previous trailer, as well as a Prev entry giving the location of the previous cross-reference section (see Table 3.12 on page 68). As shown in Figure 3.3, a file that has been updated several times contains several trailers; note that each trailer is terminated by its own end-of-file (%%EOF) marker.
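A quick way to check whether another PDF shows the same pattern is to compare the number of "trailer" keywords with the number of %%EOF markers. The following is only a rough sketch: the names countToken and hasUnterminatedTrailer are made up for illustration, and a plain substring search can be fooled by the same bytes appearing inside stream data, so treat a positive result as a hint rather than proof.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of token in data.
// Hypothetical helper, not part of LibreOffice.
std::size_t countToken(const std::string& data, const std::string& token)
{
    assert(!token.empty()); // an empty token would never advance the search
    std::size_t count = 0;
    for (std::size_t pos = data.find(token); pos != std::string::npos;
         pos = data.find(token, pos + token.size()))
    {
        ++count;
    }
    return count;
}

// Heuristic: more "trailer" keywords than %%EOF markers suggests that at
// least one trailer is not terminated by its own end-of-file marker.
bool hasUnterminatedTrailer(const std::string& pdfBytes)
{
    return countToken(pdfBytes, "trailer") > countToken(pdfBytes, "%%EOF");
}
```

To use it, read the whole file into a std::string in binary mode and pass it to hasUnterminatedTrailer.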
The cause may be that the PDF file has two trailers, the first of which is not terminated by a %%EOF. As a result endTrailer is called only once in "PDFGrammar", which means one m_aObjectStack entry is never removed by pop_back(), and m_aObjectStack finally ends up with 2 entries.
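To make that hypothesis concrete, here is a toy model of the push/pop behaviour. It is an assumption-laden simplification and not the real boost::spirit grammar: it assumes the stack starts with one PDFFile entry, that every "trailer" keyword pushes a PDFTrailer, and that every %%EOF pops one. The function name simulateObjectStack is invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: returns the type names left on the stack after scanning.
std::vector<std::string> simulateObjectStack(const std::string& pdfBytes)
{
    const std::string trailerKw = "trailer";
    const std::string eofKw = "%%EOF";
    std::vector<std::string> stack{ "PDFFile" };
    std::size_t pos = 0;
    while (pos < pdfBytes.size())
    {
        const std::size_t nextTrailer = pdfBytes.find(trailerKw, pos);
        const std::size_t nextEof = pdfBytes.find(eofKw, pos);
        if (nextTrailer == std::string::npos && nextEof == std::string::npos)
            break;
        if (nextTrailer < nextEof) // npos is the largest value, so this is safe
        {
            stack.push_back("PDFTrailer"); // stands in for beginTrailer
            pos = nextTrailer + trailerKw.size();
        }
        else
        {
            if (stack.size() > 1)
                stack.pop_back(); // stands in for endTrailer
            pos = nextEof + eofKw.size();
        }
    }
    assert(!stack.empty()); // the PDFFile entry is never popped
    return stack;
}
```

With two trailers and a single %%EOF, this model ends with PDFFile and PDFTrailer on the stack, which matches the two types printed in the terminal warnings earlier in this report.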
back to new as the patch was abandoned due to license issue.
(In reply to Kevin Suo from comment #17)
> back to new as the patch was abandoned due to license issue.
Comment for the curious: https://gerrit.libreoffice.org/c/core/+/124909/comment/8036ec31_600312f1/
*** Bug 137648 has been marked as a duplicate of this bug. ***
https://gerrit.libreoffice.org/c/core/+/158737
Mike Kaganski committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/ba26d5f5e0529d7accf6f268559b8d659ba7c6c2 tdf#106057: Don't fail PDFReader::read, when several entries in stack It will be available in 24.2.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
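For readers who have not opened the commit, the title suggests the reader no longer gives up when more than one entry is left on the stack. The sketch below only illustrates that idea with stand-in types in a toy namespace; it is not the committed code, and the choice to take the bottom entry of the stack is an assumption made for this example.

```cpp
#include <cassert>
#include <vector>

namespace toy
{
// Stand-ins for the pdfparse entry types; hypothetical, for illustration only.
struct Entry { virtual ~Entry() = default; };
struct File : Entry {};
struct Trailer : Entry {};

// Strict behaviour as described earlier in this report: anything other than
// exactly one leftover entry means no result.
const File* pickStrict(const std::vector<Entry*>& stack)
{
    if (stack.size() != 1)
        return nullptr;
    return dynamic_cast<const File*>(stack.front());
}

// Tolerant behaviour (assumed): accept extra entries and use the bottom one
// if it is a File.
const File* pickTolerant(const std::vector<Entry*>& stack)
{
    if (stack.empty())
        return nullptr;
    return dynamic_cast<const File*>(stack.front());
}

// Usage example: a File plus one unterminated Trailer left on the stack.
void leftoverTrailerExample()
{
    File file;
    Trailer trailer;
    const std::vector<Entry*> stack{ &file, &trailer };
    assert(pickStrict(stack) == nullptr);
    assert(pickTolerant(stack) == &file);
}
} // namespace toy
```

The two functions agree whenever exactly one entry is left, so the tolerant variant changes the outcome only for files like the one in this report.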
Mike Kaganski committed a patch related to this issue. It has been pushed to "libreoffice-7-6": https://git.libreoffice.org/core/commit/1f6eb154d859f28f9523961e7b3901603d69d445 tdf#106057: Don't fail PDFReader::read, when several entries in stack It will be available in 7.6.3. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I confirm it is fixed in master. Thanks for fixing this!! Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community Build ID: d7a5e7643f3540b1490c1e2f1a91ff86c721d7b6 CPU threads: 12; OS: Linux 6.2; UI render: default; VCL: gtk3 Locale: en-US (en_US.UTF-8); UI: en-US Calc: threaded