Created attachment 121582
The document cannot be opened in recent versions of Writer

The attached DOCX was generated by 1C:Enterprise, an extremely popular business CRM that dominates the Russian market and has a huge user base (millions of installations). It opens, edits and saves correctly in Writer 3.x, 4.1.x and 4.2.x. More recent versions (4.3.x and later, including 5.x) fail with "general input/output error". The bug appears in both the Windows and Linux versions (x86/x64).
Unzipping the docx shows:
  inflating: [Content_Types].xml
  inflating: word/document.xml
  inflating: word/settings.xml
  inflating: word/styles.xml
  inflating: word/_rels/document.xml.rels
  inflating: _rels/.rels
but [Content_Types].xml also lists header1.xml, header2.xml, footer1.xml and footer2.xml, none of which are in the archive.
I don't have MS Office; can you confirm that this file opens in Word?
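The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of LibreOffice: it assumes the missing parts are declared as `Override` entries in [Content_Types].xml, and the function name `missing_parts` is made up for illustration.

```python
import io
import sys
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by [Content_Types].xml in OPC packages such as .docx.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def missing_parts(docx):
    """Return part names declared in [Content_Types].xml but absent from the ZIP.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as zf:
        present = set(zf.namelist())
        root = ET.fromstring(zf.read("[Content_Types].xml"))
    # Override elements name individual parts, e.g. PartName="/word/header1.xml".
    declared = [o.get("PartName", "") for o in root.iter(CT_NS + "Override")]
    # ZIP entry names have no leading slash, part names do.
    return [p for p in declared if p.lstrip("/") not in present]


if __name__ == "__main__" and len(sys.argv) > 1:
    for part in missing_parts(sys.argv[1]):
        print("declared but missing:", part)
```

Run against the attachment (for example `python check_parts.py bug96749.docx`, where the script name is arbitrary), it should list the header and footer parts if they are declared this way.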
Anyway, I could reproduce this and noticed the following warning:
warn:writerfilter:20417:1:writerfilter/source/filter/WriterFilter.cxx:214: WriterFilter::filter(): failed with exception Element does not exist and cannot be created: "header1.xml"
I submitted this patch to review: https://gerrit.libreoffice.org/#/c/20993/
*** Bug 91611 has been marked as a duplicate of this bug. ***
1) Yes, MSO 2010 opens it without any warnings.
2) Moreover, even WordPad (bundled with Win7) opens it. The table formatting is crooked, but WordPad is able to edit and save the file.
3) Again, the older versions of Writer (3.x, 4.2.x) are somehow able to open it too.
I've cherry-picked the gerrit change and will test this out.
This patch is working on Linux. The output is as expected with SAL_WARN turned on:

chris@libreoffice-ia64:~/repos/libreoffice$ instdir/program/soffice --writer ~/bug96749.docx
warn:vcl.opengl:4273:1:vcl/opengl/x11/X11DeviceInfo.cxx:356: unknown vendor => blocked
warn:writerfilter:4273:1:writerfilter/source/ooxml/OOXMLDocumentImpl.cxx:773: resolveEmbeddingsStream: exception while resolving stream 20 : Element does not exist and cannot be created: "header1.xml"
warn:writerfilter:4273:1:writerfilter/source/ooxml/OOXMLDocumentImpl.cxx:773: resolveEmbeddingsStream: exception while resolving stream 19 : Element does not exist and cannot be created: "footer1.xml"
warn:writerfilter:4273:1:writerfilter/source/ooxml/OOXMLDocumentImpl.cxx:773: resolveEmbeddingsStream: exception while resolving stream 20 : Element does not exist and cannot be created: "header2.xml"
warn:writerfilter:4273:1:writerfilter/source/ooxml/OOXMLDocumentImpl.cxx:773: resolveEmbeddingsStream: exception while resolving stream 19 : Element does not exist and cannot be created: "footer2.xml"
warn:writerfilter:4273:1:writerfilter/source/dmapper/DomainMapper_Impl.cxx:556: no context of type 1 available
warn:legacy.osl:4273:1:oox/source/helper/storagebase.cxx:67: StorageBase::StorageBase - missing base input stream
warn:sw.uno:4273:1:sw/source/core/unocore/unotext.cxx:2292: Exception when setting property: CharFontName. Message:
warn:sw.uno:4273:1:sw/source/core/unocore/unotext.cxx:2292: Exception when setting property: CharHeight. Message:
warn:sw.uno:4273:1:sw/source/core/unocore/unotext.cxx:2292: Exception when setting property: CharHeightAsian. Message:
warn:sw.uno:4273:1:sw/source/core/unocore/unotext.cxx:2292: Exception when setting property: ParaBottomMargin. Message:
warn:sw.uno:4273:1:sw/source/core/unocore/unotext.cxx:2292: Exception when setting property: ParaLineSpacing. Message:
warn:legacy.osl:4273:1:svx/source/dialog/rulritem.cxx:523: Wrong MemberId!
warn:legacy.osl:4273:1:editeng/source/items/frmitems.cxx:464: unknown MemberId
warn:legacy.osl:4273:1:svx/source/dialog/rulritem.cxx:523: Wrong MemberId!
warn:legacy.osl:4273:1:editeng/source/items/frmitems.cxx:464: unknown MemberId
warn:legacy.osl:4273:1:include/cppuhelper/interfacecontainer.h:479: object is disposed
warn:legacy.osl:4273:1:include/cppuhelper/interfacecontainer.h:479: object is disposed
warn:legacy.osl:4273:1:sw/source/core/attr/format.cxx:227: SwFormat::~SwFormat: Def dependents!
warn:sw.core:4273:1:sw/source/core/attr/format.cxx:236: ~SwFormat: parent format missing from: Paragraph style
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent
warn:legacy.tools:4273:1:basic/source/sbx/sbxobj.cxx:94: Object element with dangling parent

However, something is making it fail on Gerrit on Linux and OS X. Investigating.
Not a regression; I think this has never worked. But you were on the right track. This is a recursive function, so we should just bail out of resolveEmbeddingsStream when we discover that we are trying to resolve a missing header or footer. I'll test the change I made to the gerrit patch and see if it works. Nice bit of troubleshooting there, Julien!
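To illustrate the idea of bailing out of a recursive resolver when a referenced part is missing, here is a minimal Python sketch. It is not the LibreOffice code and does not mirror resolveEmbeddingsStream itself: the function name `resolve_parts` and its return values are hypothetical, and it only assumes the standard OPC layout in which the relationships of `word/document.xml` live in `word/_rels/document.xml.rels`.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by *.rels relationship parts in OPC packages.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def resolve_parts(zf, part, seen=None, missing=None):
    """Recursively follow the relationships of `part` inside an open ZipFile.

    Targets that are not in the archive are recorded in `missing` and
    skipped, instead of raising and aborting the whole import.
    """
    seen = set() if seen is None else seen
    missing = [] if missing is None else missing
    if part in seen:  # guard against relationship cycles
        return seen, missing
    seen.add(part)

    folder, name = posixpath.split(part)
    rels = posixpath.join(folder, "_rels", name + ".rels")
    names = set(zf.namelist())
    if rels not in names:  # this part has no relationships of its own
        return seen, missing

    for rel in ET.fromstring(zf.read(rels)).iter(REL_NS + "Relationship"):
        if rel.get("TargetMode") == "External":
            continue  # hyperlinks and similar targets are not package parts
        target = posixpath.normpath(
            posixpath.join(folder, rel.get("Target", ""))
        ).lstrip("/")
        if target not in names:
            missing.append(target)  # bail out of this branch only
            continue
        resolve_parts(zf, target, seen, missing)
    return seen, missing
```

A caller could log the entries in `missing` as warnings and still load the rest of the document, which is the tolerant behaviour the comment above argues for.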
Ah, I see. My bad, this is a regression (I suppose). The original issue was one we fixed in bug 76356 - a chart in the header or footer of a .docx file got corrupted. In the 4.3 series if the header or footer was missing, then it would just continue. We're a bit more robust in that we actually handle headers and footers more carefully now, but we are a tad too thorough - if it's missing then we give an I/O error to the user. But that's not necessary. Dan, you might want to advise 1C that there is a bug in their .docx export - they are exporting files that refer to non-existent headers for some reason. Anyway, if there are millions of installations I'll see if I can backport this.
OK, so I expected that to work, but it hasn't. Back to the drawing board.
Sorry, I forgot to unassign myself. Chris: perhaps you'd like to keep on with this one.
This seems to have begun at the below commit.
Adding Cc: to sushil_shinde; could you possibly take a look at this one?
Thanks

901d4d3b18ebe50022f95017287ac564fc16410d is the first bad commit
commit 901d4d3b18ebe50022f95017287ac564fc16410d
Author: Matthew Francis <mjay.francis@gmail.com>
Date: Thu May 28 20:29:30 2015 +0800

    source-hash-23b65a84fd827555dfb84c7e2f78879c479c2f78

    commit 23b65a84fd827555dfb84c7e2f78879c479c2f78
    Author: sushil_shinde <sushil.shinde@synerzip.com>
    AuthorDate: Wed Mar 19 18:34:45 2014 +0530
    Commit: Miklos Vajna <vmiklos@collabora.co.uk>
    CommitDate: Sun Mar 23 11:02:16 2014 +0100

    fdo#76356 : Docx file contianing chart in footer/header gets corrupted.

    - Docx file with chart in footer/header or .bin file referred in chart was getting corrupted.
    - Embedded file for footer.xml was not grabbaged.
    - .bin embedded files were not grab baged.
    - Added grab bag support for both case.
    - Added UT to check .bin files are grab baged properly.

    Reviewed on: https://gerrit.libreoffice.org/8674

    Change-Id: I221e3867798fc2a3a42f6385d687e80b80a3678f
(In reply to raal from comment #12)
> This seems to have begun at the below commit.
> Adding Cc: to sushil_shinde ; Could you possibly take a look at this one?
> Thanks
>
> 901d4d3b18ebe50022f95017287ac564fc16410d is the first bad commit
> commit 901d4d3b18ebe50022f95017287ac564fc16410d
> Author: Matthew Francis <mjay.francis@gmail.com>
> Date: Thu May 28 20:29:30 2015 +0800
>
> source-hash-23b65a84fd827555dfb84c7e2f78879c479c2f78
>
> commit 23b65a84fd827555dfb84c7e2f78879c479c2f78
> Author: sushil_shinde <sushil.shinde@synerzip.com>
> AuthorDate: Wed Mar 19 18:34:45 2014 +0530
> Commit: Miklos Vajna <vmiklos@collabora.co.uk>
> CommitDate: Sun Mar 23 11:02:16 2014 +0100
>
> fdo#76356 : Docx file contianing chart in footer/header gets
> corrupted.
>
> - Docx file with chart in footer/header or .bin file referred
> in chart
> was getting corrupted.
> - Embedded file for footer.xml was not grabbaged.
> - .bin embedded files were not grab baged.
> - Added grab bag support for both case.
> - Added UT to check .bin files are grab baged properly.
>
> Reviewed on:
> https://gerrit.libreoffice.org/8674
>
> Change-Id: I221e3867798fc2a3a42f6385d687e80b80a3678f

Sure.
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=9876ffe934a21df1df4a457aa88aa8441243dba9

tdf#96749: deal with missing custom headers/footers in docx

It will be available in 5.3.0.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.