Created attachment 137703 [details] Sample DOCX

I received a file that gave the same error message upon opening as bug 102929 (that error is also mentioned in bug 92731):

'SAXParseException: [word/document.xml line 2]: Attribute w:cstheme redefined.'

The erroneous element looked like this:

<w:rFonts w:cs="Calibri" w:cstheme="minorHAnsi" w:ascii="Calibri" w:hAnsi="Calibri" w:cstheme="minorHAnsi"/>

After I manually removed the duplicate attribute from the file, the document could be opened. I copied the relevant part of the text to a new document, simplified it, and arrived at the following repro case:

- Open the attached sample document.
- Copy the bulleted entry "ABCD", and paste it somewhere else, e.g. above "Title3".
- Save as a different DOCX, and reopen it.

=> You get the error/exception mentioned above.

Reproduced using LO 6.0 daily build (2017-11-06_23:18:19, a5af0fd9f27af42cf2e8571f659cdad6e606215b) and 5.4.3.2 / Windows 7.

Note that the sample file, as well as the fixed original, contains a couple of instances of the following OOXML validation error (no idea if it's related; Word and Writer open the documents fine):

NumberingSymbolRunProperties /word/numbering.xml /w:numbering[1]/w:abstractNum[1]/w:lvl[1]/w:rPr[1] RunFonts The element has unexpected child element
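For anyone who wants to check a DOCX for this kind of corruption without opening it in Writer, here is a minimal Python sketch. It is only an illustration, not part of LibreOffice: the function names find_duplicate_attributes and scan_docx are made up for this example, and it scans the raw XML text with regular expressions because a conforming XML parser rejects a repeated attribute outright (which is exactly the error quoted above).

```python
import re
import zipfile
from collections import Counter

# Matches an opening or self-closing tag and captures its name and raw attribute list.
TAG_RE = re.compile(
    r"<([A-Za-z_][\w:.-]*)((?:\s+[\w:.-]+\s*=\s*(?:\"[^\"]*\"|'[^']*'))*)\s*/?>"
)
# Matches a single name="value" pair and captures the attribute name.
ATTR_RE = re.compile(r"([\w:.-]+)\s*=\s*(?:\"[^\"]*\"|'[^']*')")


def find_duplicate_attributes(xml_text):
    """Return a list of (element_name, [attribute names that occur more than once])."""
    findings = []
    for match in TAG_RE.finditer(xml_text):
        counts = Counter(ATTR_RE.findall(match.group(2)))
        dupes = [name for name, n in counts.items() if n > 1]
        if dupes:
            findings.append((match.group(1), dupes))
    return findings


def scan_docx(path, part="word/document.xml"):
    """Read one XML part from a DOCX (a ZIP archive) and report duplicate attributes."""
    with zipfile.ZipFile(path) as archive:
        xml_text = archive.read(part).decode("utf-8")
    return find_duplicate_attributes(xml_text)
```

Calling scan_docx on the round-tripped file should list the w:rFonts element together with w:cstheme, assuming the part is UTF-8 encoded. The regex approach is deliberately simple and is not a full XML tokenizer.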
Created attachment 137704 [details] Sample DOCX after RT (corrupted)
Confirmed with
Version: 5.1.6.2
Build ID: 07ac168c60a517dba0f0d7bc7540f5afa45f0909
CPU Threads: 2; OS Version: Linux 4.4; UI Render: default;
Locale: en-US (en_US.UTF-8); Calc: single

Confirmed with
Version: 6.0.0.0.alpha1+
Build ID: 4c656c82ccdaa47cf447dfff4147b339b44ea8c1
CPU threads: 2; OS: Linux 4.4; UI render: default; VCL: gtk2;
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-11-11_22:18:01
Locale: en-US (en_US.UTF-8); Calc: single

Technically speaking, bug 102929 is about a hang when opening a corrupt file. With the new mechanism that allows you to continue to open the file, that issue is fixed.
Unconfirmed with Version: 4.3.7.2 under Mint 17.3 x64.
Confirmed with Version: 5.0.6.3 under Mint 17.3 x64.
With Version: 4.4.7.2 it already seems partially broken: on import, the bulleted entry "ABCD" and "Title3" are already gone.
I haven't tried version 4.3.7.2, but in 4.3.0.4 the saved file is already truncated when reopened. It's basically the same issue: back then the parser silently stopped after similar errors, and the error message was added afterwards. With 4.2.0.4, however, saving and reopening works fine.
This is strange... Hard to believe this commit would cause that bug. I did verify it by checking out both this and the preceding commit, though.

Bibisected using repo bibisect-43max. Maybe someone could verify if this is indeed the first commit where things stop working, i.e. the end of the text is truncated (the commit in the repo is 3019488043c54b0e4fe2c91ad1c56e50e81d29cd).

https://cgit.freedesktop.org/libreoffice/core/commit/?id=c2d5b59fc6a3b3fbe20a19282538d5f95fa53301

author Tomaž Vajngerl <tomaz.vajngerl@collabora.com> 2014-04-24 16:39:27 (GMT)
committer Tomaž Vajngerl <tomaz.vajngerl@collabora.com> 2014-04-24 20:51:15 (GMT)

fdo#77089 pass shape dimensions to graphicfilter for WMF
So, apparently this is bibisectable using repo bibisect_win_44, which results in the following range:

https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=fe2b8ef18b11b226fddd1cf3fc7f9133426a1b1a..9de3fd2da6d77da6a7abc105712696f183bf6bc3

Maybe time to revisit this using bibisect-44max on Linux?
No luck with bibisect-44max: the result ended up being a commit named "fix tests", which only concerns unit tests (it is from July 1, so relatively close, though). The results are completely unstable...
A fix in gerrit: https://gerrit.libreoffice.org/44706
Sounds great, thanks Mike! Let me change status to assigned, then.
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e128d83b5e7fd2ceb8d5ec9a346a3b7351be79cc

tdf#113790: skip charfmt grabbag items existing in autofmt grabbag

It will be available in 6.0.0.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-5-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=01632a5ee892ebd2218ad8729738672e02d94697&h=libreoffice-5-4

tdf#113790: skip charfmt grabbag items existing in autofmt grabbag

It will be available in 5.4.4.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
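To illustrate the idea named in the commit title, here is a rough Python sketch. It is not the actual LibreOffice code, which is C++: the names collect_run_attributes and serialize_element and the dictionaries standing in for the two grab bags are hypothetical, and which attribute lived in which grab bag for the sample file is an assumption made only for this example. The point is just that an item from the character-format grab bag is skipped when the autoformat grab bag already provides one with the same name, so the same attribute is never written twice.

```python
def collect_run_attributes(autofmt_grabbag, charfmt_grabbag):
    """Merge two attribute sources into one ordered list without repeating a name.

    Both arguments are plain dicts standing in for the grab bags (hypothetical model).
    """
    attributes = list(autofmt_grabbag.items())
    seen = set(autofmt_grabbag)
    for name, value in charfmt_grabbag.items():
        if name in seen:
            # Already supplied by the autoformat grab bag: skip it.
            continue
        attributes.append((name, value))
        seen.add(name)
    return attributes


def serialize_element(tag, attributes):
    """Render an empty element with the given (name, value) attribute pairs."""
    rendered = " ".join(f'{name}="{value}"' for name, value in attributes)
    return f"<{tag} {rendered}/>"


# Assumed split of the attributes between the two grab bags, for illustration only.
autofmt = {"w:cs": "Calibri", "w:cstheme": "minorHAnsi"}
charfmt = {"w:ascii": "Calibri", "w:hAnsi": "Calibri", "w:cstheme": "minorHAnsi"}
element = serialize_element("w:rFonts", collect_run_attributes(autofmt, charfmt))
```

With this de-duplication step, the element from the original report would come out with a single w:cstheme attribute instead of two.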
*** Bug 102929 has been marked as a duplicate of this bug. ***
*** Bug 96878 has been marked as a duplicate of this bug. ***
*** Bug 92731 has been marked as a duplicate of this bug. ***
*** Bug 118154 has been marked as a duplicate of this bug. ***