Created attachment 131337 [details]
Contains the original document (DOCX), a screenshot of the SAX parse error report (JPG), a screenshot of where I found the error (JPG), and the corrected document.xml (XML) that now opens.
Upon save, Writer messed up the XML, writing the 'w:themeColor' attribute twice in the same element at around character 3119 of line 2 (sorry, I have no idea how one should denote positions in XML).
This caused a SAX parser error, and I had to learn to visually debug and fix an XML file real fast, or my son's teacher would have had to hear the age-old excuse: "sorry my XML converter ate my homework".
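On denoting positions: XML parsers normally report errors as a line and column pair, so "line 2, character 3119" is already the usual form. XML also requires attribute names to be unique within a start tag, which is why a repeated attribute stops the parser. Here is a minimal sketch using Python's standard Expat bindings to print that position. The function name `find_xml_error` and the `w:color` sample element are made up for illustration; the element in the attached file may be a different one.

```python
import xml.parsers.expat as expat


def find_xml_error(xml_bytes):
    """Return (message, line, column) for the first well-formedness
    error in xml_bytes, or None if the data parses cleanly.

    Lines are numbered from 1 and columns from 0, as Expat reports them.
    """
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_bytes, True)  # True: this is the final chunk
    except expat.ExpatError as err:
        return (expat.ErrorString(err.code), err.lineno, err.offset)
    return None


# Illustrative sample only: the same attribute written twice in one tag.
sample = b'<w:color w:themeColor="text1" w:themeColor="text1"/>'
print(find_xml_error(sample))
```

Running it on the `word/document.xml` extracted from a DOCX that fails to open should point at the same place the SAX error dialog does.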
I have included all the related files.
Happy hunting, and keep up the good work.
We are very grateful for LibreOffice.
Opening the file creates the same error, but I don't have enough information to reproduce this from the ground up. Was there a specific type of element that you included in the document that caused the exported DOCX file to become corrupted? Or was this a DOCX file written by another office suite that won't open in LO?
LibreOffice 18.104.22.168 on Manjaro Linux, 64-bit.
Updating status to NEEDINFO.
Hey Thomas, thanks for the speedy reply.
When I wrote that it was Writer that messed up the save, I meant LibreOffice Writer. No external apps were involved, and no special elements were inserted in the document.
If you take the document.xml that I packed and substitute it for the original document.xml in the DOCX file that gives the error, the file opens properly. The only difference between the two document.xml files is that I took out one of the two 'w:themeColor="text1"' tags that Writer inserted immediately after each other within the same element. Please see the highlighted area in the 'actualerrorIthink.jpg' file.
That unnecessary repeat of the tag is what generated the SAX parser syntax error, and is the bug I am reporting.
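For anyone who wants to repeat the substitution without a zip tool: a DOCX is a ZIP archive, and the main document part is stored as `word/document.xml`. The following is a rough sketch with Python's standard `zipfile` module that copies the archive and swaps in one member. The helper name `replace_zip_member` and the file names in the usage comment are assumptions for illustration, not something taken from the attachments.

```python
import zipfile


def replace_zip_member(src_path, dst_path, member, new_data):
    """Copy the ZIP archive at src_path to dst_path, replacing the
    content of one member with new_data (bytes).

    All other members are copied unchanged. The source file is never
    modified, so the broken original stays available for the report.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename == member:
                data = new_data
            else:
                data = src.read(info.filename)
            dst.writestr(info.filename, data)


# Hypothetical usage (paths are placeholders):
# with open("document.xml", "rb") as f:
#     replace_zip_member("broken.docx", "fixed.docx",
#                        "word/document.xml", f.read())
```

Writing to a new file instead of editing in place is deliberate: the corrupt file is the only evidence of the export problem.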
There doesn't seem to be any corrected document.xml file in the attached 7z archive. Would you mind uploading it separately?
Created attachment 131346 [details]
the corrected XML file
Weird, I could have sworn I added it... Here you go, hope it helps.
Could you please attach the original document before doing the roundtrip in Writer?
If I replace document.xml in "Economics project.docx" I don't get the SAX error anymore.
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
Created attachment 131401 [details]
Both the corrupt and the fixed DOCX file included here
Ok, now in the 7z file there are in fact both files, the one generating the SAX parse error, and the 'Fixed' one that is OK.
The difference is that I took out one of the two 'w:themeColor="text1"' tags from the document.xml in the file with the 'Fixed' prefix.
I solved our problem, thank God, but I still wanted to report the bug, because FILESAVE messed up here by writing that tag twice.
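The manual fix described above (deleting the second copy of the tag) could also be scripted. Because the broken file cannot be loaded by an XML parser, this sketch works on the raw text with regular expressions. It is an assumption-laden workaround, not how LibreOffice itself would repair anything: it keeps the first occurrence of each attribute name even if a later copy has a different value, and it assumes no attribute value contains a literal `>`. The name `drop_duplicate_attributes` and the `w:color` sample element are hypothetical.

```python
import re

# A start tag or empty-element tag: '<' not followed by '/', '!' or '?'.
TAG_RE = re.compile(r'<[^<>!?/][^<>]*>')
# One attribute, including its leading whitespace: name="value" or name='value'.
ATTR_RE = re.compile(r'\s+([\w:.-]+)\s*=\s*("[^"]*"|\'[^\']*\')')


def drop_duplicate_attributes(xml_text):
    """Remove repeated attributes inside each tag, keeping the first one."""

    def fix_tag(tag_match):
        seen = set()

        def keep_first(attr_match):
            name = attr_match.group(1)
            if name in seen:
                return ""  # drop the later duplicate
            seen.add(name)
            return attr_match.group(0)

        return ATTR_RE.sub(keep_first, tag_match.group(0))

    return TAG_RE.sub(fix_tag, xml_text)


# Illustrative sample only; the real element in the file may differ.
broken = '<w:color w:val="000000" w:themeColor="text1" w:themeColor="text1"/>'
print(drop_duplicate_attributes(broken))
```

Text between tags is left alone, since only the contents of start tags are rewritten.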
I can't reproduce the problem if I save the 'fixed' document. Could you please attach the original file which gives the SAX parser error after roundtrip?
Please try to open the original 'economics project.docx' - that one gives the error described, and has the duplicate tag.
(In reply to CommodusTheTyrant from comment #10)
> Please try to open the original 'economics project.docx' - that one gives
> the error described, and has the duplicate tag.
That's not the file I meant. We want the original file, the one that gets corrupted after a roundtrip with LibreOffice, so that we can investigate what goes wrong at export time.
ok, so, help me out here - what is a roundtrip?
also - my son typed all morning, and this was his first save.
(the original 'econom...' file)
then he closed Writer, only to find that the file could not be reopened.
so, this file is as original as possible.
(In reply to CommodusTheTyrant from comment #12)
> ok, so, help me out here - what is a roundtrip?
Since no other applications were involved, there was no roundtrip, either.
This will be quite hard to reproduce... a dev might be able to look into the code to see how the tag 'w:themeColor="text1"' could end up twice in the output.
The issue causes data loss, I'm adjusting keywords and importance.
thanks for the explanation, and the correction of the keywords
yes, I figured it was gonna be a developer issue, since I have been able to fix the data loss manually
sorry, I do not have a more original file than the corrupted "economics..." file in the 7z package, since my son saved the new (corrupted) version over the previous version.
(In reply to CommodusTheTyrant from comment #15)
> sorry, I do not have a more original file ...
Thanks for reporting the problem, but without a sample document that allows us to reproduce the problem, a dev won't have any chance of correcting anything. For all we know, it might have already been fixed, so no dev is going to dig around looking for theoretical possibilities. So I'm going to close this issue.
Might be a duplicate of bug 113790, which would mean it's hopefully fixed.