Bug 99950 - Corrupted .docx file (attribute redefined) by comment + recorded changes + formatting
Status: RESOLVED INSUFFICIENTDATA
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.2 all versions
Hardware: All All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard: interoperability
Keywords: dataLoss, filter:docx
Depends on:
Blocks: Track-Changes DOCX-SAXParse
Reported: 2016-05-19 12:46 UTC by Adam Dominec
Modified: 2019-04-17 00:24 UTC


Attachments
Excerpt from the corrupted document (13.98 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-05-19 12:46 UTC, Adam Dominec
File corrected (16.18 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-10-27 20:04 UTC, Julien Nabet

Description Adam Dominec 2016-05-19 12:46:55 UTC
Created attachment 125169 [details]
Excerpt from the corrupted document

A document that had been edited multiple times on various computers, including in MS Word, got corrupted while being saved from LO Writer. When Writer now tries to open the document, the following error is displayed:
"""SAXParseException: '[word/document.xml line 2]: Attribute w.val redefined', Stream 'word/document.xml', Line 2, Column 446548(row, col)."""
MS Word displays a similar error, and both programs fail to open the file.
Please note that the column number does not apply to the attached excerpt; the original file should not be published online, and it is also quite large.

The redefined w:val attribute occurs 4 times in total, all within a single word. That word carries recorded changes: single letters were replaced and the whole word was made italic. In addition, a comment is attached to the word.
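
To locate such redefinitions in a file that an XML parser refuses to load, a plain-text scan can help. The following is only a minimal sketch, not part of the original report: the function names and regular expressions are illustrative assumptions, and the patterns are not a full XML tokenizer (tag-like text inside CDATA sections, for example, would produce false hits).

```python
import re
import zipfile
from collections import Counter

# Hypothetical helper patterns (assumptions, not a complete XML grammar).
# TAG_RE matches a start tag or empty-element tag and captures its attribute list.
TAG_RE = re.compile(
    r'''<([A-Za-z_][\w:.\-]*)((?:\s+[\w:.\-]+\s*=\s*(?:"[^"]*"|'[^']*'))*)\s*/?>'''
)
# ATTR_RE captures the name of each attribute inside that list.
ATTR_RE = re.compile(r'''([\w:.\-]+)\s*=\s*(?:"[^"]*"|'[^']*')''')


def find_redefined_attributes(xml_text):
    """Yield (offset, tag_name, attribute_name) for attributes repeated in one tag."""
    for match in TAG_RE.finditer(xml_text):
        counts = Counter(ATTR_RE.findall(match.group(2)))
        for name, count in counts.items():
            if count > 1:
                yield match.start(), match.group(1), name


def scan_docx(path, member="word/document.xml"):
    """Read one member of a .docx (a ZIP archive) as text and scan it."""
    with zipfile.ZipFile(path) as archive:
        text = archive.read(member).decode("utf-8")
    return list(find_redefined_attributes(text))
```

Because the scan works on raw text rather than a parsed tree, it still runs on a document.xml that triggers the SAXParseException. The reported offsets are character positions in the decoded text and can be compared against the column number in the error message.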

I am reporting this bug for a friend of mine who asked me to repair the broken document. For that reason, I cannot give many details about the problem. I failed to reproduce such a corrupted document in LO 5.1.2.2; that can mean either that this bug got fixed, or that I just did not find the right series of steps.

(Anyway, I hope reporting this problem helps track down the issue, if it has not been fixed yet.)
Comment 1 Cor Nouws 2016-05-19 18:08:45 UTC
Hi Adam,

Thanks for the report. I confirm the problem on a daily build from 20160517.
However, the question is whether this can be solved or prevented, so I set this to NeedInfo.

Ciao - Cor
Comment 2 Julien Nabet 2016-10-27 20:04:07 UTC
Created attachment 128312 [details]
File corrected

After decompressing the file, I ran this command on word/document.xml:
tidy -utf8 -xml -w 255 -i -c -q -asxml
I then recompressed the file and could open it in LO built from master sources updated today.
Could you give it a try?
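
The decompress, tidy, recompress steps in this comment could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually used: the names `rewrite_docx_member` and `tidy_xml` are hypothetical, and `tidy_xml` assumes an HTML Tidy executable named `tidy` on the PATH that reads from standard input when no file is given.

```python
import subprocess
import zipfile


def rewrite_docx_member(src_path, dst_path, transform, member="word/document.xml"):
    """Copy a .docx archive, passing one member's bytes through `transform`."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == member:
                data = transform(data)
            dst.writestr(info.filename, data)


def tidy_xml(data):
    """Run tidy with the flags quoted in the comment; input and output are bytes."""
    result = subprocess.run(
        ["tidy", "-utf8", "-xml", "-w", "255", "-i", "-c", "-q", "-asxml"],
        input=data,
        stdout=subprocess.PIPE,
    )
    # tidy exits non-zero for mere warnings, so only empty output counts as failure here.
    if not result.stdout:
        raise RuntimeError("tidy produced no output")
    return result.stdout
```

A call such as `rewrite_docx_member("broken.docx", "fixed.docx", tidy_xml)` (both file names are placeholders) would write a new archive and leave the original untouched. Whether the result opens in Writer still has to be checked by hand, as was done for the attachment above.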
Comment 3 Timur 2016-11-28 16:45:57 UTC
(In reply to Adam Dominec from comment #0)
> Created attachment 125169 [details]
> Excerpt from the corrupted document
> 
> (Anyway, I hope reporting this problem helps track down the issue, if it has
> not been fixed yet.)
Sorry, no, it doesn't help unless you can attach the source document from MSO as it was before being saved with LO, preferably reduced to a minimal test case that reproduces the problem.
Comment 4 Timur 2016-11-28 16:53:01 UTC
@telesto: no need to link different errors that start with "SAXParseException" unless they are the same, which doesn't seem to be the case here.