Bug 60771 - DOCX broken (reversed) change tracking (with a tool to restore the original change tracking of the main text)
Status: RESOLVED INSUFFICIENTDATA
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 3.4.0 release (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:docx
Depends on:
Blocks: DOCX-SAXParse
Reported: 2013-02-13 08:56 UTC by László Németh
Modified: 2021-12-14 08:00 UTC (History)

See Also:
Crash report or crash signature:


Attachments
DOCX with reversed change tracking in LibreOffice 4.1 (122.57 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-02-13 08:56 UTC, László Németh
The same document without change tracking data of endnotes.xml (121.65 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-02-13 08:57 UTC, László Németh
DOCX with reversed change tracking in LibreOffice 4.1 (122.06 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-02-13 09:01 UTC, László Németh
Screenshot of the reversed change tracking in LibreOffice (52.78 KB, image/png)
2013-02-13 09:03 UTC, László Németh
Screenshot of the fixed change tracking in LibreOffice (57.23 KB, image/png)
2013-02-13 09:04 UTC, László Németh
Original bad.docx with broken document.xml (endnotes data after </w:document>) (122.85 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-02-13 09:15 UTC, László Németh
Python 3 program to remove change tracking from endnotes to fix broken docx files (1.14 KB, text/x-python)
2013-02-13 10:45 UTC, László Németh

Description László Németh 2013-02-13 08:56:28 UTC
Created attachment 74729 [details]
DOCX with reversed change tracking in LibreOffice 4.1

LibreOffice reverses the change tracking in the attached bad.docx (a randomized version of an academic publication) from about the middle of the document onward. Removing the change tracking data (<w:ins> and <w:del> elements) from endnotes.xml fixes the change tracking of the main text; see fixed.docx and the screenshots.
Comment 1 László Németh 2013-02-13 08:57:57 UTC
Created attachment 74730 [details]
The same document without change tracking data of endnotes.xml
Comment 2 László Németh 2013-02-13 09:01:34 UTC
Created attachment 74731 [details]
DOCX with reversed change tracking in LibreOffice 4.1
Comment 3 László Németh 2013-02-13 09:03:35 UTC
Created attachment 74732 [details]
Screenshot of the reversed change tracking in LibreOffice
Comment 4 László Németh 2013-02-13 09:04:15 UTC
Created attachment 74733 [details]
Screenshot of the fixed change tracking in LibreOffice
Comment 5 László Németh 2013-02-13 09:08:51 UTC
Note: I don't know whether this file is a standard DOCX or not, but it was generated by LibreOffice. (In fact, document.xml was corrupt for MS Office, so the garbage XML after </w:document> in document.xml was removed. I will attach that version too, with randomized text.)
Comment 6 László Németh 2013-02-13 09:15:31 UTC
Created attachment 74734 [details]
Original bad.docx with broken document.xml (endnotes data after </w:document>)
Comment 7 László Németh 2013-02-13 09:20:53 UTC
End of the original corrupt document.xml (saved by LibreOffice):

</w:document><w:footerReference r:id="rId4" w:type="default"/><w:pgMar w:bottom="1600" w:footer="1009" w:gutter="0" w:header="0" w:left="1417" w:right="1417" w:top="1417"/></w:sectPr><w:pStyle w:val="style0"/><w:ind w:firstLine="708" w:left="0" w:right="0"/><w:jc w:val="both"/></w:pPr></w:p><w:p><w:pPr><w:pStyle w:val="style37"/></w:pPr><w:r><w:rPr></w:rPr></w:r></w:p></w:endnote><w:endnote w:id="97"><w:p><w:pPr><w:pStyle w:val="style37"/><w:jc w:val="both"/></w:pPr><w:r><w:rPr><w:rStyle w:val="style19"/></w:rPr><w:endnoteRef/><w:tab/></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve">u laDgmm ruo tàrCaeeLbyR5Cpelioée AeuRls v5í :ll: 9 Re d lp AeC’1.Oa.ydlo dY   yd oMraà,enttd </w:t></w:r><w:r><w:rPr><w:i/><w:iCs/><w:lang w:val="fr-FR"/></w:rPr><w:t>Atonci</w:t></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve">rretlE  bo /tT2Izn  Aá5écgM6á.1us  .MM1arbd (ls5é’5e i ,u59i:a n iae .mrssvm:tdMz ,At)9</w:t></w:r><w:r><w:rPr><w:i/><w:iCs/></w:rPr><w:t>Aonitc</w:t></w:r><w:r><w:rPr></w:rPr><w:t xml:space="preserve">/zsu715(n. Ti) 92,5</w:t></w:r></w:p></w:endnote><w:endnote w:id="98"><w:p><w:pPr><w:pStyle w:val="style37"/></w:pPr><w:r><w:rPr><w:rStyle w:val="style19"/></w:rPr><w:endnoteRef/><w:tab/></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t>jUditpizpuuelar/muiq6 ale,aáir uo9,uéee,uDMOéi t7  , snteu xoesC. .rsm i,i9  eni opourtmlMn m21neJtirlIee qx9v  réiq iVl,é</w:t></w:r></w:p></w:endnote><w:endnote w:id="99"><w:p><w:pPr><w:pStyle w:val="style37"/></w:pPr><w:r><w:rPr><w:rStyle w:val="style19"/></w:rPr><w:endnoteRef/><w:tab/></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve">„ </w:t></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve">rónt eangaeh ézglgúelai1”ábloanah. 
r v yg a vaed zoje ié96s0bAgz</w:t></w:r><w:r><w:rPr></w:rPr><w:t>zőlem-smórzpeeeinS   een znbekm ztt,r6,rjt.áeeesiegkf ílekk</w:t></w:r></w:p></w:endnote><w:endnote w:id="100"><w:p><w:pPr><w:pStyle w:val="style37"/></w:pPr><w:r><w:rPr><w:rStyle w:val="style19"/></w:rPr><w:endnoteRef/><w:tab/></w:r><w:r><w:rPr></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:r><w:rPr></w:rPr><w:t>.o631 .</w:t></w:r></w:p></w:endnote><w:endnote w:id="101"><w:p><w:pPr><w:pStyle w:val="style37"/></w:pPr><w:r><w:rPr><w:rStyle w:val="style19"/></w:rPr><w:endnoteRef/><w:tab/></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t>vl50i r2(a196) </w:t></w:r></w:p></w:endnote><w:endnote w:id="102"><w:p><w:pPr><w:pStyle w:val="style37"/></w:pPr><w:r><w:rPr><w:rStyle w:val="style19"/></w:rPr><w:endnoteRef/><w:tab/></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t>ére rpj8e1éatb5,  émITi9 nhgnco.n aúacfi arrv7lge t nie</w:t></w:r></w:p></w:endnote><w:endnote w:id="103"><w:p><w:pPr><w:pStyle w:val="style37"/></w:pPr><w:r><w:rPr><w:rStyle w:val="style19"/></w:rPr><w:endnoteRef/><w:tab/></w:r><w:r><w:rPr></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:r><w:rPr></w:rPr><w:t>-u-ialyaibKtomym</w:t></w:r></w:p></w:endnote></w:endnotes>
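
For readers who want to check another file for the same symptom, the following is a minimal sketch, not part of the original report, that shows what follows the closing root tag of a document part. The function names `split_at_root_end` and `trailing_bytes_in_docx` are hypothetical, and the sketch assumes the part is UTF-8/ASCII-compatible and contains the literal tag `</w:document>`.

```python
import zipfile

# Closing root tag of word/document.xml (assumes the usual "w" prefix).
ROOT_END = b"</w:document>"


def split_at_root_end(data, root_end=ROOT_END):
    """Split raw XML bytes into (part up to the root end tag, trailing bytes)."""
    pos = data.find(root_end)
    if pos == -1:
        raise ValueError("closing root tag not found")
    cut = pos + len(root_end)
    return data[:cut], data[cut:]


def trailing_bytes_in_docx(docx_path, part="word/document.xml"):
    """Return whatever follows </w:document> in the given part, stripped."""
    with zipfile.ZipFile(docx_path) as docx:
        _, tail = split_at_root_end(docx.read(part))
    return tail.strip()
```

A non-empty result from `trailing_bytes_in_docx("origbad.docx")` would indicate stray content like the dump above; the file name is only an example.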
Comment 8 László Németh 2013-02-13 10:45:06 UTC
Created attachment 74742 [details]
Python 3 program to remove change tracking from endnotes to fix broken docx files

Workaround to fix a docx file for LibreOffice. Usage:

python3 fixdocx.py your.docx fixed.docx

Warning: the script removes the change tracking data, together with the inserted and deleted text; use the original file to restore your endnotes.
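
The attached script itself is not reproduced in this thread. As a rough illustration of the approach described, here is a minimal sketch that is not the attachment: it assumes the tracked changes use the `w:` prefix, that endnotes.xml is UTF-8, and that dropping `<w:ins>` and `<w:del>` elements together with their content is acceptable. The names `strip_tracking` and `fix_docx` are hypothetical. It works on the text with regular expressions rather than an XML parser, so it does not require the part to be well-formed.

```python
import re
import zipfile

# Self-closing markers such as <w:ins .../> (e.g. in paragraph-mark properties).
EMPTY_TRACKING = re.compile(r"<w:(?:ins|del)\b[^>]*/>")
# Paired elements <w:ins ...>...</w:ins> and <w:del ...>...</w:del>.
# \b keeps the pattern from matching <w:instrText> or <w:delText>.
PAIRED_TRACKING = re.compile(r"<w:(ins|del)\b[^>]*>.*?</w:\1>", re.DOTALL)


def strip_tracking(xml_text):
    """Remove tracked-change elements, including the text they wrap."""
    xml_text = EMPTY_TRACKING.sub("", xml_text)
    return PAIRED_TRACKING.sub("", xml_text)


def fix_docx(src_path, dst_path, part="word/endnotes.xml"):
    """Copy a DOCX, stripping tracked changes from one part only."""
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == part:
                data = strip_tracking(data.decode("utf-8")).encode("utf-8")
            dst.writestr(item, data)
```

Calling `fix_docx("your.docx", "fixed.docx")` would mirror the usage line above. Because inserted text is dropped along with deleted text, the same warning about keeping the original file applies.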
Comment 9 Joel Madero 2013-02-13 18:05:50 UTC
László, as you are a developer and have provided all those notes, I am just marking this as NEW.

Is this a regression? If so, can we add it as a keyword?
Comment 10 László Németh 2013-02-13 21:26:12 UTC
(In reply to comment #9)
Joel, thanks for that. I believe it is not a regression: LibreOffice 3.4 has the same problem.
Comment 11 Joel Madero 2013-02-13 21:29:49 UTC
Changing the version to reflect the new info :) The comments will show that it's been tested, and NEW shows that it's still a problem.
Comment 12 László Németh 2013-02-13 22:24:19 UTC
(In reply to comment #11)
Joel, thanks for it! :)
Comment 13 László Németh 2013-02-13 22:45:13 UTC
@Miklos, for your information, an interesting DOCX filter issue from my former colleague at the University of Szeged.
Comment 15 Buovjaga 2015-06-15 07:34:16 UTC
(In reply to László Németh from comment #2)
> Created attachment 74731 [details]
> DOCX with reversed change tracking in LibreOffice 4.1

LibO refuses to open:
File format error found at 
SAXParseException: '[word/endnotes.xml line 2]: Extra content at the end of the document
', Stream 'word/endnotes.xml', Line 2, Column 35496
SAXParseException: '[word/document.xml line 2]: unknown error', Stream 'word/document.xml', Line 2, Column 25604(row,col).

Same message for origbad.docx

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 01a189abcd9a4ca472a74b3b2c000c9338fc2c91
TinderBox: Win-x86@39, Branch:master, Time: 2015-06-14_07:46:28
Locale: fi-FI (fi_FI)
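
To see which parts of a package trigger parse errors like the ones above without opening it in LibreOffice, a check along the following lines may help. This is a minimal sketch using Python's standard library; the function name `malformed_parts` is hypothetical, and the line and column numbers reported by Python's parser may not match LibreOffice's exactly.

```python
import zipfile
import xml.etree.ElementTree as ET


def malformed_parts(docx_path):
    """Map each non-well-formed XML part to (message, (line, column))."""
    problems = {}
    with zipfile.ZipFile(docx_path) as docx:
        for name in docx.namelist():
            # Only XML parts and relationship files are parsed.
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                ET.fromstring(docx.read(name))
            except ET.ParseError as err:
                problems[name] = (str(err), err.position)
    return problems
```

An empty dictionary means every XML part parsed cleanly; otherwise the keys name the streams to inspect, such as word/endnotes.xml.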
Comment 17 Telesto 2016-12-12 13:36:40 UTC
Still reproducible:
Version: 5.4.0.0.alpha0+
Build ID: 84f2ff67a7e404febf710b1dc7f66d06745c503f
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-12-09_23:20:01
Locale: nl-NL (nl_NL); Calc: CL
Comment 21 László Németh 2021-12-14 08:00:27 UTC
It's not possible to reproduce the problem without the original file and the steps needed to regenerate the broken document.