Description:
I'm working on a DOCX file that a colleague sent me. The file has input fields. I work with Track Changes enabled. If I modify any input field and save as DOCX, the document can't be opened anymore, or at least everything after the modified section is lost.

Steps to Reproduce:
1. Open the test file
2. Enable track changes and delete "OpenOffice.org" from the text
3. Save as DOCX

Actual Results:
There is no message or warning of a serious problem; you can save the document. But when you try to open it again, you see a SAXException (see screen captures).

Expected Results:
Either
a) the document should open without problems (as it does when you save as ODT), or
b) there should be a warning that saving as DOCX can result in incorrect file contents and that the resulting file may not be opened by LO

Reproducible: Always

User Profile Reset: No

Additional Info:
Version: 7.3.7.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 24; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Ubuntu package version: 1:7.3.7-0ubuntu0.22.04.3
Calc: threaded

But the same happened when I tried with the latest available version (7.6.1 as of today).
Created attachment 189695 [details]
original file

This is the file I got from my colleague; it has input fields.
If you
1. enable track changes
2. modify the input fields
3. save as DOCX
4. close the file

Then you can't open the file anymore.
Created attachment 189696 [details]
the input fields are changed but no problem if saved as ODT

Here I made the changes but saved as ODT. No problem here.
Created attachment 189697 [details] if saved as DOCX, you lose part or all of the document This is the file after making changes and then saving as DOCX. No error was present when saving as DOCX. But if you try to open it you get an Exception (see screen captures in separate attachments)
Created attachment 189698 [details] screen captures step-by-step description with screen captures
(In reply to J22Gim from comment #1)
> Created attachment 189695 [details]
> original file
>
> This is the file I got from my colleague, it has inputs fields.
> If you
> 1. enable track changes
> 2. modify the input fields
> 3. save as DOCX
> 4. close the file
>
> Then you can't open the file anymore.

I can't reproduce on Windows or Linux. I even tried saving to the newer docx format as the original file is 2007 docx.

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9eb419b0b0f019f5fbc48ff1a11977e8b041edee
CPU threads: 2; OS: Windows 10.0 Build 22621; UI render: default; VCL: win
Locale: en-US (en_FI); UI: en-US
Calc: threaded

Arch Linux 64-bit, X11
Version: 7.6.1.2 (X86_64) / LibreOffice Community
Build ID: 60(Build:2)
CPU threads: 8; OS: Linux 6.5; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
7.6.1-1
Calc: threaded
(In reply to J22Gim from comment #0)
> Steps to Reproduce:
> 1. open the test file
> 2. enable track changes and delete "OpenOffice.org" from the text
> 3. save as DOCX
...(2010-365)
4. Reload

I get the SAXException in:

Version: 7.3.7.2 / LibreOffice Community
Build ID: e114eadc50a9ff8d8c8a0567d6da8f454beeb84f
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

and:

Version: 7.5.7.1 (X86_64) / LibreOffice Community
Build ID: 47eb0cf7efbacdee9b19ae25d6752381ede23126
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

However, _not_ reproduced with:

Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 56f7684011345957bbf33a7ee678afaf4d2ba333
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Can you please test again with 7.6.2.1?
Yep, still present:

_________________
An error occurred during opening the file. This may be caused by incorrect file contents.

The error details are:
SAXException: [word/document.xml line 2]: Opening and ending tag mismatch: sdtContent line 2 and del
 at ./sax/source/fastparser/fastparser.cxx:615

Proceeding with import may cause data loss or corruption, and application may become unstable or crash.

Do you want to ignore the error and attempt to continue loading the file?
_______________

and

__________________
File format error found at
SAXParseException: '[word/document.xml line 2]: Opening and ending tag mismatch: sdtContent line 2 and del at ./sax/source/fastparser/fastparser.cxx:615', Stream 'word/document.xml', Line 2, Column 1856 at ./writerfilter/source/filter/WriterFilter.cxx:213(row,col).
__________________

Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 60(Build:1)
CPU threads: 24; OS: Linux 6.2; UI render: default; VCL: qt5 (cairo+xcb)
Locale: en-US (en_US.UTF-8); UI: en-US
Ubuntu package version: 4:7.6.2~rc1-0ubuntu0.22.04.1~lo1
Calc: threaded
[Automated Action] NeedInfo-To-Unconfirmed
https://bugs.documentfoundation.org/show_bug.cgi?id=157985

Ah, it turns out I was late, but there is already a report for this; how nice.
>
> However, _not_ reproduced with:
>
> Version: 7.6.2.1 (X86_64) / LibreOffice Community
> Build ID: 56f7684011345957bbf33a7ee678afaf4d2ba333
> CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
> Locale: en-AU (en_AU.UTF-8); UI: en-US
> Calc: threaded
>
> Can you please test again with 7.6.2.1?

Please check whether the behavior occurs when the full name (user name) is filled in.
(In reply to J22Gim from comment #7) > Yep, still present: Did you try to open an already corrupted document - which would indeed give the error; or did you try to do the steps from comment 0, to see if the new version corrupts the good documents?
Username is not important; the bug shows without it.

The important piece is the range of deletion: in step 2 of comment 0, the selection should start *immediately after "to"*, and go one character after "OpenOffice.org", i.e., it must also select the two spaces around the field. Enabling formatting marks can help.

(A remark: please pay attention to the level of detail you provide. A screencast is nice when it shows the reproduction on the same attached documents; and/or an exact description, including how you enable track changes mode - from the menu / from a shortcut; how you select - using mouse / keyboard (or maybe you don't select, just put the cursor in front of...); how you delete - using Backspace or Delete...)

In this case, only the range of selection mattered - but you never know before analysing!
Regression after commit b5c616d10bff3213840d4893d13b4493de71fa56 (tdf#104823: support for sdt plain text fields, 2021-12-20). Implementation error.
And bit THANK YOU to J22Gim, who finally created the *really* useful report on this really bad issue, resulted in tens of complaints on Ask already, and who knows how many corrupt documents.
(In reply to Mike Kaganski from comment #14) Sorry, BIG, not "bit".
Created attachment 190807 [details] Minimal reproducer This already has everything needed for the bug to appear - simply save it to DOCX, and the result is broken.
(In reply to Mike Kaganski from comment #14)
> And bit THANK YOU to J22Gim, who finally created the *really* useful report
> on this really bad issue, resulted in tens of complaints on Ask already, and
> who knows how many corrupt documents.

Thank YOU for working on this and for this great community software :)
What the removal of the control should look like (as generated by MS Word, slightly cleaned up):

> <w:r><w:t>Some text</w:t></w:r>
> <w:del w:id="0" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z">
>   <w:r><w:delText xml:space="preserve"> </w:delText></w:r>
> </w:del>
> <w:customXmlDelRangeStart w:id="1" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z"/>
> <w:sdt>
>   <w:sdtPr><w:id w:val="738088135"/></w:sdtPr>
>   <w:sdtContent>
>     <w:customXmlDelRangeEnd w:id="1"/>
>     <w:del w:id="2" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z">
>       <w:r><w:delText>Deleted text</w:delText></w:r>
>     </w:del>
>     <w:customXmlDelRangeStart w:id="3" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z"/>
>   </w:sdtContent>
> </w:sdt>
> <w:customXmlDelRangeEnd w:id="3"/>
> <w:del w:id="4" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z">
>   <w:r><w:delText xml:space="preserve"> </w:delText></w:r>
> </w:del>
> <w:r><w:t>More text</w:t></w:r>

How LibreOffice generates it:

> <w:r><w:t xml:space="preserve">Some text </w:t></w:r>
> <w:del w:id="0" w:author="Mike Kaganski" w:date="2023-11-13T14:36:09Z">
> <w:sdt>
> <w:sdtPr><w:id w:val="738088135"/></w:sdtPr>
> <w:sdtContent>
> <w:r></w:r>
> </w:del>
> <w:del w:id="1" w:author="Mike Kaganski" w:date="2023-11-13T14:36:09Z">
> <w:r><w:delText>Deleted text</w:delText></w:r>
> </w:sdtContent>
> </w:sdt>
> </w:del>
> <w:del w:id="2" w:author="Mike Kaganski" w:date="2023-11-13T14:36:09Z">
> <w:r><w:delText xml:space="preserve"> </w:delText></w:r>
> </w:del>
> <w:r><w:t>More text</w:t></w:r>

It seems that the customXmlDelRangeStart/customXmlDelRangeEnd pairs, which are needed to mark the removal of the markup, aren't written; and each del element is closed at the point where the next del starts, without taking the nesting level into account.
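The nesting problem can be illustrated with a small tag-stack check. This is only a sketch under stated assumptions: the two fragments are abbreviated from the XML quoted in this comment (author/date attributes and surrounding runs dropped), `first_mismatch` is a hypothetical helper, and the regex tokenizer is a toy that handles these fragments, not a real XML parser and not how LibreOffice's exporter or importer works.

```python
import re

# Matches <name ...>, </name> and <name .../>; good enough for the fragments below.
TAG = re.compile(r"<(/?)([\w:]+)[^<>]*?(/?)>")


def first_mismatch(xml):
    """Return (opened, closed) for the first badly nested end tag, or None if balanced."""
    stack = []
    for match in TAG.finditer(xml):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # e.g. <w:customXmlDelRangeStart .../> opens nothing
        if not closing:
            stack.append(name)
        elif not stack:
            return (None, name)
        else:
            opened = stack.pop()
            if opened != name:
                return (opened, name)
    return (stack[-1], None) if stack else None


# Abbreviated from the MS Word output: each del is closed inside the level it opened.
WORD = """
<w:customXmlDelRangeStart w:id="1"/>
<w:sdt>
  <w:sdtPr><w:id w:val="738088135"/></w:sdtPr>
  <w:sdtContent>
    <w:customXmlDelRangeEnd w:id="1"/>
    <w:del w:id="2"><w:r><w:delText>Deleted text</w:delText></w:r></w:del>
    <w:customXmlDelRangeStart w:id="3"/>
  </w:sdtContent>
</w:sdt>
<w:customXmlDelRangeEnd w:id="3"/>
"""

# Abbreviated from the LibreOffice output: del is closed while sdtContent is still open.
LIBREOFFICE = """
<w:del w:id="0">
<w:sdt>
<w:sdtPr><w:id w:val="738088135"/></w:sdtPr>
<w:sdtContent>
<w:r></w:r>
</w:del>
<w:del w:id="1">
<w:r><w:delText>Deleted text</w:delText></w:r>
</w:sdtContent>
</w:sdt>
</w:del>
"""

if __name__ == "__main__":
    print(first_mismatch(WORD))
    print(first_mismatch(LIBREOFFICE))
```

For the LibreOffice fragment, the first mismatch reported is an open `w:sdtContent` being closed by `w:del`, which is the same pair of element names that appears in the SAXException in comment 7.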
*** Bug 157985 has been marked as a duplicate of this bug. ***
Vasily, any chance you could have a look?