Bug 157326 - track changes on input fields result in exception and not able to open the document
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.3.0.2 rc
Hardware: All All
Importance: high normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, filter:docx, implementationError
Duplicates: 157985
Depends on:
Blocks: DOCX-Track-Changes
Reported: 2023-09-19 10:04 UTC by J22Gim
Modified: 2023-12-27 19:23 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
original file (49.79 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-09-19 10:06 UTC, J22Gim
the input fields are changed, but there is no problem if saved as ODT (15.01 KB, application/vnd.oasis.opendocument.text)
2023-09-19 10:08 UTC, J22Gim
if saved as DOCX, you lose part or all of the document (49.88 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-09-19 10:11 UTC, J22Gim
screen captures (149.20 KB, application/pdf)
2023-09-19 10:35 UTC, J22Gim
Minimal reproducer (1.05 KB, application/vnd.oasis.opendocument.text)
2023-11-13 11:49 UTC, Mike Kaganski

Description J22Gim 2023-09-19 10:04:49 UTC
Description:
I'm working on a DOCX file that a colleague sent me. The file has input fields. I work using Track Changes. 

If I modify any input fields and save as DOCX, the document can't be opened anymore, or at least everything after the modified section is lost.

Steps to Reproduce:
1. open the test file
2. enable track changes and delete "OpenOffice.org" from the text
3. save as DOCX

Actual Results:
There is no message or warning of a serious problem; you can save the document. But when you try to open it again, you see a SAXException (see screen captures).

Expected Results:
Either a) the document should open without problems (as it does when you save as ODT), or b) there should be a warning that saving as DOCX can result in incorrect file contents and that LO may not be able to open the resulting file


Reproducible: Always


User Profile Reset: No

Additional Info:
Version: 7.3.7.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 24; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Ubuntu package version: 1:7.3.7-0ubuntu0.22.04.3
Calc: threaded

But the same happened when I tried with the latest available version (7.6.1 as of today).
Comment 1 J22Gim 2023-09-19 10:06:47 UTC
Created attachment 189695 [details]
original file

This is the file I got from my colleague; it has input fields.
If you
1. enable track changes
2. modify the input fields
3. save as DOCX
4. close the file

Then you can't open the file anymore.
Comment 2 J22Gim 2023-09-19 10:08:19 UTC
Created attachment 189696 [details]
the input fields are changed, but there is no problem if saved as ODT

Here I made the changes but saved as ODT. No problem here.
Comment 3 J22Gim 2023-09-19 10:11:51 UTC
Created attachment 189697 [details]
if saved as DOCX, you lose part or all of the document

This is the file after making changes and then saving as DOCX.
No error was shown when saving as DOCX. But if you try to open it, you get an exception (see the screen captures in a separate attachment).
Comment 4 J22Gim 2023-09-19 10:35:42 UTC
Created attachment 189698 [details]
screen captures

step-by-step description with screen captures
Comment 5 Buovjaga 2023-10-02 06:48:13 UTC
(In reply to J22Gim from comment #1)
> Created attachment 189695 [details]
> original file
> 
> This is the file I got from my colleague, it has inputs fields.
> If you 
> 1. enable track changes
> 2. modify the input fields
> 3. save as DOCX
> 4. close the file
> 
> Then you can't open the file anymore.

I can't reproduce on Windows or Linux. I even tried saving to the newer docx format as the original file is 2007 docx.

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9eb419b0b0f019f5fbc48ff1a11977e8b041edee
CPU threads: 2; OS: Windows 10.0 Build 22621; UI render: default; VCL: win
Locale: en-US (en_FI); UI: en-US
Calc: threaded

Arch Linux 64-bit, X11
Version: 7.6.1.2 (X86_64) / LibreOffice Community
Build ID: 60(Build:2)
CPU threads: 8; OS: Linux 6.5; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
7.6.1-1
Calc: threaded
Comment 6 Stéphane Guillou (stragu) 2023-10-27 21:18:40 UTC
(In reply to J22Gim from comment #0)
> Steps to Reproduce:
> 1. open the test file
> 2. enable track changes and delete "OpenOffice.org" from the text
> 3. save as DOCX
...(2010-365)
4. Reload

I get the SAXException in:

Version: 7.3.7.2 / LibreOffice Community
Build ID: e114eadc50a9ff8d8c8a0567d6da8f454beeb84f
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

and:

Version: 7.5.7.1 (X86_64) / LibreOffice Community
Build ID: 47eb0cf7efbacdee9b19ae25d6752381ede23126
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

However, _not_ reproduced with:

Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 56f7684011345957bbf33a7ee678afaf4d2ba333
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Can you please test again with 7.6.2.1?
Comment 7 J22Gim 2023-11-02 13:34:23 UTC
Yep, still present:
_________________
An error occurred during opening the file. This may be caused by incorrect file contents.
The error details are:
SAXException: [word/document.xml line 2]: Opening and ending tag mismatch: sdtContent line 2 and del
 at ./sax/source/fastparser/fastparser.cxx:615
Proceeding with import may cause data loss or corruption, and application may become unstable or crash.

Do you want to ignore the error and attempt to continue loading the file?
_______________

and 

__________________
File format error found at 
SAXParseException: '[word/document.xml line 2]: Opening and ending tag mismatch: sdtContent line 2 and del
 at ./sax/source/fastparser/fastparser.cxx:615', Stream 'word/document.xml', Line 2, Column 1856 at ./writerfilter/source/filter/WriterFilter.cxx:213(row,col).
__________________


Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 60(Build:1)
CPU threads: 24; OS: Linux 6.2; UI render: default; VCL: qt5 (cairo+xcb)
Locale: en-US (en_US.UTF-8); UI: en-US
Ubuntu package version: 4:7.6.2~rc1-0ubuntu0.22.04.1~lo1
Calc: threaded
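
For anyone who wants to check a saved file without opening it in LibreOffice: the error above says that word/document.xml inside the DOCX is not well-formed XML. Below is a minimal Python sketch of such a check, assuming only that a DOCX is a ZIP archive containing word/document.xml (which the error message itself implies). The function name check_document_xml is made up for this illustration; it is not part of LibreOffice.

```python
import zipfile
import xml.etree.ElementTree as ET


def check_document_xml(docx_file):
    """Return None if word/document.xml parses, else the parser's error text.

    docx_file may be a path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(docx_file) as archive:
        data = archive.read("word/document.xml")
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # The XML is not well-formed, e.g. mismatched opening/closing tags.
        return str(err)
    return None
```

Calling check_document_xml("broken.docx") on a file affected by this bug should return a parser message instead of None; the exact wording comes from Python's parser and will differ from the SAXException text shown by LibreOffice.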
Comment 8 QA Administrators 2023-11-03 03:15:51 UTC Comment hidden (obsolete)
Comment 9 Roman 2023-11-12 14:22:16 UTC
https://bugs.documentfoundation.org/show_bug.cgi?id=157985
Eh, it turns out I'm late, but there is already one; how nice.
Comment 10 Roman 2023-11-12 15:34:19 UTC
> 
> However, _not_ reproduced with:
> 
> Version: 7.6.2.1 (X86_64) / LibreOffice Community
> Build ID: 56f7684011345957bbf33a7ee678afaf4d2ba333
> CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
> Locale: en-AU (en_AU.UTF-8); UI: en-US
> Calc: threaded
> 
> Can you please test again with 7.6.2.1?

Please check whether the behavior occurs when the full name (user name) is filled in.
Comment 11 Mike Kaganski 2023-11-12 20:34:17 UTC Comment hidden (obsolete)
Comment 12 Mike Kaganski 2023-11-12 20:55:07 UTC
Username is not important; the bug shows without it. The important piece is the range of deletion: in step 2 of comment 0, the selection should start *immediately after "to"*, and go one character after "OpenOffice.org", i.e., it must also include the two spaces around the field. Enabling formatting marks can help.

(A remark: please pay attention to the level of detail you provide. A screencast is nice when it shows the reproduction on the same attached documents; and/or an exact description, including how you enable track changes mode (from the menu or with a shortcut); how you select (using the mouse or keyboard, or maybe you don't select and just put the cursor in front of...); and how you delete (using Backspace or Delete...). In this case, only the range of selection mattered, but you never know before analysing!)
Comment 13 Mike Kaganski 2023-11-12 21:34:12 UTC
Regression after commit b5c616d10bff3213840d4893d13b4493de71fa56 (tdf#104823: support for sdt plain text fields, 2021-12-20). Implementation error.
Comment 14 Mike Kaganski 2023-11-12 21:51:29 UTC
And bit THANK YOU to J22Gim, who finally created the *really* useful report on this really bad issue, which has already resulted in tens of complaints on Ask, and who knows how many corrupt documents.
Comment 15 Mike Kaganski 2023-11-12 21:52:02 UTC
(In reply to Mike Kaganski from comment #14)

Sorry, BIG, not "bit".
Comment 16 Mike Kaganski 2023-11-13 11:49:38 UTC
Created attachment 190807 [details]
Minimal reproducer

This already has everything needed for the bug to appear - simply save it to DOCX, and the result is broken.
Comment 17 J22Gim 2023-11-13 11:57:02 UTC
(In reply to Mike Kaganski from comment #14)
> And bit THANK YOU to J22Gim, who finally created the *really* useful report
> on this really bad issue, resulted in tens of complaints on Ask already, and
> who knows how many corrupt documents.

Thank YOU for working on this and for this great community software :)
Comment 18 Mike Kaganski 2023-11-13 13:26:19 UTC
What the removal of the control should look like (as generated by MS Word, slightly cleaned up):

> <w:r><w:t>Some text</w:t></w:r>
> <w:del w:id="0" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z">
>   <w:r><w:delText xml:space="preserve"> </w:delText></w:r>
> </w:del>
> <w:customXmlDelRangeStart w:id="1" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z"/>
> <w:sdt>
>   <w:sdtPr><w:id w:val="738088135"/></w:sdtPr>
>   <w:sdtContent>
>     <w:customXmlDelRangeEnd w:id="1"/>
>     <w:del w:id="2" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z">
>       <w:r><w:delText>Deleted text</w:delText></w:r>
>     </w:del>
>     <w:customXmlDelRangeStart w:id="3" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z"/>
>   </w:sdtContent>
> </w:sdt>
> <w:customXmlDelRangeEnd w:id="3"/>
> <w:del w:id="4" w:author="Mike Kaganski" w:date="2023-11-13T15:10:00Z">
>   <w:r><w:delText xml:space="preserve"> </w:delText></w:r>
> </w:del>
> <w:r><w:t>More text</w:t></w:r>

How LibreOffice generates it:

> <w:r><w:t xml:space="preserve">Some text </w:t></w:r>
> <w:del w:id="0" w:author="Mike Kaganski" w:date="2023-11-13T14:36:09Z">
>   <w:sdt>
>     <w:sdtPr><w:id w:val="738088135"/></w:sdtPr>
>     <w:sdtContent>
>       <w:r></w:r>
>     </w:del>
>     <w:del w:id="1" w:author="Mike Kaganski" w:date="2023-11-13T14:36:09Z">
>       <w:r><w:delText>Deleted text</w:delText></w:r>
>     </w:sdtContent>
>   </w:sdt>
> </w:del>
> <w:del w:id="2" w:author="Mike Kaganski" w:date="2023-11-13T14:36:09Z">
>   <w:r><w:delText xml:space="preserve"> </w:delText></w:r>
> </w:del>
> <w:r><w:t>More text</w:t></w:r>

It seems that the customXmlDelRangeStart/customXmlDelRangeEnd pairs, which are needed to mark the removal of the markup, aren't emitted; and del elements are terminated where another del is started, without taking the nesting level into account.
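
The difference between the two outputs can be checked mechanically. The sketch below is only an illustration in Python, not LibreOffice code: the two fragments are shortened versions of the markup quoted in this comment (author and date attributes dropped), and the helper names wrap, is_well_formed and unpaired_del_ranges are made up for this example. It wraps each fragment in a root element that declares the w: namespace, then tests whether it parses and whether every customXmlDelRangeStart id has a matching customXmlDelRangeEnd.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the w: prefix in DOCX files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Shortened from the MS Word output: each w:del closes at the level it opened.
WORD_STYLE = """
<w:customXmlDelRangeStart w:id="1"/>
<w:sdt>
  <w:sdtContent>
    <w:customXmlDelRangeEnd w:id="1"/>
    <w:del w:id="2"><w:r><w:delText>Deleted text</w:delText></w:r></w:del>
    <w:customXmlDelRangeStart w:id="3"/>
  </w:sdtContent>
</w:sdt>
<w:customXmlDelRangeEnd w:id="3"/>
"""

# Shortened from the LibreOffice output: w:del is closed inside w:sdtContent.
LIBREOFFICE_STYLE = """
<w:del w:id="0">
  <w:sdt>
    <w:sdtContent>
      <w:r></w:r>
    </w:del>
    <w:del w:id="1">
      <w:r><w:delText>Deleted text</w:delText></w:r>
    </w:sdtContent>
  </w:sdt>
</w:del>
"""


def wrap(fragment):
    """Give the fragment a single root that declares the w: prefix."""
    return f'<w:body xmlns:w="{W_NS}">{fragment}</w:body>'


def is_well_formed(fragment):
    """True if the wrapped fragment parses as XML."""
    try:
        ET.fromstring(wrap(fragment))
    except ET.ParseError:
        return False
    return True


def unpaired_del_ranges(fragment):
    """Ids that appear on only one of customXmlDelRangeStart / ...End."""
    root = ET.fromstring(wrap(fragment))
    id_attr = f"{{{W_NS}}}id"
    starts = {e.get(id_attr) for e in root.iter(f"{{{W_NS}}}customXmlDelRangeStart")}
    ends = {e.get(id_attr) for e in root.iter(f"{{{W_NS}}}customXmlDelRangeEnd")}
    return starts ^ ends
```

With these fragments, is_well_formed(WORD_STYLE) is true and unpaired_del_ranges(WORD_STYLE) is empty, while is_well_formed(LIBREOFFICE_STYLE) is false because the first w:del is closed while w:sdtContent is still open, which is the same mismatch the SAXException in comment 7 reports.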
Comment 19 Stéphane Guillou (stragu) 2023-11-23 11:08:24 UTC
*** Bug 157985 has been marked as a duplicate of this bug. ***
Comment 20 Stéphane Guillou (stragu) 2023-11-23 11:25:49 UTC
Vasily, any chance you could have a look?