Bug 149798 - Error when opening DOCX after saving with captions on images
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.1.0.4 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:docx
Depends on:
Blocks: DOCX-SAXParse DOCX-Opening
Reported: 2022-07-01 06:22 UTC by Mouse Y.
Modified: 2023-06-02 11:29 UTC

See Also:
Crash report or crash signature:


Attachments
This is the Word OOXML file that has been causing trouble. (1.32 MB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2022-07-01 06:27 UTC, Mouse Y.
This is the "Confirmation" error dialog box that appears. (11.31 KB, image/png)
2022-07-01 06:30 UTC, Mouse Y.
1 page DOCX (25.11 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2022-07-06 14:31 UTC, Timur
149798 Logic 1p.pdf: How it looks in LO 7.6+ (notice extra spacing in cell height) (49.09 KB, application/pdf)
2023-06-02 11:29 UTC, Justin L

Description Mouse Y. 2022-07-01 06:22:53 UTC
Description:
This was a Word OOXML file I got for school, and I figured the LibreOffice community ought to know about it so that LibreOffice's handling of OOXML files could be improved. I rarely file bug reports, so please bear with me.

With this particular OOXML .docx file, whenever I enter data in these text fields (first seen on page 6), save the file as an OOXML file, and then try to open it again, I get an error message and the file appears to be damaged.

Exporting the file as a PDF from the Save As dialog yielded strange results too, but that is a minor issue.

I am using LibreOffice 7.3.2.2 x64 on Windows 10. I haven't tested this on macOS or Linux.

Steps to Reproduce:
1. Open this .docx file in LibreOffice Writer.
2. Scroll down until you find a table that has this text field inside of it. The first one you'll find is on page 6.
3. Type something in those text fields.
4. Save the document as a "Word 2007-365 Document." Notice the save appears successful.
5. Close the document or LibreOffice entirely.
6. Open your newly saved file with LibreOffice again.

Actual Results:
A "Confirmation" dialog box appears, which starts off with "An error occurred during opening the file. This may be caused by incorrect file contents." If you click "Yes" to have LibreOffice ignore the error and reload the file, the table and the text fields containing what you typed are gone, along with everything that came after them.

Expected Results:
Opening the document without errors, and with all the data inside the document intact. Essentially, the same or at least similar to what I was just working on. Yes, I know OOXML is complete garbage.


Reproducible: Always


User Profile Reset: Yes


OpenGL enabled: Yes

Additional Info:
By the way, should the error message say "An error occurred _while_ opening the file" instead of "during"? I feel like that's a grammatical error.

How I worked around this, kind of but not really:
1. Save as an OpenDocument Text file. That worked. Unfortunately, my instructor wouldn't let me turn in a .odt file.
2. Export as a PDF file. That also worked, and my instructor did let me turn one of these in. On one occasion, though, it wiped everything inside the document and left nothing but a single text field. That happened when exporting as a PDF through the menu with the default settings. It didn't happen with the .odt file, however.
3. Save as a Word 97-2003 document. The text typed into those text fields was gone (in fact, the fields themselves were completely gone) and the formatting was all messed up. It kept the tables, though.
Comment 1 Mouse Y. 2022-07-01 06:27:27 UTC
Created attachment 181050 [details]
This is the Word OOXML file that has been causing trouble.
Comment 2 Mouse Y. 2022-07-01 06:30:39 UTC
Created attachment 181051 [details]
This is the "Confirmation" error dialog box that appears.
Comment 3 Mike Kaganski 2022-07-01 07:21:54 UTC
Repro using Version: 7.3.4.2 (x64) / LibreOffice Community
Build ID: 728fec16bd5f605073805c3c9e7c4212a0120dc5
CPU threads: 12; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL
Comment 4 Timur 2022-07-06 14:31:23 UTC
Created attachment 181139 [details]
1 page DOCX

(In reply to Mouse Y. from comment #0)
> 2. Scroll down until you find a table that has this text field inside of it.
> The first one you'll find is on page 6.
> 3. Type something in those text fields.

Well, it's not so simple. These are not text fields at all. They are pictures in MSO, so text cannot be entered there.
Adding text in LO sets a caption, and that is what creates the problem: LO reports an error when reopening the saved file, and MSO cannot open it at all.

An error occurred during opening the file in LO: 
SAXException: [word/document.xml line 20]: Namespace prefix pic on bodyPr is not defined

Since the original is a 16-page DOCX, I am attaching a 1-page sample created in MSO.
LO anchors the pictures "As Char", so the file doesn't look nice on open. There are other bugs about this.
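
The SAXException quoted in this comment can also be checked outside LibreOffice. As a minimal sketch (not LibreOffice's own validation), the Python below reads word/document.xml out of a DOCX package and reports the first well-formedness error found by the standard-library XML parser. The names find_xml_error and make_docx and the two tiny sample documents are hypothetical and exist only for illustration; the samples are not the contents of the attached files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
PIC_NS = "http://schemas.openxmlformats.org/drawingml/2006/picture"

# Made-up fragment: the "pic" prefix is used but never declared.
BAD_XML = (
    f'<w:document xmlns:w="{W_NS}">'
    "<w:body><pic:bodyPr/></w:body></w:document>"
)

# Same fragment with the "pic" prefix declared on the root element.
GOOD_XML = (
    f'<w:document xmlns:w="{W_NS}" xmlns:pic="{PIC_NS}">'
    "<w:body><pic:bodyPr/></w:body></w:document>"
)


def find_xml_error(docx, part="word/document.xml"):
    """Return None if the part is well-formed XML, else the parser message.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as zf:
        data = zf.read(part)
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return str(exc)
    return None


def make_docx(document_xml):
    """Build an in-memory ZIP holding only word/document.xml (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", document_xml)
    buf.seek(0)
    return buf


print("undeclared prefix:", find_xml_error(make_docx(BAD_XML)))
print("declared prefix:  ", find_xml_error(make_docx(GOOD_XML)))
```

Calling find_xml_error with the path of a file saved by an affected build should, under these assumptions, report an unbound-prefix error comparable to the message above. A file that passes this check is only well-formed; it may still be invalid OOXML.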
Comment 5 Timur 2022-07-06 14:41:41 UTC
With LO 3.6, the text wasn't saved properly, but the file could be opened.
With 4.1 and 7.5+, LO reports an error and MSO cannot open the file.
Comment 6 Mouse Y. 2022-07-07 03:26:24 UTC
(In reply to Timur from comment #4)
> (In reply to Mouse Y. from comment #0)
> > 2. Scroll down until you find a table that has this text field inside of it.
> > The first one you'll find is on page 6.
> > 3. Type something in those text fields.
> 
> Well, it's not so simple. These are not text fields at all. They are
> pictures in MSO. So text cannot be entered there.

Really? Well, I never tested this file in MS Word, so I'll take your word for it.

> Adding text in LO is setting caption. And that's what creates a problem in
> LO, MSO cannot open saved file at all. 

That's actually pretty interesting. I just now looked at those alleged text fields in both your file and in my original one in LibreOffice, and yes, they are in fact images. I looked further into it, and now I'm thinking they are AutoShapes (or just a Basic Shape in LibreOffice) or something like them because I could still add text to the center of them, just like an AutoShape.

I was able to type text in those things by double-clicking the shape; a text cursor then appeared and I could type something in there.

In any case, LibreOffice seems to treat the text as a caption. Good to know!
Comment 7 Justin L 2023-06-02 11:00:26 UTC
I cannot reproduce opening errors, either on the original file or after LO round-trips it. Tested with LO 7.6, 7.3.8, and 7.2.0 as well as MSO 2010.
Comment 8 Justin L 2023-06-02 11:29:06 UTC
Created attachment 187663 [details]
149798 Logic 1p.pdf: How it looks in LO 7.6+ (notice extra spacing in cell height)

Actionable item: the whatever-shapes have an extra border space. (Wrap - edit - bottom border).

I wonder if there is some very old legacy issue here, because MS Word honours that space on the round-tripped file, but any edit of the object using MS Word throws it out.

Note: Using MS Word 2010 on the original "149798 Logic 1p.docx", when I right-click on one of these shapes, my only edit action is "Edit picture", which prompts "This is an imported picture, not a group. Do you want to convert it to a Microsoft Office drawing object?"