Bug 140437 - FILEOPEN: Writer document "Read Error. Format error discovered in the file in sub-document content.xml at 2,1311816(row,col)."
Summary: FILEOPEN: Writer document "Read Error. Format error discovered in the file in...
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.1.0.3 release
Hardware: All All
Importance: high major
Assignee: Michael Stahl (allotropia)
URL:
Whiteboard: target:7.2.0 target:7.1.1
Keywords: bibisected, bisected, filter:odt, regression
Depends on:
Blocks:
 
Reported: 2021-02-15 16:47 UTC by toobuntu
Modified: 2021-02-27 07:25 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
Affected Writer document (100.06 KB, application/vnd.oasis.opendocument.text)
2021-02-15 21:47 UTC, toobuntu

Description toobuntu 2021-02-15 16:47:02 UTC
Description:
This happens for a Writer document.

Steps to Reproduce:
1. Just try opening the document and the read error appears.

Actual Results:
The read error appears immediately:
Read Error. Format error discovered in the file in sub-document content.xml at 2,1311816(row,col).

Expected Results:
The document should have been opened without error, like it does in at least LibreOffice 7.0.4.2 and older.


Reproducible: Always


User Profile Reset: Yes


OpenGL enabled: Yes

Additional Info:
Searching within Atom text editor (Ctrl-G) for position 2:1311816 in content.xml indicates the error might have to do with document styles.  That search lands the cursor between the 0 and 2 in the following:
<text:span text:style-name="T302">. </text:span>

T302 seems to be defined elsewhere in content.xml thusly:
<style:style style:name="T302" style:family="text"><style:text-properties style:font-name="Century1" fo:font-size="10.5pt" style:font-size-asian="10.5pt" style:font-size-complex="10.5pt"/></style:style>
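
The position lookup described above can also be scripted. What follows is a minimal Python sketch, not part of the original report. It assumes the reported row and column are 1-based and counted in characters (the importer may count differently, for example in bytes). The function name context_at and the file name in the usage comment are hypothetical.

```python
import zipfile


def context_at(odt_path, row, col, width=60, member="content.xml"):
    """Return the text around a 1-based (row, col) position in an ODF sub-document."""
    # An .odt file is a zip archive; content.xml is one of its members.
    with zipfile.ZipFile(odt_path) as zf:
        text = zf.read(member).decode("utf-8")
    line = text.split("\n")[row - 1]
    # Convert the 1-based column to a 0-based index and clamp the window.
    start = max(0, col - 1 - width)
    end = min(len(line), col - 1 + width)
    return line[start:end]


# Usage (hypothetical file name):
#   print(context_at("affected.odt", 2, 1311816))
```

The returned snippet only shows where the importer stopped; the element that actually triggers the error may start somewhat earlier on the same line.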

Opens without error in a Linux VM with:
Version: 6.1.5.2
Build ID: 1:6.1.5-3+deb10u6
CPU threads: 2; OS: Linux 4.9; UI render: default; VCL: x11;
Locale: en-US (en_US.UTF-8); Calc: group threaded

Opens without error in a Linux VM with:
Version: 7.0.4.2
Build ID: 00(Build:2)
CPU threads: 2; OS: Linux 4.9; UI render: default; VCL: x11
Locale: en-US (en_US.UTF-8); UI: en-US
Debian package version: 1:7.0.4~rc2-1~bpo10+2
Calc: threaded

Fails to open and immediately throws an error on macOS with:
Version: 7.1.0.3 / LibreOffice Community
Build ID: f6099ecf3d29644b5008cc8f48f42f4a40986e4c
CPU threads: 12; OS: Mac OS X 10.15.7; UI render: default; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 1 Regina Henschel 2021-02-15 17:09:11 UTC
In the case of a damaged file, it is very important to have a description of how the file was produced. Please add that.
Comment 2 toobuntu 2021-02-15 17:48:18 UTC
(In reply to Regina Henschel from comment #1)
> In the case of a damaged file, it is very important to have a description of
> how the file was produced. Please add that.

The .odt file was created, according to meta.xml, with <meta:generator>LibreOffice/6.4.5.2$MacOSX_X86_64 LibreOffice_project/a726b36747cf2001e06b58ad5db1aa3a9a1872d6</meta:generator>

The document was created in LibreOffice Writer.  I'll leave it to the dev team to diagnose, but it seems unlikely that the file is damaged being that it opens just fine in earlier versions of LibreOffice.

Please let me know if there is any other information you require.

Perhaps the following additional excerpts from meta.xml will be useful:
<meta:editing-cycles>646</meta:editing-cycles><meta:print-date>2020-05-26T18:20:10.774935080</meta:print-date><meta:creation-date>2013-07-30T22:20:00</meta:creation-date><dc:date>2020-07-14T11:27:27.014528812</dc:date><meta:editing-duration>P20DT3H7M12S</meta:editing-duration><meta:generator>LibreOffice/6.4.5.2$MacOSX_X86_64 LibreOffice_project/a726b36747cf2001e06b58ad5db1aa3a9a1872d6</meta:generator><meta:document-statistic meta:table-count="23" meta:image-count="0" meta:object-count="0" meta:page-count="71" meta:paragraph-count="575" meta:word-count="23166" meta:character-count="141504" meta:non-whitespace-character-count="118896"/><meta:user-defined meta:name="AppVersion">12.0000</meta:user-defined>
* * *
<meta:template xlink:type="simple" xlink:actuate="onRequest" xlink:title="Normal.dotm" xlink:href=""/></office:meta></office:document-meta>

It could be that the Normal.dotm template, which contains the document styles on which this file was based, was indeed created in 2013, but this particular file was created in either 2019 or 2020 and has not been edited since July 2020.
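
For anyone who wants to check the generator of their own file without unpacking it by hand, here is a minimal Python sketch. It is an illustration only, not something used in this report. It relies on the standard ODF namespace URIs for office: and meta:, and the function name read_generator is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard ODF namespace URIs for the two prefixes used in meta.xml.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}


def read_generator(odt_path):
    """Return the text of <meta:generator> from meta.xml, or None if it is absent."""
    with zipfile.ZipFile(odt_path) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
    # The root element is office:document-meta; the generator sits under office:meta.
    node = root.find("office:meta/meta:generator", NS)
    return node.text if node is not None else None


# Usage (hypothetical file name):
#   print(read_generator("affected.odt"))
```

Note that the generator only records the application that last saved the file, so it says nothing about where the document's styles or template originally came from.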
Comment 3 Regina Henschel 2021-02-15 18:00:22 UTC
(In reply to toobuntu from comment #2)
> The document was created in LibreOffice Writer.  I'll leave it to the dev
> team to diagnose, but it seems unlikely that the file is damaged being that
> it opens just fine in earlier versions of LibreOffice.

Do you know which version can open the file?

In any case, please attach the file.
Comment 4 toobuntu 2021-02-15 19:25:24 UTC
(In reply to Regina Henschel from comment #3)
> (In reply to toobuntu from comment #2)
> > The document was created in LibreOffice Writer.  I'll leave it to the dev
> > team to diagnose, but it seems unlikely that the file is damaged being that
> > it opens just fine in earlier versions of LibreOffice.
> 
> Do you know which version can open the file?
> 
> In any case, please attach the file.

Yes, per the OP it opens in 7.0.4.2 but not in 7.1.0.3.

Unfortunately, the file cannot be provided on a public bug tracker because it is law firm work product and contains information protected by attorney-client privilege.  We would be happy to look into the file's content.xml or metadata and report back with whatever you need.  How can we help?
Comment 5 mulla.tasanim 2021-02-15 20:49:29 UTC
Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. 
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
(Note that the attachment will be public, remove any sensitive information before attaching it.
See the QA FAQ Wiki for further detail.)
Comment 6 Regina Henschel 2021-02-15 20:56:41 UTC
You can try https://wiki.documentfoundation.org/QA/Bugzilla/Sanitizing_Files_Before_Submission. If the transformed file still shows the error, you can attach the transformed file.

Without a file that shows the error, it is nearly impossible to fix anything.

You can "save a copy" of the document in LO 7.0 with file format ODF 1.2, ODF 1.2 extended, ODF 1.3 and ODF 1.3 extended. Then check whether any of the saved files opens in LibreOffice 7.1 and, if so, whether it is complete. If that works, it would be a workaround for you.

You can start with a new document in 7.1, open that in 7.0 and copy&paste the content from the problematic file to the new one.

You can try deleting all custom styles that are not actually used.

You can use the validator https://odfvalidator.org/ to get more information. Perhaps it points to some wrong elements or attributes. The validator is available for local use from https://odftoolkit.org/conformance/ODFValidator.html, but I have not yet used it locally.

You can try a daily version of 7.2. It can be installed in parallel with the normal version. Perhaps there is an error in 7.1 that is already fixed in 7.2. https://dev-builds.libreoffice.org/daily/

You can try to open the file on a computer with a different operating system. The error might depend on OS.
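
Before running the validator mentioned above, a quick local check can rule out damage at the plain XML level. The following Python sketch is an assumption-laden illustration, not a substitute for the ODF validator: it only tests whether content.xml is well-formed XML and does not check it against the ODF schema. The function name check_wellformed is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET


def check_wellformed(odt_path, member="content.xml"):
    """Return None if the member parses as XML, else (line, column, message)."""
    with zipfile.ZipFile(odt_path) as zf:
        data = zf.read(member)
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple supplied by the parser.
        line, column = err.position
        return (line, column, str(err))
    return None


# Usage (hypothetical file name):
#   print(check_wellformed("affected.odt"))
```

A result of None means only that the XML parses. A file can be well-formed and still contain elements or attributes that an importer rejects.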
Comment 7 toobuntu 2021-02-15 21:47:02 UTC
Created attachment 169773 [details]
Affected Writer document

I have created a new document in LibreOffice Writer 7.0.4.2 which is a sanitized version of an earlier draft of the file which originally prompted this bug report.  The attachment opens without error in version 7.0.4.2 but cannot be opened with version 7.1.0.3, which immediately throws this error:

Read Error.
Format error discovered in the file in sub-document content.xml at 2,1332373(row,col).


Additionally, I tried to "Save a Copy" of the attachment file in LO 7.0.4.2 as a flat ODF (.fodt) and open it in LO 7.1.0.3, which immediately throws a different error:

General Error.
General input/output error.
Comment 8 toobuntu 2021-02-15 22:07:56 UTC
An additional data point:

The attachment is in the default format, ODF 1.3 Extended.  In LO 7.0.4.2, I saved a copy of the file in ODF 1.3 (not extended), and that copy opened in LO 7.1.0.3 without reporting an error.
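
Since the non-extended copy opens, content from extension namespaces may be involved. One way to see which namespaces a file uses is sketched below in Python. This is a rough diagnostic under stated assumptions, not the method used to find the cause of this bug. The function name namespace_counts is hypothetical. Namespaces that do not start with urn:oasis:names:tc:opendocument:xmlns: are not necessarily extensions, since standard ones such as xlink and dc also fall outside that prefix.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def namespace_counts(odt_path, member="content.xml"):
    """Count how often each namespace URI occurs on element and attribute names."""
    with zipfile.ZipFile(odt_path) as zf:
        root = ET.fromstring(zf.read(member))
    counts = Counter()
    for elem in root.iter():
        # ElementTree writes qualified names as "{namespace-uri}local-name".
        for name in [elem.tag, *elem.attrib]:
            if isinstance(name, str) and name.startswith("{"):
                counts[name[1:].split("}", 1)[0]] += 1
    return counts


# Usage (hypothetical file names): compare the two saved copies.
#   print(namespace_counts("copy-1.3-extended.odt"))
#   print(namespace_counts("copy-1.3.odt"))
```

Comparing the counts for the extended and non-extended copies shows which namespaces disappear in the copy that opens.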
Comment 9 Regina Henschel 2021-02-15 22:42:34 UTC
Attached file opens fine in Version: 7.0.4.2 (x64)
Build ID: dcf040e67528d9187c66b2379df5ea4407429775
CPU threads: 8; OS: Windows 10.0 Build 19041; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL

It fails to open in Version: 7.1.0.3 (x64) / LibreOffice Community
Build ID: f6099ecf3d29644b5008cc8f48f42f4a40986e4c
CPU threads: 8; OS: Windows 10.0 Build 19041; UI render: Skia/Raster; VCL: win
Locale: de-DE (en_US); UI: en-US
Calc: CL

Error message here: Read Error.
Format error discovered in the file in sub-document content.xml at 2,1338424(row,col).
Comment 10 Xisco Faulí 2021-02-16 08:44:14 UTC
Regression introduced by:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=dd24e21bb4f183048a738314934fc3f02ec093f1

author	Michael Stahl <Michael.Stahl@cib.de>	2020-10-30 20:30:40 +0100
committer	Michael Stahl <michael.stahl@cib.de>	2020-11-02 15:45:40 +0100
commit	dd24e21bb4f183048a738314934fc3f02ec093f1 (patch)
tree	1374bc6cf16b530d14a8e9af04148b15bf7793f4
parent	f269467ab5b73999c7ae7edbd0d5dd605d006090 (diff)
sw: return SwXFieldmark in SwXFieldEnumeration

Bisected with: bibisect-linux64-7.1

Adding Cc: to Michael Stahl
Comment 11 Commit Notification 2021-02-22 18:11:22 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d62c93a831080ef332e416dc78f5600c2c5b9850

tdf#140437 ODF import: fix for broken documents with field code as type

It will be available in 7.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Michael Stahl (allotropia) 2021-02-22 18:14:29 UTC
fixed on master
Comment 13 Xisco Faulí 2021-02-23 08:20:57 UTC
Verified in

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: a1d987cf3d0e1ae4d87f7d06ae93e71a0cc59f0c
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

@Michael Stahl, thanks for fixing this issue!!
Comment 14 Commit Notification 2021-02-23 08:24:24 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-1":

https://git.libreoffice.org/core/commit/3a1c4dc5887eee7313fe3f6cb20202df89ac5457

tdf#140437 ODF import: fix for broken documents with field code as type

It will be available in 7.1.2.

Comment 15 toobuntu 2021-02-23 09:55:26 UTC
Verified in:

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: 576c6054d8d445cc977fc3789c572cfc2a3ccd83
CPU threads: 12; OS: Mac OS X 10.15.7; UI render: default; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Furthermore, after successfully opening the affected document in the master branch daily build (7.2), I saved it and was able to open it in 7.1:

Version: 7.1.0.3 / LibreOffice Community
Build ID: f6099ecf3d29644b5008cc8f48f42f4a40986e4c
CPU threads: 12; OS: Mac OS X 10.15.7; UI render: default; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

@Michael Stahl, thank you for the quick fix!
Comment 16 toobuntu 2021-02-24 08:35:01 UTC
Verified in:

Version: 7.1.2.0.0+ / LibreOffice Community
Build ID: 67a862f6d146ea7e0ec0789563bf0f51961348c9
CPU threads: 12; OS: Mac OS X 10.15.7; UI render: default; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 17 Commit Notification 2021-02-24 14:26:34 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-1-1":

https://git.libreoffice.org/core/commit/85c04fb200469fb88933123358601480450864f4

tdf#140437 ODF import: fix for broken documents with field code as type

It will be available in 7.1.1.

Comment 18 Commit Notification 2021-02-27 07:25:39 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/c5811750c392a6c46bc973de467926d499794dca

tdf#140437: sw_odfexport: Add unittest

It will be available in 7.2.0.
