Bug 130707 - FILEOPEN: Writer document "Read Error. Format error discovered in the file in sub-document content.xml at 2,68950(row,col)."
Summary: FILEOPEN: Writer document "Read Error. Format error discovered in the file in...
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.3.2.1 rc
Hardware: All All
Importance: high major
Assignee: Miklos Vajna
URL:
Whiteboard: target:7.1.0 target:7.0.1 target:6.4....
Keywords: bibisected, bisected, regression
Depends on:
Blocks: File-Opening
Reported: 2020-02-16 11:51 UTC by go.sandberg
Modified: 2021-02-15 16:51 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments
Writer document that cannot be opened (read error) in reported LibreOffice versions (774.24 KB, application/vnd.oasis.opendocument.text)
2020-02-16 13:59 UTC, go.sandberg

Description go.sandberg 2020-02-16 11:51:09 UTC
Description:
This happens for a Writer document that uses a Calc spreadsheet as its included database. It also happens when the database is removed.
The error appears in LibreOffice 6.3.2.1 through 6.4.0.3 (x64).
It seems to work correctly in 6.3.1.1 and older.

Steps to Reproduce:
1. Try opening the document, with or without the DB; the read error appears. I will attach the example document without the included DB (if the DB is needed as well, we will have to find a secure way to attach it, since it contains personal data).

Actual Results:
The read error appears immediately:
Read Error. Format error discovered in the file in sub-document content.xml at 2,68950(row,col).
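
For anyone who wants to see what sits at the reported position, here is a minimal Python sketch; it is not part of the original report. It relies on an .odt file being a ZIP package that contains content.xml. It assumes the row and column in the message are 1-based and counted in characters; the parser may count differently (for example in bytes), so treat the result as approximate. The function name context_at and the file name in the usage comment are hypothetical.

```python
import zipfile


def context_at(odt_path, row, col, width=60, member="content.xml"):
    """Return the text around a (row, col) position in an ODF sub-document.

    Assumption: row and col are 1-based and col counts characters.
    """
    with zipfile.ZipFile(odt_path) as package:
        text = package.read(member).decode("utf-8")
    line = text.split("\n")[row - 1]
    index = col - 1
    start = max(0, index - width)
    return line[start:index + width]


# Hypothetical usage with the position from the error message:
# print(context_at("bugdoc.odt", 2, 68950))
```

The printed fragment should show which element the importer was processing when it gave up, which can help when narrowing a document down to a shareable test case.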

Expected Results:
The document should open, as it does in LibreOffice 6.3.1.1 and older.


Reproducible: Always


User Profile Reset: Yes


OpenGL enabled: Yes

Additional Info:
Version: 6.4.0.3 (x64)
Build ID: b0a288ab3d2d4774cb44b62f04d5d28733ac6df8
CPU threads: 8; OS: Windows 10.0 Build 18362; UI render: default; VCL: win; 
Locale: sv-SE (sv_SE); UI-Language: en-US
Calc: threaded
Comment 1 go.sandberg 2020-02-16 13:59:35 UTC
Created attachment 157926 [details]
Writer document that cannot be opened (read error) in reported LibreOffice versions
Comment 2 Oliver Brinzing 2020-02-16 14:49:04 UTC
reproducible:

attached document will open with:

Version: 6.2.8.2 (x64)
Build ID: f82ddfca21ebc1e222a662a32b25c0c9d20169ee
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: de-DE
Calc: 

but fails to open with:

Version: 6.3.5.1 (x64)
Build ID: 9a62adaf9abe90e8fef419f29114b0176dd66801
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: de-DE
Calc: 

seems to have started with:

https://gerrit.libreoffice.org/plugins/gitiles/core/+/28d67b792724a23015dec32fb0278b729f676736

tdf#107776 sw ODF shape import: make is-textbox check more strict

Regression from commit 9835a5823e0f559aabbc0e15ea126c82229c4bc7 (sw
textboxes: reimplement ODF import/export, 2014-10-04), the problem was
that we assumed graphic autostyles look like:
    <style:style style:name="gr2" style:family="graphic">

for simple (non-text-box) content, and look like:
    <style:style style:name="gr1" style:family="graphic" style:parent-style-name="Frame">

for complex (text-box) content.

Turns out it's valid to have other parent styles as well, e.g. Graphics,
which should not be imported as sw textboxes.

With this, the arrow at the bottom of page 3 of the bugdoc is now again
on top of the image, i.e. layout compatibility is restored.

Change-Id: Icbba8a23c5f66e63090f90e6581ebc98948cb80b
Reviewed-on: https://gerrit.libreoffice.org/78155
Tested-by: Jenkins
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
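
The difference between the loose and the strict check described in the commit message above can be sketched in Python as follows. This is an illustration of one reading of that message, not the actual xmloff code: the function names are hypothetical, the style name "gr3" is made up for the example, and the assumption is that the strict check accepts only the parent style "Frame".

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
PARENT_ATTR = f"{{{STYLE_NS}}}parent-style-name"


def is_textbox_loose(style):
    # Old assumption: any parent style means text-box content.
    return style.get(PARENT_ATTR) is not None


def is_textbox_strict(style):
    # Stricter reading: only the "Frame" parent marks a text box.
    return style.get(PARENT_ATTR) == "Frame"


def classify(xml_text):
    """Map each graphic autostyle name to (loose, strict) verdicts."""
    root = ET.fromstring(xml_text)
    verdicts = {}
    for style in root.iter(f"{{{STYLE_NS}}}style"):
        if style.get(f"{{{STYLE_NS}}}family") != "graphic":
            continue
        name = style.get(f"{{{STYLE_NS}}}name")
        verdicts[name] = (is_textbox_loose(style), is_textbox_strict(style))
    return verdicts


# "gr1" and "gr2" follow the commit message; "gr3" is a made-up example.
SAMPLE = (
    f'<root xmlns:style="{STYLE_NS}">'
    '<style:style style:name="gr1" style:family="graphic" style:parent-style-name="Frame"/>'
    '<style:style style:name="gr2" style:family="graphic"/>'
    '<style:style style:name="gr3" style:family="graphic" style:parent-style-name="Graphics"/>'
    "</root>"
)

if __name__ == "__main__":
    for name, (loose, strict) in classify(SAMPLE).items():
        print(name, "loose:", loose, "strict:", strict)
```

Only the style with the "Graphics" parent is classified differently by the two checks, which is the case the commit set out to change.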
										   
/cygdrive/d/sources/bibisect/bibisect-win64-6.4
$ git bisect bad
b5fe45c9bd4aed26448fddff2064f43595d5abd6 is the first bad commit
commit b5fe45c9bd4aed26448fddff2064f43595d5abd6
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date:   Wed Sep 4 04:44:34 2019 -0700

    source 28d67b792724a23015dec32fb0278b729f676736
    source 28d67b792724a23015dec32fb0278b729f676736

:040000 040000 3ced65b7409548e586e456539a31221c3090ddfe 255015855cc0f4a46053a0ef5caabd28d599c2ef M      instdir

/cygdrive/d/sources/bibisect/bibisect-win64-6.4
$ git bisect log
# bad: [75af2782b7f006d1c31ad11e84d5ab6bd7f74ed0] source 20be5cd0bdc57d812bf34a2debfe48caa51de881
# good: [8d1eaf05d47fd1c56ddecbe57a9a7c8289ede7f4] source c98b1f1cd43b3e109bcaf6324ef2d1f449b34099
git bisect start 'master' 'oldest'
# good: [e13037966b62b9d258d5cf6d96586de5e2bafea8] source 4a0b2b8024fa6fb8a0ab3e474b7d64fc455028b5
git bisect good e13037966b62b9d258d5cf6d96586de5e2bafea8
# bad: [7cba838374e1acd7b8a6e114d7b12bf6370cd7ab] source d6ea967e040d01ec69649ac689472018e477db34
git bisect bad 7cba838374e1acd7b8a6e114d7b12bf6370cd7ab
# bad: [f57765d2816781c79fc36e1d4955bff1c5c92f8d] source 130b0ffd491015df2dc1d6b272952d76980f0465
git bisect bad f57765d2816781c79fc36e1d4955bff1c5c92f8d
# bad: [a5806767f5af962cc752a93c633c2a8eb53b7ada] source 17f9aa97f8753b895db30e8080481f5f6d696b82
git bisect bad a5806767f5af962cc752a93c633c2a8eb53b7ada
# bad: [dbd346c63ce5bf99a9d1c46c7e29c78972977a36] source 9c06059ec546683bfa095cf4f59ac6ea94da34fb
git bisect bad dbd346c63ce5bf99a9d1c46c7e29c78972977a36
# bad: [caf6f8b4af35f8f14343e5041a3c9d48b35c786b] source f38566826cdd257214298c583a1ce8ae6715713c
git bisect bad caf6f8b4af35f8f14343e5041a3c9d48b35c786b
# bad: [48d97fced7916f15dd322390f321618e868d92fd] source 0466c3d8b2589a2982b919145cdacc9a1eeb63ff
git bisect bad 48d97fced7916f15dd322390f321618e868d92fd
# good: [6037b377b38871b6c0d4b1c19b916d39a04ae140] source b5317b2a41e6369e2804462e2ab73f1447f1533a
git bisect good 6037b377b38871b6c0d4b1c19b916d39a04ae140
# good: [026b06d4f3ec456b403e7d15838ad97dd57fe453] source 09d29fab72e22ba830f178b15a74a5a87c8a73a5
git bisect good 026b06d4f3ec456b403e7d15838ad97dd57fe453
# bad: [437a426a5079edae06923c2eba69aadaab8db089] source 9c470a376a8cf5d42d2adbe2e528f3fe6b2df7ee
git bisect bad 437a426a5079edae06923c2eba69aadaab8db089
# good: [8fa3dea6409f0eaf294f9b20374a95906406dd8c] source 5c50f1a2d4487b9303974c7cf39d6208192a0c96
git bisect good 8fa3dea6409f0eaf294f9b20374a95906406dd8c
# bad: [ccb3e1875f334d4970d9ed9ecad7a3e8fda68c6c] source a1f2f8bdff465ada957f394530cd791a0c329da4
git bisect bad ccb3e1875f334d4970d9ed9ecad7a3e8fda68c6c
# bad: [b5fe45c9bd4aed26448fddff2064f43595d5abd6] source 28d67b792724a23015dec32fb0278b729f676736
git bisect bad b5fe45c9bd4aed26448fddff2064f43595d5abd6
# first bad commit: [b5fe45c9bd4aed26448fddff2064f43595d5abd6] source 28d67b792724a23015dec32fb0278b729f676736
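
Each commit in the bibisect repository names the core source commit it was built from, so the log above can be mapped back to source hashes. A small Python sketch of that mapping follows, assuming the "# verdict: [bibisect hash] source <source hash>" line shape seen in this log; the function names are hypothetical.

```python
import re

# Matches the comment lines that "git bisect log" prints for each step.
MARK = re.compile(
    r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] source ([0-9a-f]{40})$"
)


def parse_bisect_log(log_text):
    """Return (verdict, bibisect_hash, source_hash) for each marked line."""
    steps = []
    for line in log_text.splitlines():
        match = MARK.match(line.strip())
        if match:
            steps.append(match.groups())
    return steps


def first_bad_source(log_text):
    """Return the source commit of the first bad commit, or None."""
    for verdict, _bibisect_hash, source_hash in parse_bisect_log(log_text):
        if verdict == "first bad commit":
            return source_hash
    return None
```

Applied to the log above, first_bad_source would return the source hash recorded on the "first bad commit" line, which is the core commit linked at the top of this comment.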
Comment 3 Julien Nabet 2020-02-16 19:42:54 UTC
On pc Debian x86-64 with master sources updated today, I could reproduce this.

I confirm that after reverting https://gerrit.libreoffice.org/plugins/gitiles/core/+/28d67b792724a23015dec32fb0278b729f676736, I can open the file.
(of course, 28d67b792724a23015dec32fb0278b729f676736 may have revealed another bug that must be fixed).
Comment 4 Aron Budea 2020-07-27 00:31:23 UTC
Let's add the magic words. Adding CC: to Miklos Vajna.
Comment 5 Commit Notification 2020-08-04 05:57:26 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/fd18d12efdfbe0e26d41d733edc711d0f40a7804

tdf#130707 xmloff: survive <text:database-display> in editeng text

It will be available in 7.1.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 6 Commit Notification 2020-08-04 09:34:19 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-7-0":

https://git.libreoffice.org/core/commit/fec8b842d701cca0b79af9ea9f6192a8cf7610b0

tdf#130707 xmloff: survive <text:database-display> in editeng text

It will be available in 7.0.1.
Comment 7 Xisco Faulí 2020-08-04 11:20:41 UTC
Verified in

Version: 7.1.0.0.alpha0+
Build ID: 58937aa4a50ecd681382f03331340da4c843b01e
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

@Miklos, thanks for fixing this issue!!
Comment 8 Commit Notification 2020-08-05 10:10:58 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-6-4":

https://git.libreoffice.org/core/commit/790ab54d7232d232ba84f17f285d4639ddba80d5

tdf#130707 xmloff: survive <text:database-display> in editeng text

It will be available in 6.4.7.
Comment 9 Commit Notification 2020-08-06 14:12:28 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-6-4-6":

https://git.libreoffice.org/core/commit/8ec82ef91f741a19798f8dda280d7fc2356f797a

tdf#130707 xmloff: survive <text:database-display> in editeng text

It will be available in 6.4.6.
Comment 10 toobuntu 2021-02-11 15:30:55 UTC
This is still an issue for me in Writer. First noticed in 7.1.0.3. I probably haven't tried to open the file since July 2020, however.

The .odt file was created, according to meta.xml, with <meta:generator>LibreOffice/6.4.5.2$MacOSX_X86_64 LibreOffice_project/a726b36747cf2001e06b58ad5db1aa3a9a1872d6.

If it helps to know, there is a text box in the document. Unfortunately, the document contains sensitive content and cannot be shared unless sanitized first (but of course it does not open in LibreOffice Writer). FWIW, it does open just fine in macOS TextEdit (in Rich Text mode), although the font and other formatting are slightly off.

How can I help?

Read Error. Format error discovered in the file in sub-document content.xml at 2,1286680(row,col).

Environment
$ soffice --version
LibreOffice 7.1.0.3 f6099ecf3d29644b5008cc8f48f42f4a40986e4c

$ sw_vers
ProductName:	Mac OS X
ProductVersion:	10.15.7
BuildVersion:	19H524
Comment 11 Julien Nabet 2021-02-11 17:18:07 UTC
(In reply to toobuntu from comment #10)
> This is still an issue for me in Writer. First noticed in 7.1.0.3. I
> probably haven't tried to open the file since July 2020, however.
> ...
Please submit a new bug report instead of commenting on this one.
Indeed the original bug has been fixed and verified. It's perhaps a similar bug but I don't think it's the same.
Comment 12 toobuntu 2021-02-15 16:51:19 UTC
(In reply to Julien Nabet from comment #11)
> (In reply to toobuntu from comment #10)
> > This is still an issue for me in Writer. First noticed in 7.1.0.3. I
> > probably haven't tried to open the file since July 2020, however.
> > ...
> Please submit a new bug report instead of commenting on this one.
> Indeed the original bug has been fixed and verified. It's perhaps a similar
> bug but I don't think it's the same.

Done: bug 140437. Sorry for the noise.