Bug 140137 - FILEOPEN Cannot open .docx in writer - format openXML
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.0.0.5 release
Hardware: All All
Importance: medium normal
Assignee: Aron Budea
URL:
Whiteboard: target:7.2.0 target:7.1.2 target:7.0.5
Keywords: bibisected, bisected, filter:docx, regression
Depends on:
Blocks: DOCX-Tables DOCX-Opening MSO-External-Producers
Reported: 2021-02-03 18:45 UTC by Jan Chroust
Modified: 2021-12-22 19:23 UTC

See Also:
Crash report or crash signature:


Attachments
screenshot with error when opening .docx in Writer - format Open XML (32.36 KB, image/png)
2021-02-03 18:48 UTC, Jan Chroust
validation .docx in open SDK Productivity Tool 2.5 (66.07 KB, image/png)
2021-02-03 18:50 UTC, Jan Chroust
generated .docx (899.52 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2021-02-03 18:51 UTC, Jan Chroust
PDF of the sample, exported from Word (751.44 KB, application/pdf)
2021-02-27 01:12 UTC, Aron Budea

Description Jan Chroust 2021-02-03 18:45:51 UTC
Description:
I have a document in Open XML .docx format generated by a web application. It opens without problems in MS Word 2016 and 2019, and it also opens without problems in, e.g., OpenOffice. However, when I try to open it in LibreOffice Writer v. 7.0.4.2, an error appears (screenshot_1).

I ran the document through the Open XML SDK 2.5 Productivity Tool validation, and it reported no errors (screenshot_2).





Steps to Reproduce:
Just open the .docx file.

Actual Results:
When Writer starts, it crashes with an error (screenshot_1). When I click on OK

Expected Results:
The .docx file opens correctly.



Reproducible: Always


User Profile Reset: No



Additional Info:
nothing
Comment 1 Jan Chroust 2021-02-03 18:48:54 UTC
Created attachment 169451 [details]
screenshot with error when opening .docx in Writer - format Open XML
Comment 2 Jan Chroust 2021-02-03 18:50:10 UTC
Created attachment 169452 [details]
validation .docx in open SDK Productivity Tool 2.5
Comment 3 Jan Chroust 2021-02-03 18:51:30 UTC
Created attachment 169453 [details]
generated .docx
Comment 4 raal 2021-02-03 19:22:01 UTC
I can confirm with Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: 40b56cd8da8c38582dc4660b486993d1b4711535
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Calc: threaded

Regression. In LO 4.4 I can open file - all 13 pages. I tried to bibisect with bibisect-44max$ under linux, but lots of crashes. Bibisect unsuccessful.
Comment 5 Mike Kaganski 2021-02-04 04:29:27 UTC
The file has this in its tables:

  <w:tblGrid>
      <w:gridCol/>
      <w:gridCol/>
  </w:tblGrid>

We fail to handle a missing "w" attribute on w:gridCol. It should be treated as if no grid width information was saved, whereas we treat it as an explicit zero-width column.

(Aside: then why put it into the file at all? That doesn't change the fact that this is still a bug.)
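
A minimal sketch of the distinction described above, written in Python rather than in the importer's own code. The function names `read_grid_widths` and `has_usable_grid` are hypothetical and are not part of LibreOffice; the sketch also assumes that "w" values, when present, are plain integers in twips. A missing attribute is reported as `None` ("no saved width") instead of being turned into `0`.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# WordprocessingML main namespace (the "w:" prefix in the snippet above).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_grid_widths(tbl_grid_xml: str) -> List[Optional[int]]:
    """Return one entry per w:gridCol: its width, or None if "w" is absent."""
    root = ET.fromstring(tbl_grid_xml)
    widths: List[Optional[int]] = []
    for col in root.iter(f"{{{W_NS}}}gridCol"):
        raw = col.get(f"{{{W_NS}}}w")
        # Assumption: a present value is a plain integer (twips).
        widths.append(int(raw) if raw is not None else None)
    return widths


def has_usable_grid(widths: List[Optional[int]]) -> bool:
    """Hypothetical rule: the grid is usable only if every column has a width."""
    return bool(widths) and all(w is not None for w in widths)


# The grid from the sample file: two columns, neither with a "w" attribute.
SAMPLE = f'<w:tblGrid xmlns:w="{W_NS}"><w:gridCol/><w:gridCol/></w:tblGrid>'
widths = read_grid_widths(SAMPLE)
print(widths, has_usable_grid(widths))
```

With this representation, a caller can skip the grid entirely when `has_usable_grid` is false, rather than laying out zero-width columns.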
Comment 6 Mike Kaganski 2021-02-04 05:56:40 UTC
... and there are also tables with no table width information at all (neither in the grid, nor in tblW, nor in tcW); we apparently don't handle that gracefully.
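
To find such tables in a document, one could check each w:tbl element for any of the three width sources named above. The following is only a rough Python sketch: `table_has_width_info` is a hypothetical helper, it treats the mere presence of a "w" attribute as width information, and it does not separate nested tables from their parent.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _w(name: str) -> str:
    """Build a namespace-qualified tag or attribute name."""
    return f"{{{W_NS}}}{name}"


def table_has_width_info(tbl_xml: str) -> bool:
    """True if any w:gridCol, w:tblW or w:tcW in the table carries a "w" attribute."""
    tbl = ET.fromstring(tbl_xml)
    for tag in ("gridCol", "tblW", "tcW"):
        for element in tbl.iter(_w(tag)):
            if element.get(_w("w")) is not None:
                return True
    return False


# A table with an attribute-less grid and no tblW/tcW, as described above.
BARE_TABLE = (
    f'<w:tbl xmlns:w="{W_NS}">'
    "<w:tblGrid><w:gridCol/><w:gridCol/></w:tblGrid>"
    "<w:tr><w:tc><w:p/></w:tc><w:tc><w:p/></w:tc></w:tr>"
    "</w:tbl>"
)
print(table_has_width_info(BARE_TABLE))
```

A table for which this returns false has to be sized from some fallback, such as the available page width; which fallback is appropriate is left open here.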
Comment 7 Aron Budea 2021-02-27 01:11:02 UTC
(In reply to raal from comment #4)
> Regression. In LO 4.4 I can open file - all 13 pages. I tried to bibisect
> with bibisect-44max$ under linux, but lots of crashes. Bibisect unsuccessful.
Using repo bibisect_win_44, I bibisected the change from a reasonably complete file open to an almost empty imported document to the following range:
https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=eed1ea797acfe69af587adbefe60316ba6ba127..78e670b3055f92740402803174d61d058effb5d7

On Linux, the crashes start with the following commit (the document looks reasonably fine before it), which is the second-to-last commit in the range above and is a likely candidate:
https://cgit.freedesktop.org/libreoffice/core/commit/?id=2149e924cbc32c370128c5f87a4f55c50c99e6bd
author		Caolán McNamara <caolanm@redhat.com>	2014-11-01 20:37:30 +0000
committer	Caolán McNamara <caolanm@redhat.com>	2014-11-01 21:02:15 +0000

coverity#1000600 Division or modulo by float zero
Comment 8 Aron Budea 2021-02-27 01:12:51 UTC
Created attachment 170098 [details]
PDF of the sample, exported from Word
Comment 9 Commit Notification 2021-03-01 09:18:39 UTC
Aron Budea committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/67d41607ad3b97abbb939a989e491af932e985a7

tdf#140137 Don't throw exception when w:gridCol is missing "w" attr

It will be available in 7.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 Aron Budea 2021-03-01 09:53:42 UTC
Fixed in master, backport to 7.1 is on gerrit.
Comment 11 Commit Notification 2021-03-01 11:10:01 UTC
Aron Budea committed a patch related to this issue.
It has been pushed to "libreoffice-7-1":

https://git.libreoffice.org/core/commit/526c0a35bc6863e1f9356f237195c804ebd58feb

tdf#140137 Don't throw exception when w:gridCol is missing "w" attr

It will be available in 7.1.2.

Comment 12 Commit Notification 2021-03-01 14:23:26 UTC
Aron Budea committed a patch related to this issue.
It has been pushed to "libreoffice-7-0":

https://git.libreoffice.org/core/commit/102ddaa04193a3303e4d3d3e2193048aad3dc16a

tdf#140137 Don't throw exception when w:gridCol is missing "w" attr

It will be available in 7.0.6.

Comment 13 Commit Notification 2021-03-01 14:36:57 UTC
Aron Budea committed a patch related to this issue.
It has been pushed to "libreoffice-7-0-5":

https://git.libreoffice.org/core/commit/31c1f9dd56184307db0743b0b4bef9510e601c1a

tdf#140137 Don't throw exception when w:gridCol is missing "w" attr

It will be available in 7.0.5.

Comment 14 raal 2021-12-22 19:23:13 UTC
Verified. Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: d47628f287f4377394c4ff488c433bfe254b6abe
CPU threads: 4; OS: Linux 5.11; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Calc: threaded