Bug 131203 - In the case of import of some DOCX documents with tables with specified table column width, created with Apache POI library, LO throws an exception
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.1 all versions
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.6.0 target:7.5.0.2 target:7.4.6
Keywords: bibisected, bisected, filter:docx
Duplicates: 150943
Depends on:
Blocks: DOCX-SAXParse DOCX-Opening MSO-External-Producers
Reported: 2020-03-07 12:31 UTC by Artur Linhart
Modified: 2023-03-29 17:56 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
ZIP file containing all files mentioned in the error description (34.86 KB, application/zip)
2020-03-07 12:31 UTC, Artur Linhart
Screenshot of the problem in current bibisect-6.5 repo (46.90 KB, image/png)
2020-03-26 09:17 UTC, NISZ LibreOffice Team

Description Artur Linhart 2020-03-07 12:31:48 UTC
Created attachment 158466 [details]
ZIP file containing all files mentioned in the error description

For documents created with the Apache POI library for Java (version 4.1.2) in which the table column sizes are specified explicitly, using the following statements inside a loop over rows i and columns j,

XWPFTableRow row = locTable.getRow(i);
XWPFTableCell cell = row.getCell(j);
cell.getCTTc().addNewTcPr().addNewTcW().setW(BigInteger.valueOf(aColumnSize[j]));

the produced document is readable by MS Word and the columns are set correctly. However, when the document is imported into LibreOffice, the program crashes (disappears). When it is opened from an already running LibreOffice, an error about incorrect file content is displayed; if I answer "No" to the accompanying question of whether the error should be ignored, a second error is shown: N4o3tl14divide_by_zeroE: divide by zero.

If I select "Yes" to ignore the error, the document opens but is displayed incorrectly: nothing is shown after the header of the first table, the headers of the first table are not laid out in table columns but stacked below one another, and the rest of the document is not displayed at all.

The attachment contains screenshots of both error messages and the document in which the problem occurs. I have also generated the document again with a single difference: the group properties of the tables are not specified explicitly (that file opens in LO as expected). The content of the two files can therefore be compared to locate the exact problem.

I thought this could be a problem in the POI library, but because MS Word displays the file without problems and LibreOffice crashes with this ugly error, I suspect the problem is in the DOCX import module.
Comment 1 Roman Kuznetsov 2020-03-07 13:02:34 UTC
For the first file, named "LibreOffice-ImportError-FileWithColumnWidthSpecifiedThrowingError":

if I select Yes, the file opens but without any table
if I select No, I get the following message:

File format error found at [mscx_uno bridge error] UNO type of C++ exception unknown: "o3tl.divide_by_zero", RTTI-name=".?AUdivide_by_zero@o3tl@@"!
SAXParseException: '[word/document.xml line 2]: unknown error', Stream 'word/document.xml', Line 2, Column 10007(row,col).

but there is no crash.

The second file, named "LibreOffice-ImportError-FileWithColumnWidthUnspecifiedImportingWell", just opens.

I tried it in

Version: 6.4.0.3 (x64)
Build ID: b0a288ab3d2d4774cb44b62f04d5d28733ac6df8
CPU threads: 4; OS: Windows 10.0 Build 17763; UI render: GL; VCL: win; 
Locale: ru-RU (ru_RU); UI-Language: en-US
Calc: threaded
Comment 2 Artur Linhart 2020-03-07 14:31:40 UTC
OK, I did not express myself precisely. If I open the problematic file by double-clicking it, LO starts opening the document but then disappears without any message. Afterwards, LibreOffice cannot be started again until the hidden LibreOffice process is killed.
Comment 3 Artur Linhart 2020-03-07 14:33:55 UTC
My Libre Office version is 6.1.5.2
Comment 4 Roman Kuznetsov 2020-03-07 14:39:23 UTC
(In reply to Artur Linhart from comment #3)
> My Libre Office version is 6.1.5.2

Try a newer version, such as 6.3.5 or 6.4.1.
Comment 5 Artur Linhart 2020-03-07 15:18:34 UTC
Unfortunately, I am not able to upgrade to a higher version; there is no Debian package available for my operating system yet...
Comment 6 Artur Linhart 2020-03-07 15:19:47 UTC
But the problem is the same as described in your comment:

"File format error found at [mscx_uno bridge error] UNO type of C++ exception unknown: "o3tl.divide_by_zero", RTTI-name=".?AUdivide_by_zero@o3tl@@"!
SAXParseException: '[word/document.xml line 2]: unknown error', Stream 'word/document.xml', Line 2, Column 10007(row,col)."
Comment 7 Dieter 2020-03-16 10:00:09 UTC
(In reply to Artur Linhart from comment #5)
> Unfortunately, I am not able to upgrade to a higher version; there is no
> Debian package available for my operating system yet...

I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the bug is still present in the latest version. Change to RESOLVED WORKSFORME, if the problem went away.

BTW: Have you checked downloads on LO page? https://www.libreoffice.org/download/download/
Comment 8 NISZ LibreOffice Team 2020-03-26 09:17:22 UTC
Created attachment 159034 [details]
Screenshot of the problem in current bibisect-6.5 repo

File format error found at [mscx_uno bridge error] UNO type of C++ exception unknown: "o3tl.divide_by_zero", RTTI-name=".?AUdivide_by_zero@o3tl@@"!
SAXParseException: '[word/document.xml line 2]: unknown error', Stream 'word/document.xml', Line 2, Column 10007(row,col).

Repro with:
Version: 7.0.0.0.alpha0+ (x64)
Build ID: bc898e2c2784e36ad4d4cdf6d962e39069d2c82d
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: GL; VCL: win; 
Locale: en-US (hu_HU); UI-Language: en-US
Calc: CL
Comment 9 Xisco Faulí 2020-03-27 16:46:42 UTC
The error message at import time started to be displayed after https://cgit.freedesktop.org/libreoffice/core/commit/?id=975884fbbc3f80a634258ee562037688a42027a9

@Caolán, any idea why the error is displayed?
Comment 10 Caolán McNamara 2020-03-27 17:35:31 UTC
thrown from writerfilter/source/dmapper/DomainMapperTableManager.cxx:777
Comment 11 Caolán McNamara 2020-03-27 17:40:55 UTC
There's no crash as such in master, so I have no special insight. Maybe László has an idea why pCellWidths is filled with zeros and whether it should have different values, or whether we should not attempt to divide by zero and instead do something more sensible than throw when we find ourselves in this position.
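
The "something more sensible than throw" idea can be illustrated with a minimal sketch. This is not the code in DomainMapperTableManager.cxx and not the fix that was eventually committed; the class name ColumnWidthSketch, the method scaleWidths, and the equal-split fallback are all hypothetical, chosen only to show a guard around a division by the sum of the cell widths.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and fallback policy are assumptions,
// not LibreOffice's actual import logic.
class ColumnWidthSketch {
    /**
     * Scales the given cell widths so that they add up to tableWidth.
     * If all widths are zero, the sum is zero and a plain proportional
     * scaling would divide by zero, so the width is split equally instead.
     */
    static int[] scaleWidths(int[] cellWidths, int tableWidth) {
        int n = cellWidths.length;
        int[] result = new int[n];
        if (n == 0) {
            return result; // nothing to scale
        }
        long sum = 0;
        for (int w : cellWidths) {
            sum += w;
        }
        for (int i = 0; i < n; i++) {
            result[i] = (sum == 0)
                    ? tableWidth / n                                  // fallback: equal split
                    : (int) ((long) cellWidths[i] * tableWidth / sum); // proportional scaling
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scaleWidths(new int[] {0, 0, 0}, 9000)));   // [3000, 3000, 3000]
        System.out.println(Arrays.toString(scaleWidths(new int[] {1000, 3000}, 8000))); // [2000, 6000]
    }
}
```

Whether an equal split is the right fallback for a real importer is a separate design question; the sketch only shows that the zero-sum case can be detected before dividing.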
Comment 12 QA Administrators 2022-11-24 03:41:56 UTC Comment hidden (obsolete, spam)
Comment 13 Artur Linhart 2023-01-05 11:24:12 UTC
Hello, I have retested the attached file and the problem does not appear to be solved.

I now run a different Debian Linux release (bullseye; previously it was buster), and LO is also a different version. The exact version info is:

Version: 7.0.4.2
Build ID: 00(Build:2)
CPU threads: 16; OS: Linux 5.10; UI render: default; VCL: kf5
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Debian package version: 1:7.0.4-4+deb11u4
Calc: threaded

After opening the file by double-clicking it, the following message appears:


An error occurred during opening the file. This may be caused by incorrect file contents.
The error details are:
SAXException: [word/document.xml line 2]: unknown error ./sax/source/fastparser/fastparser.cxx:588
Proceeding with import may cause data loss or corruption, and application may become unstable or crash.
Do you want to ignore the error and attempt to continue loading the file?


If I answer "Yes", the file is opened without the table. If I select "No", the following message is displayed:


File format error found at C++ code threw N4o3tl14divide_by_zeroE: divide by zero ./bridges/source/cpp_uno/gcc3_linux_x86-64/uno2cpp.cxx:243
SAXParseException: '[word/document.xml line 2]: unknown error ./sax/source/fastparser/fastparser.cxx:588', Stream 'word/document.xml', Line 2, Column 10007 ./writerfilter/source/filter/WriterFilter.cxx:213(row,col).


LO no longer crashes; I am able to work with the app normally after both actions.
Comment 14 Commit Notification 2023-01-05 14:39:52 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/e17b4df3fe5441ca66e4203c725a578eb1797eb2

tdf#131203 DOCX import: fix lost table when w:tblGrid is missing

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
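
For anyone who wants to check whether one of their own files matches the condition named in the commit title, here is a minimal sketch using only the Java standard library. It assumes that "w:tblGrid is missing" means a w:tbl element in word/document.xml with no direct w:tblGrid child; the class name TblGridCheck and the method countTablesWithoutGrid are hypothetical, and the sketch has not been run against the attachments of this bug.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic helper; not part of LibreOffice or Apache POI.
class TblGridCheck {
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Counts w:tbl elements (including nested ones) that have no direct w:tblGrid child. */
    static int countTablesWithoutGrid(InputStream documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the *NS lookups below
        Document doc = factory.newDocumentBuilder().parse(documentXml);
        NodeList tables = doc.getElementsByTagNameNS(W_NS, "tbl");
        int missing = 0;
        for (int i = 0; i < tables.getLength(); i++) {
            boolean hasGrid = false;
            for (Node child = tables.item(i).getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && W_NS.equals(child.getNamespaceURI())
                        && "tblGrid".equals(child.getLocalName())) {
                    hasGrid = true;
                    break;
                }
            }
            if (!hasGrid) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java TblGridCheck <file.docx>");
            return;
        }
        try (ZipFile zip = new ZipFile(args[0])) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                System.err.println("word/document.xml not found");
                return;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                System.out.println("Tables without w:tblGrid: " + countTablesWithoutGrid(in));
            }
        }
    }
}
```

A nonzero count would only indicate that the file has the structure the commit title mentions; it does not by itself prove that a given LibreOffice build fails on it.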
Comment 15 László Németh 2023-01-05 14:40:59 UTC
@Artur & all: thanks for reporting and for the QA & feedback!
Comment 16 Commit Notification 2023-01-06 10:33:06 UTC
László Németh committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/e9a3755182db2a0e06278977e9d8af376ac4eefa

tdf#131203 DOCX import: fix lost table when w:tblGrid is missing

It will be available in 7.5.0.2.

Comment 17 Commit Notification 2023-01-10 11:54:41 UTC
László Németh committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/02de07ffd65d40d26bfc15783ec25030117d5761

tdf#131203 DOCX import: fix lost table when w:tblGrid is missing

It will be available in 7.4.5.

Comment 18 Xisco Faulí 2023-01-24 10:36:25 UTC
7.4.5 was a hotfix release; updating the target in the status whiteboard.
Comment 19 NISZ LibreOffice Team 2023-01-25 13:21:44 UTC
VERIFIED IN:
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 0bb90afaeb193181d7b98b79e962549d8a1dd85a
CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Vulkan; VCL: win
Locale: hu-HU (hu_HU); UI: en-US
Calc: CL threaded
Comment 20 Gabor Kelemen (allotropia) 2023-03-29 08:58:07 UTC
*** Bug 150943 has been marked as a duplicate of this bug. ***