Bug 116194 - table content from .DOCX shown as text in Writer
Summary: table content from .DOCX shown as text in Writer
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.3.0.4 release (earliest affected)
Hardware: All All
Importance: high major
Assignee: László Németh
URL:
Whiteboard: target:6.5.0 target:6.4.0.1 target:7.1.0
Keywords: bibisected, bisected, regression
Duplicates: 111679 120512 122608 125312 129729
Depends on:
Blocks: DOCX-Tables
Reported: 2018-03-05 09:06 UTC by nikisch
Modified: 2020-07-16 17:09 UTC (History)
CC List: 11 users

See Also:
Crash report or crash signature:


Attachments
doc with truble (14.97 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2018-03-05 09:07 UTC, nikisch
Screenshot this doc in other office (look ok) (45.53 KB, image/png)
2018-03-05 13:56 UTC, nikisch
Compare DOCX in MSO and LO (535.18 KB, image/jpeg)
2018-03-05 14:41 UTC, Timur

Description nikisch 2018-03-05 09:06:28 UTC
Description:
In LibreOffice, the table with checkboxes on the first page is gone.
In MSO and OnlyOffice everything looks OK.

Steps to Reproduce:
1. Open the attached document.

Actual Results:  
The whole document layout is broken: table content is shown as plain text.

Expected Results:
Everything should look normal


Reproducible: Always


User Profile Reset: No



Additional Info:


User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Comment 1 nikisch 2018-03-05 09:07:23 UTC
Created attachment 140342
doc with truble
Comment 2 Dieter 2018-03-05 10:25:10 UTC
Can you add an attachment that shows what the document should look like?
Comment 3 nikisch 2018-03-05 13:56:24 UTC
Created attachment 140353
Screenshot this doc in other office (look ok)
Comment 4 Timur 2018-03-05 14:41:33 UTC
Created attachment 140359
Compare DOCX in MSO and LO

Bugzilla is not "document based", as in "this document doesn't display nicely".
Bugzilla is "issue based", so a report must point at a single issue, after searching to make sure it is not a duplicate.
The bug report is not correct, but the issue exists: table content from .DOCX shown as text in Writer.
This DOCX is in the old 2007 format, but that's not the reason.
Comment 5 Timur 2018-03-05 14:53:59 UTC
This behavior started in 4.3, but the import wasn't correct before that either.
So it's not a regression, but a bibisect would be useful.
Comment 6 nikisch 2018-03-06 05:13:19 UTC
I do not know what the problem is. I hope it's just fixed.
There are many rtf with a similar problem. In them, too, the tables disappear, although they are present in openoffice 4.1.5 or mso or onlyoffice. Attach an example? Is this the same problem?
All these rtf are generated by the Russian program consultant +
Comment 7 Dieter 2018-03-06 08:21:28 UTC
> There are many rtf with a similar problem. In them, too, the tables
> disappear, although they are present in openoffice 4.1.5 or mso or
> onlyoffice. Attach an example? Is this the same problem?
> All these rtf are generated by the Russian program consultant +

I would open a new bug report for the rtf problem. See also the meta-bug 112765 about RTF Table bugs and enhancements
Comment 8 Justin L 2018-08-20 18:05:53 UTC
Used bibisect43max to identify commit cf33af732ed0d3d553bb74636e3b14c55d44c153
    Author:     Lubos Lunák
    CommitDate: Wed Apr 23 14:57:36 2014 +0200
    
handle w:gridBefore by faking cells (fdo#38414)
    
Docx's w:gridBefore means that there should be this given space in the table
grid before any cells come. But writer requires tables to be rectangular, so
the space needs to be faked using cells without border. So far so good, but
now reality in the form of the retarded overdesigned writerfilter comes.
The internal representation of table data (and not just one actually) is
pretty non-obvious and hard to modify, seems to be modelled just to follow
the parser data the way it comes. Moreover dmapper gets notified of w:gridBefore
only after cells in the row have been already processed. So after futile attempts
to add the fake cells somehow in dmapper I've eventually given up and hacked up
input handling to fake input as if the fake cells were actually there (which
was tedious to find out as well, but at least it's reasonably doable).

https://cgit.freedesktop.org/libreoffice/core/commit/?id=cf33af732ed0d3d553bb74636e3b14c55d44c153
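
The "faking cells" idea described in the commit message above can be illustrated with a small model. This is only a sketch under stated assumptions, not the writerfilter code: the Cell class and the pad_grid_before and row_width helpers are hypothetical names invented for illustration, and a single borderless filler cell spanning the skipped grid columns is just one possible way to make a row rectangular.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    text: str
    span: int = 1         # grid columns covered, like w:gridSpan
    filler: bool = False  # True for a borderless placeholder cell (hypothetical flag)


def pad_grid_before(cells: List[Cell], grid_before: int) -> List[Cell]:
    """Prepend one borderless filler cell that covers the skipped grid columns."""
    if grid_before <= 0:
        return list(cells)
    return [Cell("", span=grid_before, filler=True)] + list(cells)


def row_width(row: List[Cell]) -> int:
    """Total number of grid columns a row occupies."""
    return sum(cell.span for cell in row)


# A 3-column grid: the second row starts one grid column to the right.
rows = [
    pad_grid_before([Cell("A"), Cell("B"), Cell("C")], grid_before=0),
    pad_grid_before([Cell("D"), Cell("E")], grid_before=1),
]
```

After padding, both rows occupy the same number of grid columns, which is the rectangular shape the commit message says Writer requires.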
Comment 9 Timur 2019-01-09 18:05:42 UTC
*** Bug 122608 has been marked as a duplicate of this bug. ***
Comment 10 raal 2019-01-09 18:48:40 UTC
(In reply to Justin L from comment #8)
> Used bibisect43max to identify commit
> cf33af732ed0d3d553bb74636e3b14c55d44c153
>     Author:     Lubos Lunák
>     CommitDate: Wed Apr 23 14:57:36 2014 +0200
> 
> https://cgit.freedesktop.org/libreoffice/core/commit/
> ?id=cf33af732ed0d3d553bb74636e3b14c55d44c153


Adding Cc: to Luboš Luňák
Comment 11 Timur 2019-01-10 08:10:23 UTC
Another example: DOCX attachment 148176 from bug 122608, which should look like PDF attachment 148177.
Comment 12 Timur 2019-01-10 09:37:37 UTC
*** Bug 111679 has been marked as a duplicate of this bug. ***
Comment 13 Timur 2019-01-10 09:38:27 UTC
Also DOCX attachment 135443 from bug 111679.
Comment 14 Timur 2019-06-06 13:08:23 UTC
*** Bug 120512 has been marked as a duplicate of this bug. ***
Comment 15 Timur 2019-06-06 13:08:30 UTC
*** Bug 120256 has been marked as a duplicate of this bug. ***
Comment 16 Xisco Faulí 2019-06-06 14:03:51 UTC
regression from LibreOffice 4.3, I don't think it deserves high priority at this point...
Comment 17 Gabor Kelemen (allotropia) 2019-06-06 16:27:11 UTC
(In reply to Xisco Faulí from comment #16)
> regression from LibreOffice 4.3, I don't think it deserves high priority at
> this point...

It's actually quite a bad one when looking at the consequences. Several duplicates too - I'd say high priority is warranted.
Comment 18 Commit Notification 2019-12-08 14:55:00 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/da1f71edfc72928b07a569b98e2766a8a7de9d2a

tdf#116194 DOCX import: fix missing tables with w:gridBefore

It will be available in 6.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 19 László Németh 2019-12-08 14:59:14 UTC
Fixed in the following commits:

https://gerrit.libreoffice.org/plugins/gitiles/core/+/b2c6d2d961a6113d0f111fab45ae12a40d389a23%5E%21
https://gerrit.libreoffice.org/plugins/gitiles/core/+/b2c6d2d961a6113d0f111fab45ae12a40d389a23%5E%21

Their descriptions:
-----------------------------------------------------------------
tdf#116194 DOCX import: fix missing tables with w:gridBefore

Regression from the commit cf33af732ed0d3d553bb74636e3b14c55d44c153
"handle w:gridBefore by faking cells (fdo#38414)"

This patch replaces the previous fix with a better solution,
fixing tdf#38414 on the proposed DomainMapper level. (Note:
to reject the old fix completely, its follow-up commit w:gridAfter
will be handled in a similar way.)

Now the related regressions, tdf#111679, tdf#120512 and the complex
forms of tdf#116194, tdf#120256 and tdf#122608 are fixed, too.

----------------------------------------------------------------
fdo#38414 tdf#44986: DOCX table import: handle gridBefore/After

without serious regressions, ie. losing the import of complex
forms with multiple or nested tables.

Complete the fix for tdf#116194 (DOCX import: fix missing
tables with w:gridBefore) with handling gridAfter on
DomainMapper level.

This consists of also rejections (except their unit tests) of

commit cf33af732ed0d3d553bb74636e3b14c55d44c153
(handle w:gridBefore by faking cells (fdo#38414)) and

commit 1d1748d143ab4270a2ca1b5117852b1b1bb4c526 (Related:
tdf#44986 DOCX import: handle w:gridAfter by faking cells)
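
To check whether a document uses the row properties discussed in these commits, the WordprocessingML can be scanned for w:gridBefore and w:gridAfter inside w:trPr. The following is a minimal sketch using only the Python standard library. The SAMPLE fragment is hand-written for illustration and is not taken from any attachment of this bug, and grid_offsets is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written sample table (an assumption for illustration only).
SAMPLE = f"""<w:tbl xmlns:w="{W}">
  <w:tr><w:tc/><w:tc/><w:tc/></w:tr>
  <w:tr><w:trPr><w:gridBefore w:val="1"/></w:trPr><w:tc/><w:tc/></w:tr>
  <w:tr><w:trPr><w:gridAfter w:val="2"/></w:trPr><w:tc/></w:tr>
</w:tbl>"""


def _offset(tr, tag):
    """Read w:val of a row property such as gridBefore; 0 if it is absent."""
    el = tr.find(f"w:trPr/w:{tag}", NS)
    return int(el.get(f"{{{W}}}val", "0")) if el is not None else 0


def grid_offsets(xml_text):
    """Return (gridBefore, gridAfter) for every w:tr element in document order."""
    root = ET.fromstring(xml_text)
    return [
        (_offset(tr, "gridBefore"), _offset(tr, "gridAfter"))
        for tr in root.iter(f"{{{W}}}tr")
    ]


print(grid_offsets(SAMPLE))
```

For a real file, the same function could be given the contents of word/document.xml, read from the DOCX with zipfile.ZipFile(path).read("word/document.xml").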
Comment 21 Commit Notification 2019-12-08 21:31:14 UTC
László Németh committed a patch related to this issue.
It has been pushed to "libreoffice-6-4":

https://git.libreoffice.org/core/commit/70274f86cdc1c023ffdd0130c262c1479262d76b

tdf#116194 DOCX import: fix missing tables with w:gridBefore

It will be available in 6.4.0.1.

Comment 22 Timur 2019-12-09 12:27:12 UTC
attachment 140342 from this bug now opens with the table. There's a small difference in table cell content, but that's some other bug, maybe bug 94801.
attachment 148176, the DOCX from bug 122608, is also OK.
So is DOCX attachment 135443 from bug 111679, which used to look like attachment 135444.
And attachment 145601 from bug 120512.

Attachment 145308 from bug 120256 now has a table, but it may not be quite correct, so I'll reopen.

László and NISZ, *thanks* for this major fix (and fix^2) that affects many users. I set Verified.

Note: there's the nasty attachment 123770 from bug 104347 that's marked as a regression from tdf#44986, but it was explained there that it is not one: the commit just uncovered an existing layout problem. So it's still wrong.
Comment 23 László Németh 2019-12-10 18:01:34 UTC
@Timur: many thanks for the detailed verification and your kind feedback! I'll check the suggested bug 104347. I already fixed a similar problem a few months ago, and I hope that, with the upcoming table fixes, we will be able to fix that, too. Thanks again for your help!
Comment 24 Fahad Al-Saidi 2019-12-10 18:06:45 UTC
Hi @László, please consider looking into #120256 to complete your great work. Thanks :-)
Comment 25 NISZ LibreOffice Team 2019-12-20 09:36:03 UTC
*** Bug 125312 has been marked as a duplicate of this bug. ***
Comment 26 NISZ LibreOffice Team 2020-01-06 07:59:53 UTC
*** Bug 129729 has been marked as a duplicate of this bug. ***
Comment 27 Commit Notification 2020-07-16 17:09:19 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5087a64842b3e4c96905cc8a9304ec0154ea0d11

NFC tdf#116194 writerfilter: cleanup unused gridBefore pieces

It will be available in 7.1.0.
