Bug 58944 - FILEOPEN particular .docx wrongly shows lots of table heading rows to be repeated.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: target:6.2.0 target:6.3.0
Keywords:
Duplicates: 51785 55917 114715
Depends on:
Blocks: Writer-Tables
 
Reported: 2013-01-02 13:41 UTC by Laubrino
Modified: 2020-11-09 12:59 UTC

See Also:
Crash report or crash signature:


Attachments
test document (144.03 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-01-02 13:41 UTC, Laubrino
Sample DOCX (16.25 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2018-05-12 12:31 UTC, Aron Budea
Sample ODT (created in Word) (5.78 KB, application/vnd.oasis.opendocument.text)
2018-05-12 12:34 UTC, Aron Budea
test document saved in MSO as PDF (170.46 KB, application/pdf)
2018-10-09 08:50 UTC, Timur

Description Laubrino 2013-01-02 13:41:07 UTC
Created attachment 72382 [details]
test document

Please find a document attached. It is rendered incorrectly: checked with Google Docs, the document has only three pages, but LibreOffice shows a mess.
Comment 1 Rainer Bielefeld Retired 2013-01-02 14:22:55 UTC
LibO sees the first 34 table rows as table heading (I think there is no table heading to be repeated at all); I believe this table problem causes the page number problem.

AOOo 3.4.1 shows the same problem, so I think this problem is inherited from OOo.

@Laubrino:
Please find out and contribute info concerning your
- LibO Version, localization, UI language
- OS Version, localization, UI language
Are you sure that you have permission to publish that document here?
Comment 2 Laubrino 2013-01-08 08:29:56 UTC
LibreOffice version 3.6.3. We use it in headless mode, so I'm not sure about localization/language.
Tested on Windows 7.
Good question; we have permission to publish the document.
Thank you
Comment 3 bfoman (inactive) 2013-07-08 13:16:42 UTC
Confirmed with:
LO 4.2.0.0.alpha0
Build ID: 2013-06-24 own debug build 
Windows 7 Professional SP1 64 bit

Word 2010 - 5 pages document.
LO - 28 pages document, where only 1st page is displayed... 26 times. No other pages available.
Comment 4 Xisco Faulí 2014-03-31 16:06:58 UTC
This issue is still reproducible with:
   - Libreoffice 4.1.5.3 Build ID: 1c1366bba2ba2b554cd2ca4d87c06da81c05d24
   - Libreoffice 4.2.2.1 Build ID: 3be8cda0bddd8e430d8cda1ebfd581265cca5a0f
   - Libreoffice 4.3.0.0.alpha0 Build ID: b6a43bcbbf9e9a5655fd36fd4c8ef72d585f67b0
Comment 5 Joel Madero 2015-05-02 15:41:25 UTC Comment hidden (obsolete)
Comment 6 Buovjaga 2015-06-20 13:05:44 UTC
Word viewer shows 5 pages and LibO 28.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 3ecef8cedb215e49237a11607197edc91639bfcd
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2015-06-19_23:16:58
Locale: fi-FI (fi_FI)
Comment 7 QA Administrators 2016-09-20 10:10:21 UTC Comment hidden (obsolete)
Comment 8 Telesto 2016-12-06 20:44:28 UTC
Reproducible with:
Version: 5.4.0.0.alpha0+
Build ID: a9f56091b6422ec8c42f09b8472200ae4ab12548
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-12-05_23:12:26
Locale: nl-NL (nl_NL); Calc: CL
Comment 9 Mark Hung 2017-03-12 04:37:53 UTC
Extracting document.xml from the test document and examining it, we can see that <tblHeader/> is set for all the rows. This matches the description in Comment 9 of issue 88496, quoted below:

"What WORD seems to do in the same situation: as soon as it recognises that the header lines do not fit in the page portion reserved for the table, it considers the table as having no header, even if the header stays in place in the definition.

This raises a big problem: in WORD >= v.2007 it is very easy for the user to define a table with, say, 50 rows all in the header (just select the full table and click one button on the ribbon). WORD will disregard this config and render the table as having no header at all.
Pass this DOC to WRITER: the table rendering will be disrupted."

Either this issue depends on 88496 or one of them is a duplicate.
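
The inspection described in this comment can be scripted. The following is only a minimal sketch, assuming the table rows are direct children of each <w:tbl> element in word/document.xml; the function name header_row_flags is made up for illustration and is not part of LibreOffice or of any library.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, as used in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def header_row_flags(docx):
    """For each table, return one boolean per row: True if the row has
    <w:tblHeader/> switched on in its <w:trPr>.

    `docx` is a path or a binary file-like object. Rows wrapped in
    content controls (<w:sdt>) are not visited by this sketch.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    tables = []
    for tbl in root.iter(W + "tbl"):  # also yields nested tables
        flags = []
        for tr in tbl.findall(W + "tr"):  # direct child rows only
            header = tr.find(W + "trPr/" + W + "tblHeader")
            # An explicit w:val of 0/false/off switches the property off.
            value = header.get(W + "val", "true") if header is not None else "false"
            flags.append(value not in ("0", "false", "off"))
        tables.append(flags)
    return tables
```

Calling header_row_flags("test.docx") on a saved copy of the attachment (the file name is a placeholder) would list, per table, which rows carry the property, so a long run of True values at the start of a table is easy to spot.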
Comment 10 QA Administrators 2018-03-13 03:37:12 UTC Comment hidden (obsolete)
Comment 11 Aron Budea 2018-05-12 12:27:20 UTC
Still buggy in LO 6.1 alpha1.
Comment 12 Aron Budea 2018-05-12 12:31:17 UTC
Created attachment 142050 [details]
Sample DOCX

Attaching a minimal sample. The table should continue on the next page, but is truncated instead.
Comment 13 Aron Budea 2018-05-12 12:34:18 UTC
Created attachment 142051 [details]
Sample ODT (created in Word)

ODT saved in Word behaves the same.
Comment 14 Mike Kaganski 2018-05-31 10:19:24 UTC
*** Bug 64264 has been marked as a duplicate of this bug. ***
Comment 15 Mike Kaganski 2018-06-04 07:13:44 UTC
*** Bug 88496 has been marked as a duplicate of this bug. ***
Comment 16 Mike Kaganski 2018-06-04 07:14:54 UTC
This is not a filter deficiency; so removing the keyword.
Comment 17 Mike Kaganski 2018-06-04 09:25:59 UTC
There's no need to use Word to reproduce.
Steps:

1. Create a new text document in Writer.
2. Add a table (Table->Insert Table...; Ctrl+F12) with 100 rows and 99 heading rows (the dialog doesn't allow setting the number of heading rows equal to or greater than the total number of rows).

This creates a table which goes outside the page, and an "empty" second page is shown (with the single mandatory paragraph after the table).

3. In table properties (Table->Properties...), Text Flow tab, reduce the heading row count to 98.

This makes the table take two pages, each one with the table heading rows visible and going outside the page (apparent if you put some text into the cells), followed by a third "empty" page. What happens is that the layout code doesn't split the table within the heading rows; instead, it keeps outputting heading rows (going outside the page in the process) plus one normal row, and only then is it able to split the table, which it does (going to the next page); there, it again first outputs the 98 heading rows (again going outside the page) and one normal row.
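
The behaviour described in the steps above can be illustrated with a toy model. This is only a sketch of the described outcome, not Writer's layout code: it assumes every row has the same height, that a page holds a fixed number of rows (rows_per_page, an invented parameter), and it ignores the mandatory paragraph after the table.

```python
def paginate(total_rows, heading_rows, rows_per_page):
    """Toy model: return one list of row indices per page.

    Heading rows are never split and are repeated at the top of every
    page; at least one normal row is placed after them, even when the
    result no longer fits on the page.
    """
    headings = list(range(heading_rows))
    body = list(range(heading_rows, total_rows))
    if not body:
        return [headings]
    # Space left for normal rows, but never fewer than one per page.
    room = max(1, rows_per_page - heading_rows)
    pages = []
    while body:
        pages.append(headings + body[:room])
        body = body[room:]
    return pages


pages = paginate(total_rows=100, heading_rows=98, rows_per_page=40)
for number, rows in enumerate(pages, start=1):
    overflow = max(0, len(rows) - 40)
    print(f"page {number}: {len(rows)} rows, {overflow} beyond the page")
```

With 98 heading rows the model yields two pages of 99 rows each, and with 99 heading rows it yields a single overflowing page, which matches the two situations in steps 2 and 3.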
Comment 18 DiegoM 2018-06-04 15:12:26 UTC
As noted in bug 88496 
https://bugs.documentfoundation.org/show_bug.cgi?id=88496#c9
pass 7. :
To get the error condition, it is enough to produce a table where the header does not fit on the [first] page.
Comment 19 Mike Kaganski 2018-06-04 21:54:28 UTC
https://gerrit.libreoffice.org/55302
Comment 20 Aron Budea 2018-09-04 11:29:49 UTC
Not strictly related to DOC(X) formats (see comment 16 and comment 17); adjusting the respective fields accordingly.
Comment 21 László Németh 2018-09-18 13:39:13 UTC
(temporary) workaround: https://gerrit.libreoffice.org/#/c/60689/
Comment 22 László Németh 2018-09-20 06:41:20 UTC
Description of the suggested (temporary) workaround:

tdf#58944 DOCX import: workaround for hidden table headers

For repeating table headers consisting of more than 10 table rows, switch off table header repetition during DOCX table import, to fix non-visible table content and broken tables.

Repeating header lines are not visible in MSO if there is no space for them. The OOXML (and ODF) standards don't specify this exception and, unfortunately, it's easy to create tables with invisible repeating headers in MSO, resulting in OOXML files with non-standardized layout. To show the same or a similar layout in LibreOffice (instead of a broken table with invisible content), we use a reasonable 10-row limit for applying header repetition, as a workaround. Later it's still possible to switch header repetition on, or to create a more compatible repeating table header in Writer, for the (pretty unlikely) tables with genuinely repeating headers consisting of more than 10 table rows.

Note: This workaround could help to create standard and more portable OOXML files in a mixed environment.
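
The rule in this description can be written down in a few lines. The sketch below is not the patch itself (that lives in the C++ import filter); it only restates the 10-row limit, and it assumes that the header is the uninterrupted run of flagged rows at the top of the table. The names are invented for illustration.

```python
# Limit taken from the description above; the constant name is made up.
MAX_REPEATING_HEADER_ROWS = 10


def imported_header_row_count(header_flags, limit=MAX_REPEATING_HEADER_ROWS):
    """Return how many leading rows to repeat as the table header.

    `header_flags` has one boolean per table row (True = the row is
    marked as a repeating header). If the leading run of header rows is
    longer than `limit`, header repetition is switched off (0).
    """
    count = 0
    for is_header in header_flags:
        if not is_header:
            break
        count += 1
    return count if count <= limit else 0


# A two-row header survives; a 34-row "header" is dropped.
print(imported_header_row_count([True, True, False, False]))
print(imported_header_row_count([True] * 34 + [False] * 5))
```

Returning 0 rather than capping the count at 10 follows the wording "switch off table header repetition": an oversized header is treated as no header at all, not as a shortened one.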
Comment 23 László Németh 2018-09-20 06:41:32 UTC
More information:

MS Word has a serious problem with isolated rows that have the repeating header property, and it seems the suggested "solution" (for example, https://www.itsupportguides.com/knowledge-base/office-2013/word-2013-table-repeat-header-row-not-working/) is to remove the isolation by setting all previous table rows to table headers.

I was able to create tables with non-working and non-modifiable tblHeader settings easily in MSO 2016 by concatenating tables with repeating tables, and it seems to me that it's hard to do the same magic (in fact, bad layout) in LO. For example, the second page of the table in one of our original documents has no repeating header in MSO because of the isolated tblHeader, but the other pages of the same table have no such problem. Removing the isolated tblHeader by LO import/export fixes the missing header problem in MSO.
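
As a rough illustration of what "isolated" could mean here, the sketch below lists the rows that are marked as header but are not part of the uninterrupted run of header rows at the top of the table. That definition is an assumption made for this example, and the function name is invented.

```python
def isolated_header_rows(header_flags):
    """Return indices of rows flagged as header that come after the
    first non-header row, i.e. outside the leading header run."""
    isolated = []
    in_leading_run = True
    for index, is_header in enumerate(header_flags):
        if not is_header:
            in_leading_run = False
        elif not in_leading_run:
            isolated.append(index)
    return isolated


# Row 3 is flagged, but rows 0-2 are not all headers before it.
print(isolated_header_rows([True, True, False, True, False]))
```

A table that results from joining two tables, each with its own header row, would show up with a non-empty result under this definition.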
Comment 24 Commit Notification 2018-10-08 17:05:14 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=110781a3a27dffe9e6690839bdce993796a08331

tdf#58944 DOCX import: workaround for hidden table headers

It will be available in 6.2.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 25 Timur 2018-10-09 08:50:51 UTC
Created attachment 145506 [details]
test document saved in MSO as PDF

(In reply to Buovjaga from comment #6)
> Word viewer shows 5 pages and LibO 5.1 28 pages for attachment 72382 [details]
PDF from MSO attached. It's 5 pages after the fix. OK. 
(It's a 2007 DOCX; if resaved in MSO 2013 it wrongly shows 6 pages, but LO shows 5.)

(In reply to Aron Budea from comment #12)
> Created attachment 142050 [details]
> Attaching a sample DOCX. The table should continue on the next page, but is truncated instead.
It used to be 3 pages in LO, with a truncated table on the 2nd; after the fix it's 2 pages, with the table on the 1st continuing on the 2nd. MSO shows 3 pages, with the table starting on the 2nd.


I guess this fix also solved attachment 63889 [details] from bug 51785 and attachment 68489 [details] from Bug 55917 (same file). Table height is different (another issue) but it doesn't hang anymore. Please correct me if I'm wrong.

Also fixed attachment 138683 [details] from bug 114715, now 2 pages. 


(In reply to Mike Kaganski from comment #15)
> *** Bug 88496 has been marked as a duplicate of this bug. ***
I don't see how this is a duplicate. This one is about DOCX and that one about LO logic if table-header does fit in the page space.
Comment 26 Timur 2018-10-09 08:51:22 UTC
*** Bug 51785 has been marked as a duplicate of this bug. ***
Comment 27 Timur 2018-10-09 08:51:59 UTC
*** Bug 55917 has been marked as a duplicate of this bug. ***
Comment 28 Timur 2018-10-09 08:52:26 UTC
*** Bug 114715 has been marked as a duplicate of this bug. ***
Comment 29 Mike Kaganski 2018-10-09 08:55:13 UTC
(In reply to Timur from comment #25)
> (In reply to Mike Kaganski from comment #15)
> > *** Bug 88496 has been marked as a duplicate of this bug. ***
> I don't see how this is a duplicate. This one is about DOCX and that one
> about LO logic if table-header does fit in the page space.

Well - these *are* the duplicates; it's just that the patch from comment 24 applies a workaround specific to DOCX, by modifying the document's information, keeping the old problem for other formats (DOC and ODF), instead of changing the Writer layout logic to behave properly in that case...
Comment 30 Timur 2018-10-09 09:21:02 UTC
But shouldn't this one then be for the workaround, and Bug 88496 kept open for the full Writer layout logic? That makes sense from a bug handling perspective.
Comment 31 Mike Kaganski 2018-10-09 09:22:28 UTC
(In reply to Timur from comment #30)

If you want to close this bug, then definitely.
Comment 32 Timur 2018-10-15 10:37:40 UTC
László, is the work here done, or do you intend to go on?
Please comment on that: either you close this one as fixed and backport to 6.1, and we reopen Bug 88496, or you deal with the issues from Bug 88496.
Comment 33 László Németh 2018-12-01 10:13:13 UTC
Timur, Mike: I've added a better explanation to the fix: https://gerrit.libreoffice.org/#/c/64388/1/writerfilter/source/dmapper/DomainMapperTableManager.cxx . I'm continuing its backport, closing this issue, and reopening the suggested one. Thanks for your help and suggestion!
Comment 34 Commit Notification 2018-12-01 10:27:20 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/b1dd670876cfdd3522de546e6eb4925bcde35a6b%5E%21

tdf#58944: comment DOCX table header row limit better

It will be available in 6.3.0.
