Bug 99616 - FILEOPEN DOCX Long text in 1-cell table is showing only 2 lines
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.4.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:5.4.0 target:5.3.0.2
Keywords: bibisected, bisected, filter:docx, regression
Depends on:
Blocks: DOCX-Tables
Reported: 2016-05-01 20:47 UTC by Steven Li
Modified: 2017-03-07 12:46 UTC

Attachments
DOCX file, with a single-cell table, containing long text (26.74 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-05-01 20:47 UTC, Steven Li
Screen shot of how the doc looks in LO Writer (185.38 KB, image/png)
2016-05-01 20:48 UTC, Steven Li
Screen shot of how the doc looks in MSWORD (177.43 KB, image/png)
2016-05-01 20:49 UTC, Steven Li
aTablePage.docx: a portion of attachment 120911 which is used as a unit test for bug 75573 (14.42 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-12-21 07:20 UTC, Justin L
aTablePage.pdf: export from MSWord 2013 (296.91 KB, application/pdf)
2016-12-22 08:30 UTC, Justin L
3rd paragraph is incomplete (90.75 KB, image/png)
2017-03-07 08:20 UTC, vihsa

Description Steven Li 2016-05-01 20:47:34 UTC
Created attachment 124767
DOCX file, with a single-cell table, containing long text

I received a rather simple document, with text written in a single-cell table. I suppose it is a crude way to apply a paragraph border, but the format itself is legitimate.

The long text shows up in LO Writer in a much shorter table that displays only 2 lines of text.

Enclosed please find the original DOCX file and screenshots of how it looks in MSWORD and LO Writer.
Comment 1 Steven Li 2016-05-01 20:48:49 UTC
Created attachment 124768
Screen shot of how the doc looks in LO Writer
Comment 2 Steven Li 2016-05-01 20:49:34 UTC
Created attachment 124769
Screen shot of how the doc looks in MSWORD
Comment 3 Buovjaga 2016-05-06 13:45:59 UTC
Confirmed.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.2.0.0.alpha1+
Build ID: 540fee2dc7553152914f7f1d8a41921e765087ef
CPU Threads: 8; OS Version: Linux 4.5; UI Render: default; 
Locale: fi-FI (fi_FI.UTF-8)
Built on April 30th 2016
Comment 4 Telesto 2016-11-28 17:08:15 UTC
Confirming with:
Version: 5.4.0.0.alpha0+ (x64)
Build ID: 7aa2b5a041df8e71a435cccbc79ee13799ec9138
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; Layout Engine: new; 
TinderBox: Win-x86_64@62-TDF, Branch:MASTER, Time: 2016-11-24_11:40:27
Locale: nl-NL (nl_NL); Calc: CL
Comment 5 Telesto 2016-12-12 15:03:51 UTC
Looks a bit like a regression, but not a perfect one.

Found in:
Version: 5.4.0.0.alpha0+
Build ID: 84f2ff67a7e404febf710b1dc7f66d06745c503f
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-12-09_23:20:01
Locale: nl-NL (nl_NL); Calc: CL

and in

Version: 4.4.0.3
Build ID: de093506bcdc5fafd9023ee680b8c60e3e0645d7
Locale: nl_NL

but not in the same way (though still way off) in
Version: 4.3.0.4
Build ID: 62ad5818884a2fc2e5780dd45466868d41009ec0

and not in (quite good actually, but not perfect)
Version: 4.1.0.4
Build ID: 89ea49ddacd9aa532507cbf852f2bb22b1ace28
Comment 6 Xisco Faulí 2016-12-13 10:48:10 UTC
Regression introduced by:

author	Miklos Vajna <vmiklos@collabora.co.uk>	2014-08-14 11:54:18 (GMT)
committer	Miklos Vajna <vmiklos@collabora.co.uk>	2014-08-14 13:55:44 (GMT)
commit d1278ef4849661b9ae0eb7aaf4d74fbf91ccaf11 (patch)
tree 07e1c063cbd015b90c8be638197ad6e15e531b07
parent ffdc8780eba3ec34e502b01b9a54401627ee25c5 (diff)

bnc#865381 DOCX import: handle <w:hideMark> table cell property

Adding Cc: to Miklos Vajna
Comment 7 Justin L 2016-12-21 07:20:30 UTC
Created attachment 129838
aTablePage.docx: a portion of attachment 120911 which is used as a unit test for bug 75573

There should be a newline paragraph between these tables. bibisect44max took me to the commit mentioned in comment 6, so I am adding this as a related test.
Comment 8 Justin L 2016-12-22 08:30:31 UTC
Created attachment 129857
aTablePage.pdf: export from MSWord 2013

(In reply to Justin L from comment #7)
> There should be a newline paragraph between these tables.

Actually this is all one big table, not multiple tables.  The "newline" is one big merged cell without borders.
Comment 9 Justin L 2016-12-23 06:21:21 UTC
From the 41 MB file https://www.ecma-international.org/news/TC45_current_work/tc45-2006-338.pdf:

2.3.15 hideMark (Ignore End Of Cell Marker In Row Height Calculation)
This element specifies whether the end of cell glyph shall influence the height of the given table row in the table. If it is specified, then only printing characters in this cell shall be used to determine the row height.

Typically, the height of a table row is determined by the height of all glyphs in all cells in that row, including the non-printing end of cell glyph characters. However, if these characters are not formatted, they are always created with the document default style properties. This means that the height of a table row cannot ever be reduced below the size of the end of cell marker glyph without manually formatting each paragraph in that run. 

In a typical document, this behavior is desirable as it prevents table rows from 'disappearing' if they have no content. However, if a table row is being used as a border (for example, by shading its cells or putting an image in them), then this behavior makes it impossible to have a virtual border that is reasonably small without formatting each cell's content directly. This setting specifies that the end of cell glyph shall be ignored for this cell, allowing it to collapse to the height of its contents without formatting each cell's end of cell marker, which would have the side effect of formatting any text ever entered into that cell. 

If this element is omitted, then the end of cell marker shall be included in the determination of the height of this row.
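
The rule quoted above can be pictured as a small calculation. This is an illustrative model only, not LibreOffice or Word code: the `Cell` type, its field names, and the point values are assumptions made up for the example, and each cell is treated as a single line of text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    # Heights (in points) of the printing glyphs in the cell; empty if there is no text.
    glyph_heights: List[float] = field(default_factory=list)
    # Height of the non-printing end-of-cell marker (assumed document default).
    marker_height: float = 12.0
    # True when the cell carries the hideMark property.
    hide_mark: bool = False


def cell_height(cell: Cell) -> float:
    """Height this cell contributes to its row under the quoted rule."""
    heights = list(cell.glyph_heights)
    if not cell.hide_mark:
        # Without hideMark the end-of-cell marker counts like any other glyph.
        heights.append(cell.marker_height)
    return max(heights, default=0.0)


def row_height(cells: List[Cell]) -> float:
    """A row is as tall as its tallest cell."""
    return max((cell_height(c) for c in cells), default=0.0)
```

In this model an empty cell keeps its row at the marker height, an empty cell with `hide_mark=True` lets the row collapse, and a single cell without hideMark is enough to keep the whole row at marker height.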
Comment 10 Justin L 2016-12-23 11:07:58 UTC
(In reply to Steven Li from comment #0)
> I received a rather simple document, with text written in a single-cell
> table.
This only appears to be a simple document; it is almost impossible to re-create by hand. There are actually 12 rows, not one (visible after turning on "show formatting" in MSWord), even though they act like a single cell.
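
The row count can also be checked without MSWord by reading the DOCX package directly. The following is a minimal sketch using only the Python standard library; the function names are made up for illustration, and it assumes rows are direct children of `w:tbl` (rows wrapped in other elements would be missed).

```python
import xml.etree.ElementTree as ET
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def summarize_tables(document_xml):
    """Return one (row_count, hide_mark_cell_count) tuple per table, in document order."""
    root = ET.fromstring(document_xml)
    summary = []
    for tbl in root.iter(f"{{{W_NS}}}tbl"):
        rows = tbl.findall("w:tr", NS)  # direct child rows only
        hidden = sum(
            1
            for row in rows
            for cell in row.findall("w:tc", NS)
            if cell.find("w:tcPr/w:hideMark", NS) is not None
        )
        summary.append((len(rows), hidden))
    return summary


def summarize_docx(path):
    """Read word/document.xml from a .docx file and summarize its tables."""
    with zipfile.ZipFile(path) as docx:
        return summarize_tables(docx.read("word/document.xml"))
```

Calling `summarize_docx` on a saved copy of the attachment would show how many rows each table has and how many cells carry `w:hideMark`.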

In fact, ALL of this hiding row stuff is hard to re-create. The internet says that rows are hidden by marking the font as hidden, but that doesn't seem to be true with hidemark.docx.  I haven't been successful at adding hidemark to a cleanroom document.

Apparently MSO still honours the minimum row size, while LO was forcing the row to the smallest possible size. Patch: https://gerrit.libreoffice.org/32380 (tdf#99616)
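
The difference between the two behaviours can be sketched roughly as follows. This only models what this comment describes; it is not the writerfilter code from the patch. The function names, the `min_height` parameter (standing in for the row's stated minimum height), and the `SMALLEST_ROW_HEIGHT` value are all assumptions for illustration.

```python
# Hypothetical stand-in for "the smallest possible size" (the value is arbitrary).
SMALLEST_ROW_HEIGHT = 1.0


def row_height_before_fix(content_height, min_height, hide_mark):
    """Old LO behaviour as described: a hideMark row is forced to the smallest size."""
    if hide_mark:
        # Content and the row's own minimum are both ignored, so long text is cut off.
        return SMALLEST_ROW_HEIGHT
    return max(content_height, min_height)


def row_height_after_fix(content_height, min_height):
    """MSO-like behaviour as described: the minimum row size still applies."""
    return max(content_height, min_height)
```

With a row holding 30 units of text and a stated minimum of 20, the first function returns 1.0 when hideMark is set, while the second returns 30.0.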
Comment 11 Commit Notification 2017-01-04 08:45:02 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1a58cdf8af1aba52ce0a376666dd7d742234d7cf

tdf#99616 writerfilter: hideMark shouldn't force min size

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Commit Notification 2017-01-05 04:31:57 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=3d5ccc1577ff89bd13c26a8cde787a39482a8b81&h=libreoffice-5-3

tdf#99616 writerfilter: hideMark shouldn't force min size

It will be available in 5.3.0.2.

Comment 13 vihsa 2017-03-07 08:18:21 UTC
Version: 5.4.0.0.alpha0+ / Build ID: febc116 / ls-4001 / android 5.1

the third paragraph is incomplete when compared with ms file screenshot.
Comment 14 vihsa 2017-03-07 08:20:03 UTC
Created attachment 131691
3rd paragraph is incomplete
Comment 15 Justin L 2017-03-07 09:36:56 UTC
(In reply to krihsna from comment #13)
> the third paragraph is incomplete when compared with ms file screenshot.

Krihsna,
  We normally only focus on a single bug per report. So, although the unit test may not be "perfect", the problem in the description has been corrected. Any further issues would result in a new bug report.  However, my 5.4 looks very different from yours (mine is basically correct), so likely this is a font substitution issue for you. Make sure you have the correct font installed and see if that helps before creating a new bug report.
Comment 16 vihsa 2017-03-07 12:21:27 UTC
(In reply to Justin L from comment #15)
> (In reply to krihsna from comment #13)
> > the third paragraph is incomplete when compared with ms file screenshot.
> 
> Krihsna,
>   We normally only focus on a single bug per report. So, although the unit
> test may not be "perfect", the problem in the description has been
> corrected. Any further issues would result in a new bug report.  However, my
> 5.4 looks very different from yours (mine is basically correct), so likely
> this is a font substitution issue for you. Make sure you have the correct
> font installed and see if that helps before creating a new bug report.

Justin,

Please excuse me, you are 100% correct: the three paragraphs are fully visible with various font substitutions.
Comment 17 Timur 2017-03-07 12:46:24 UTC
I set this to Verified.