Bug 73499 - FILEOPEN: Linked Textbox Grouping Cause partially Missing Text and Messed Layout in LO Writer
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 4.1.4.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Attila Bakos (NISZ)
URL:
Whiteboard: target:7.4.0
Keywords: filter:docx
Depends on:
Blocks: DOCX-Textbox DOCX-Grouped-Shapes
Reported: 2014-01-11 13:32 UTC by Wilson
Modified: 2022-03-31 09:21 UTC (History)
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
text get missing in older doc file (208.46 KB, image/jpeg), 2014-01-11 13:32 UTC, Wilson
MSO 2007 linked textbox sample (16.66 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document), 2014-01-11 14:10 UTC, Wilson
MOS2007 linked textbox sample (doc) (28.00 KB, application/msword), 2014-01-11 14:15 UTC, Wilson
MSO 2003 Linked Textbox sample (25.50 KB, application/msword), 2014-01-11 15:23 UTC, Wilson
tdf73499.6alpha1.pdf - PDFs of the three test docs still incorrectly rendered by LO6.0 alpha1 (67.68 KB, application/pdf), 2017-10-20 06:28 UTC, Justin L
How it looks in Writer 7.4 (resaved in mso) (253.49 KB, image/jpeg), 2022-02-21 14:21 UTC, Attila Bakos (NISZ)
The bugdoc resaved version (38.33 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document), 2022-02-21 14:23 UTC, Attila Bakos (NISZ)

Description Wilson 2014-01-11 13:32:47 UTC
Created attachment 91859
text get missing in older doc file

As you can see, both MSO and LO support linked textboxes, where text can flow from one box into the next. With doc files, everything works acceptably, if not perfectly, until the textboxes are GROUPED. Depending on the MSO version the file was saved with, grouped and linked textboxes may get UNLINKED in different ways. In some cases the text is lost in an unrecoverable fashion, which is worse than a messed-up layout.

For doc files created with MSO 2003 and earlier, the text flow is broken into corresponding grouped but unlinked textboxes in LO. This could have been acceptable for viewing. But the breaking process consumes one character each time it breaks a text flow, which means every previously linked textbox loses its trailing character, except the last box. And there doesn't seem to be a way to recover the missing text in LO alone. (See the attachment for an example.)

For doc files generated by MSO 2007 (and above?), grouped linked textboxes get unlinked, and all the text is crammed into the first textbox in LO. The other textboxes are left empty. In addition, the first textbox is expanded vertically as far as necessary to avoid text overflow, which messes up the layout.

With docx files, all linked textboxes are broken in LO. The grouped ones are merged and enlarged horizontally, in an uncontrolled way, beyond the page border. Linked but ungrouped ones become empty, except for the first textbox, whose overflowing text is hidden.

As you may have noticed in daily publications, linked textboxes are a very basic and common tool in desktop publishing, and missing text is a real deal breaker.
We really need this to work correctly. Please fix this feature for the doc format.

Thank you,
Wilson.
Comment 1 Wilson 2014-01-11 14:10:56 UTC
Created attachment 91860
MSO 2007 linked textbox sample
Comment 2 Wilson 2014-01-11 14:15:08 UTC
Created attachment 91861
MOS2007 linked textbox sample (doc)
Comment 3 Wilson 2014-01-11 15:23:58 UTC
Created attachment 91863
MSO 2003 Linked Textbox sample

English text looks alright, since only a white space or line break is bitten off.
But when a double-byte character gets bitten off the same way as the white space or paragraph break, we start missing text in CJK languages.
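
The loss pattern described above can be modelled in a few lines of Python. This is only an illustration of the reported symptom (one trailing character dropped from every box except the last), not LibreOffice's import code. The function names and the fixed box sizes are hypothetical.

```python
def split_flow(text, box_sizes):
    """Split one linked text flow into per-box chunks.

    box_sizes gives the capacity (in characters) of every box except
    the last one; the last box receives whatever text remains.
    Hypothetical model, for illustration only.
    """
    chunks, pos = [], 0
    for size in box_sizes:
        chunks.append(text[pos:pos + size])
        pos += size
    chunks.append(text[pos:])
    return chunks


def simulate_unlink(chunks):
    """Model the reported bug: every box but the last loses its final character."""
    return [c[:-1] for c in chunks[:-1]] + chunks[-1:]


def lost_characters(chunks):
    """Return the characters that the model above drops."""
    return [c[-1:] for c in chunks[:-1]]


# English text that breaks at a space only loses white space.
english = split_flow("Hello world again", [6, 6])
print(simulate_unlink(english), lost_characters(english))

# CJK text has no spaces at the break points, so real characters are lost.
cjk = split_flow("日本語のテキスト", [3, 3])
print(simulate_unlink(cjk), lost_characters(cjk))
```

In this model the English sample loses only two spaces, while the CJK sample loses two visible characters, which matches the difference described in this comment.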
Comment 4 Joel Madero 2014-06-02 20:10:21 UTC Comment hidden (obsolete)
Comment 5 Buovjaga 2014-10-31 21:43:26 UTC
I reproduce with the doc & docx.

Win 7 64-bit Version: 4.4.0.0.alpha1+
Build ID: 56019dcb79475606952a954fe732a3109441ffec
TinderBox: Win-x86@39, Branch:master, Time: 2014-10-30_07:27:11
Comment 6 Buovjaga 2014-12-16 11:25:41 UTC Comment hidden (obsolete)
Comment 7 Cor Nouws 2014-12-21 20:16:32 UTC Comment hidden (obsolete)
Comment 8 Justin L 2014-12-22 05:16:14 UTC
In Bug 87348 I marked it as pre-bibisect. For the .docx part of this bug at least, it appears that it has never worked.

Just verified (using bug 87348 sample LinkedFramesBug.docx) that docx doesn't flow on versions:
-oldest (3.5.0): doesn't really support textboxes at all.
-last36onmaster: 3 textboxes shown - all text in the first box
-last40onmaster: no textbox or text seen at all
-last41onmaster: same as 3.6
-last42onmaster: same as 3.6
-last43onmaster: same as 3.6
Comment 9 QA Administrators 2016-01-17 20:03:03 UTC Comment hidden (obsolete)
Comment 10 Justin L 2017-10-20 06:28:38 UTC
Created attachment 137131
tdf73499.6alpha1.pdf - PDFs of the three test docs still incorrectly rendered by LO6.0 alpha1
Comment 11 QA Administrators 2018-10-21 02:50:22 UTC Comment hidden (obsolete)
Comment 12 Luke 2018-10-21 16:01:29 UTC
The issue with ungrouped textboxes was fixed in bug 87348. Grouped textboxes are still not working in Version: 6.2.0.0.alpha0+ (x64)
Build ID: 5e8cd8683d345b75297994b3f7aab851835eb124
Comment 13 Luke 2018-10-21 16:30:19 UTC
Let's track the .docx issue here. The .doc issue exposed by attachment 91861 was spun off into Bug 120755.
Comment 14 Attila Bakos (NISZ) 2022-02-21 14:21:52 UTC
Created attachment 178438
How it looks in Writer 7.4 (resaved in mso)

This ticket has two cases:
1) VML group shape (v:group) with linked textboxes >> still a missing feature, but VML is deprecated
2) DrawingML equivalent (resaved with Word 19) >> works with the patch, as in my screenshot (Gerrit: https://gerrit.libreoffice.org/c/core/+/130283)
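
To check which of the two cases a given DOCX falls into, a rough Python sketch along these lines may help. It is not part of the patch. The namespace URIs and the element names (v:group, v:textbox, wpg:wgp, wps:linkedTxbx) are my understanding of what Word writes and should be treated as assumptions to verify against the file at hand; the function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace URIs for VML and the Word 2010 DrawingML extensions.
NS = {
    "v": "urn:schemas-microsoft-com:vml",
    "wpg": "http://schemas.microsoft.com/office/word/2010/wordprocessingGroup",
    "wps": "http://schemas.microsoft.com/office/word/2010/wordprocessingShape",
}


def classify_grouped_textboxes(document_xml):
    """Count groups that contain textboxes, per markup flavour.

    VML: every v:group with at least one v:textbox descendant (nested
    v:group elements are each counted; linking itself is not checked).
    DrawingML: every wpg:wgp with at least one wps:linkedTxbx descendant.
    """
    root = ET.fromstring(document_xml)
    vml = [g for g in root.iter("{%s}group" % NS["v"])
           if g.find(".//v:textbox", NS) is not None]
    dml = [g for g in root.iter("{%s}wgp" % NS["wpg"])
           if g.find(".//wps:linkedTxbx", NS) is not None]
    return {"vml_groups_with_textboxes": len(vml),
            "drawingml_groups_with_linked_textboxes": len(dml)}


def classify_docx(path):
    """Run the classification on word/document.xml of a DOCX file."""
    with zipfile.ZipFile(path) as archive:
        return classify_grouped_textboxes(archive.read("word/document.xml"))
```

A call such as classify_docx("sample.docx") (file name hypothetical) returns the two counts. Files resaved by recent Word versions often carry both a DrawingML shape and a VML fallback for the same object, so both counts can be non-zero for a single group.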
Comment 15 Attila Bakos (NISZ) 2022-02-21 14:23:01 UTC
Created attachment 178439
The bugdoc resaved version
Comment 16 László Németh 2022-03-29 14:28:12 UTC
Fixed for DOCX in https://gerrit.libreoffice.org/c/core/+/130950. For DOC, there is a workaround: convert it to DOCX/DrawingML with MSO. So I suggest closing this issue and filing a new one for DOC, if needed.

@Attila: thanks for the fix!

@Wilson, Buovjaga, Justin, Luke: thanks for the bug report and bug handling!
Comment 17 Commit Notification 2022-03-29 14:30:51 UTC
Attila Bakos (NISZ) committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/f81800193a942b3f68c61a5cede634f3eeb47b1f

tdf#73499 DOCX import: fix grouped linked textbox

It will be available in 7.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 NISZ LibreOffice Team 2022-03-31 09:21:35 UTC
Verified in:
Version: 7.4.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: a3988b2d147a2442b348d58b79dbd6e71472b7af
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: hu-HU (hu_HU); UI: en-US
Calc: threaded