Bug 55946 - FILEOPEN: Word document with multiple objects/frames on cover page formatted wrong
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.4.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: interoperability target:7.6.0
Keywords: filter:doc, filter:docx, preBibisect, regression
Duplicates: 85309 129336 142475
Depends on:
Blocks: DOC-Frames DOC-Images
Reported: 2012-10-13 09:01 UTC by Kalle Tuulos
Modified: 2023-05-29 21:43 UTC (History)
CC List: 12 users

See Also:
Crash report or crash signature:


Attachments
Zip containing PDF presenting how the Word document should look like (2.85 MB, application/zip)
2012-10-13 09:08 UTC, Kalle Tuulos
Side-by-side screenshots of how a cover page looks in Word and LibreOffice. (228.28 KB, image/png)
2012-10-30 20:21 UTC, Linus Drumbler
image showing how the page looks (~fine) in LO 3304 (17.98 KB, image/png)
2016-09-13 15:02 UTC, Cor Nouws
LO 5.1.5.2 screendump (156.96 KB, image/png)
2016-09-14 06:44 UTC, Kalle Tuulos
Screen capture from Libre Office 5.3.4.2 (171.25 KB, image/png)
2017-07-19 08:11 UTC, Kalle Tuulos
Original DOC of 277 pages (6.38 MB, application/msword)
2020-03-10 11:25 UTC, Timur
Original DOC with just pages 1-2 from MSO (149.00 KB, application/msword)
2020-03-10 11:30 UTC, Timur
Original DOC pages 1-2 saved as DOCX in MSO (60.48 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-03-10 11:32 UTC, Timur
Original DOC pages 1-2 as PDF from MSO (58.86 KB, application/pdf)
2020-03-10 11:33 UTC, Timur
DOC page 1 compared MSO LO (153.26 KB, image/png)
2020-03-10 11:56 UTC, Timur
DOCX pages 1-2 compared MSO LO (155.13 KB, image/png)
2020-03-10 12:01 UTC, Timur

Description Kalle Tuulos 2012-10-13 09:01:00 UTC
3GPP documentation is a good example of Word documents that are not at all comfortable to read with LibreOffice (or OpenOffice - this error has been there "always").

Attached is 3GPP specification 25.133 v11.2.0 as both the original Word document and a PDF exported from MS Word 2010. The original Word document can also be fetched from here: http://www.3gpp.org/ftp/Specs/2012-09/Rel-11/25_series/25133-b20.zip

If the Word document is opened in any MS Word installation (on almost any computer, with almost any Word version), the document looks the same. The cover page looks good and the page numbers stay the same. The attached PDF shows how the document should look.

But when the same document is opened in LibreOffice (for example 3.6.1.2, Build ID e29a214), the cover page formatting is messed up, and other pages are also formatted slightly differently. This causes, e.g., page numbers to run slightly differently than in Word. As far as I remember, these formatting problems have always appeared in all OpenOffice versions, with all 3GPP documents. Thus it is not comfortable to read 3GPP documents with LibreOffice, and companies working on 3GPP-related development have to stick with MS Word.

The attached ZIP file (25133-b20.zip) contains the following:
- 3GPP specification document in original Word format
- PDF document created with MS Word showing how the document should look
Comment 1 Kalle Tuulos 2012-10-13 09:06:45 UTC Comment hidden (obsolete)
Comment 2 Kalle Tuulos 2012-10-13 09:08:35 UTC
Created attachment 68513
Zip containing PDF presenting how the Word document should look like
Comment 3 Linus Drumbler 2012-10-30 20:21:41 UTC
Created attachment 69332
Side-by-side screenshots of how a cover page looks in Word and LibreOffice.

Reproduced with Windows 7 and version 3.6.2.2, see attachment. I was actually thinking about submitting this bug a while ago.
Comment 4 Linus Drumbler 2012-10-30 22:39:11 UTC Comment hidden (obsolete)
Comment 5 Linus Drumbler 2013-01-06 00:03:00 UTC Comment hidden (no-value)
Comment 6 Linus Drumbler 2013-01-21 20:19:26 UTC Comment hidden (no-value)
Comment 7 Petr Mladek 2013-01-23 15:08:26 UTC
It is rather a feature request.
Comment 8 Linus Drumbler 2013-01-23 20:20:06 UTC Comment hidden (obsolete)
Comment 9 Cédric Bosdonnat 2014-01-20 08:57:48 UTC Comment hidden (obsolete)
Comment 10 Anders Ripa 2014-01-30 19:58:26 UTC
The problem with 3GPP files still exists in the current version; another example
is http://www.3gpp.org/ftp/Specs/archive/33_series/33.102/33102-b00.zip
Comment 11 tommy27 2014-10-22 16:21:17 UTC
*** Bug 85309 has been marked as a duplicate of this bug. ***
Comment 12 tommy27 2014-10-22 16:22:36 UTC
tested under Win7x64.

bug not present in LibO 3.3.4
bug present in LibO 3.4.3 and following releases including 4.3.2 and recent 4.4.x master.

compare screenshots in attachment 108217

edited version field
Comment 13 Robinson Tryon (qubit) 2015-12-10 01:26:40 UTC Comment hidden (obsolete)
Comment 14 Jean-Baptiste Faure 2016-04-15 10:44:06 UTC
According to comment #12, this is not an enhancement request.

In LO 5.1.3.0.0+, I see 279 pages instead of the 277 in the PDF. Nothing that prevents working with this document. The only real problem is the cover page, which is messed up. Same problem in current master.
Tested under Ubuntu 15.10 x86-64.

Best regards. JBF
Comment 15 Xisco Faulí 2016-09-13 10:55:04 UTC Comment hidden (obsolete)
Comment 16 Cor Nouws 2016-09-13 15:02:53 UTC
Created attachment 127298
image showing how the page looks (~fine) in LO 3304
Comment 17 Kalle Tuulos 2016-09-14 06:44:01 UTC
Created attachment 127326
LO 5.1.5.2 screendump

Created a new attachment showing the document cover page in LibreOffice Writer, version 5.1.5.2. The problem still exists, sorry.
Comment 18 Kalle Tuulos 2017-07-19 08:11:14 UTC
Created attachment 134726
Screen capture from Libre Office 5.3.4.2

Just bumping this up with a new screen capture from LibreOffice 5.3.4.2.
The error still exists.
Comment 19 QA Administrators 2018-10-02 02:52:51 UTC Comment hidden (obsolete)
Comment 20 ni shengyue 2019-12-12 02:52:00 UTC
*** Bug 129336 has been marked as a duplicate of this bug. ***
Comment 21 ni shengyue 2019-12-12 02:54:53 UTC Comment hidden (obsolete)
Comment 22 Timur 2020-03-10 11:25:33 UTC
Created attachment 158544
Original DOC of 277 pages

Since the original DOC is no longer available through the link, I am adding it here.
Various LO versions open it with a different number of pages. It's 274 in 7.0+, and it opens rather slowly.
Comment 23 Timur 2020-03-10 11:30:58 UTC
Created attachment 158545
Original DOC with just pages 1-2 from MSO

Makes no sense to open 277 pages just to show cover. 
Here is DOC with just pages 1-2 saved in MSO.
Comment 24 Timur 2020-03-10 11:32:21 UTC
Created attachment 158546
Original DOC pages 1-2 saved as DOCX in MSO

Let's also keep a DOCX, although the bug is about DOC.
Comment 25 Timur 2020-03-10 11:33:16 UTC
Created attachment 158547
Original DOC pages 1-2 as PDF from MSO
Comment 26 Timur 2020-03-10 11:56:23 UTC
Created attachment 158548
DOC page 1 compared MSO LO

As can be seen, OO and LO 3.4.0 were not perfect, but the rendering was spoiled in 3.4.x and never fixed up to 7.0+.
Comment 27 Timur 2020-03-10 12:01:29 UTC
Created attachment 158549
DOCX pages 1-2 compared MSO LO

DOCX is worse: LO opens the 2-page sample as 3 pages, all the way back to LO 3.3.
Comment 28 Justin L 2020-04-17 10:35:23 UTC
These shortened page 1/2 documents consist almost entirely of frames. So this will be frame anchoring problems, perhaps related to overlap.

There is also a page break here, to really mess you up if you substitute a relative-to-page anchor to something else.
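
One way to check the "almost entirely frames" observation on the DOCX sample (attachment 158546) is to count the paragraphs that carry direct frame properties (`w:framePr`) and tally their anchor settings. This is only a minimal sketch: the function name and the `sample.docx` file name are hypothetical, it inspects `word/document.xml` only, and it will miss frame properties inherited from a paragraph style as well as drawing-shape text boxes. It does not apply to the binary DOC file.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace, in ElementTree's "{uri}" notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def summarize_frames(docx):
    """Return (framed, total, anchors) for a DOCX path or file-like object.

    framed  - paragraphs with a direct w:framePr element
    total   - all paragraphs in word/document.xml
    anchors - Counter keyed by (hAnchor, vAnchor); a missing attribute is None
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    total = 0
    anchors = Counter()
    for para in root.iter(W + "p"):
        total += 1
        frame = para.find(W + "pPr/" + W + "framePr")
        if frame is not None:
            key = (frame.get(W + "hAnchor"), frame.get(W + "vAnchor"))
            anchors[key] += 1
    return sum(anchors.values()), total, anchors


# Hypothetical usage on a locally saved copy of the DOCX attachment:
# framed, total, anchors = summarize_frames("sample.docx")
# print(framed, "of", total, "paragraphs are frames;", dict(anchors))
```

A high share of framed paragraphs with page-relative anchors would be consistent with the anchoring theory, but it would not by itself show where the import goes wrong.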
Comment 29 Timur 2021-05-25 09:16:26 UTC Comment hidden (obsolete)
Comment 30 Banibrata Dutta 2021-05-25 10:54:57 UTC Comment hidden (obsolete)
Comment 31 Timur 2021-05-25 11:23:44 UTC
(In reply to Banibrata Dutta from comment #30)
> (In reply to Timur from comment #29)
> > *** Bug 142475 has been marked as a duplicate of this bug. ***
> 
> So it is such a long running issue and not a recent regression ?

You reported a DOC with screenshot attachment 172314. The previous situation is in screenshot attachment 158548.
It looks worse, but I don't see that it was ever OK, and it's from Linux (I see it in 5.2).
Comment 32 QA Administrators 2023-05-26 03:18:01 UTC Comment hidden (obsolete, spam)
Comment 33 Justin L 2023-05-26 16:08:00 UTC
Focusing on the main issue, where the top frame drops down over other text, that actually looked good in 3.4.1, but not in 3.4.2.

All duplicates really are 3GPP documents.

The 3GPP picture placement on this one is interesting. In MS Word 2010 I see it on the right side, but the "box" is huge - extending from the left edge of the page. Looking at the image properties doesn't show anything unusual, but pressing OK makes the image jump to the left side of the page (where LO used to show it before LO 7.4).

It changed in LO 7.4 with commit 1539a0fcd31f4ba7ef71adf4ae7761dc445199f5
Author: Justin Luth on Sat Aug 13 12:29:12 2022 -0400
    tdf#77964 doc import: 0x1 placeholder is for AS_CHAR
Comment 34 Commit Notification 2023-05-26 19:53:09 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d42f519be7bcc8e194899a6b3225bcee7e54bc16

tdf#60683 tdf#55946 doc import: use style's anchor info

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 35 Justin L 2023-05-29 21:43:56 UTC
(In reply to Justin L from comment #33)
> The 3GPP picture placement on this one is interesting. in MS Word 2010 I see
> it on the right side, but the "box" is huge - extending from the left edge
> of the page. Looking at the image properties doesn't show anything unusual,
> but pressing OK makes the image jump to the left side of the page (where LO
> used to show it before LO 7.4).
A simple round-trip doesn't fix it, but if I add one character into the frame and resave, then it looks OK in LO. (Oh, but if I remove that character, then it looks bad again...)

The frame that I assume belongs to the image has a huge cursor in it. Likely typing a character in here properly sizes the paragraph. I have to assume I just exposed a layout bug here. It has all the markings of layout issues anyway.