Bug 61363 - [docx] Huge header image (not seen in MSWord) takes over page in LO.
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:docx
Depends on:
Blocks: DOCX
Reported: 2013-02-23 17:56 UTC by Florian Reisinger
Modified: 2024-05-31 17:38 UTC

See Also:
Crash report or crash signature:


Attachments
Testkit (.zip) (2.39 MB, application/zip)
2013-02-23 17:56 UTC, Florian Reisinger
PDFs showing output rendering under v3304, v3462, v3572, v3672, v4062, and v4132. (2.09 MB, application/zip)
2013-12-06 12:31 UTC, Owen Genat (retired)
office docx document for import testing (1.21 MB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2014-06-03 11:39 UTC, jean-paul
tdf61363_hiddenImage.docx: reduced example, modified in MSWord 2003 (41.03 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2017-11-02 11:36 UTC, Justin L
tdf61363_hiddenImage-Word2003.pdf: it is an empty-looking page in MS Word (4.72 KB, application/pdf)
2021-03-26 18:56 UTC, Justin L

Description Florian Reisinger 2013-02-23 17:56:07 UTC
Created attachment 75417 [details]
Testkit (.zip)

Download the testkit. It contains:

1) The original document and the output of
LibO 3.5.5.3 Portable | 4.0.0.3 | 4.0.1.1 | 4.1.0.0.alpha0+ | Word 2010, tested on Windows 7 x64.

Compare, for example (other differences are more obvious):
Number of pages:
3553: 15
4003: 54
4011: 54
4100a+ : 47
Word 2010: 14

Page of the table of contents:
3553: 4
4003: 1
4011: 1
4100a+ : 1
Word 2010: 2

Images on page 1:
3553: 2 (false one)
4003: 3
4011: 3
4100a+ : 3
Word 2010: 2

Page on which the yellow picture appears (page 1 in Word 2010):
3553: 3
4003: 1 (on top of the table of contents...)
4011: 1 (on top of the table of contents...)
4100a+ : 1 (on top of the table of contents...)
Comment 1 Florian Reisinger 2013-02-23 17:57:56 UTC Comment hidden (obsolete)
Comment 2 Florian Reisinger 2013-02-23 17:59:42 UTC Comment hidden (obsolete)
Comment 3 Owen Genat (retired) 2013-12-06 12:31:30 UTC
Created attachment 90353 [details]
PDFs showing output rendering under v3304, v3462, v3572, v3672, v4062, and v4132.

I have tested rendering on screen and export to PDF of the provided DOCX under Ubuntu 10.04 x86_64 running:

- v3.3.0.4 OOO330m19 Build: 6
- v3.4.6.2 OOO340m1 Build: 602
- v3.5.7.2 Build ID: 3215f89-f603614-ab984f2-7348103-1225a5b
- v3.6.7.2 Build ID: e183d5b
- v4.0.6.2 Build ID: 2e2573268451a50806fcd60ae2d9fe01dd0ce24
- v4.1.3.2 Build ID: 70feb7d99726f064edab4605a8ab840c50ec57a

Resultant PDFs are attached to add to the initial set. Number of pages: (a) initially indicated in status bar upon opening document; (b) after waiting for pagination update; (c) after paging to document end:

v3.3.0.4: (a) 19; (b) 19; (c) 15.
v3.4.6.2: (a) 19; (b) 19; (c) 19.
v3.5.7.2: (a) 19; (b) 15; (c) 15.
v3.6.7.2: (a) 18; (b) 55; (c) 55.
v4.0.6.2: (a)  8; (b) 54; (c) 54.
v4.1.3.2: (a) 19; (b) 49; (c) 49.

On-screen rendering, as originally reported, is and remains a mess.
Comment 4 Owen Genat (retired) 2013-12-06 12:34:34 UTC
As a result of comment #3, confirmed. Status set to NEW. Version set to Inherited From OOo as even v3.3.0.4 exhibits filter issues.
Comment 5 jean-paul 2014-06-03 11:39:37 UTC
Created attachment 100354 [details]
office docx document for import testing
Comment 6 jean-paul 2014-06-03 12:00:01 UTC
The attached .docx document can't be opened correctly with LibreOffice 4.2.x.
The displayed document has only one page, which corresponds to nothing in the original.

The attached .docx document can be opened with LibreOffice 3.5.5 and saved as a .docx, .doc or .odt document.

The saved .docx differs significantly from the original, including in page count, when opened with LibreOffice 4.2, as it does with LibreOffice 3.5.
The page count shown on opening updates when browsing to the end of the document.

The saved .doc has no page count bug and looks fine when opened with LO 3.5. It still has some formatting issues with LO 4.2.

The saved .odt (created with LO 3.5.5) looks like the .doc in both LO versions.
Comment 7 Florian Reisinger 2014-06-04 15:01:54 UTC
Comment on attachment 100354 [details]
office docx document for import testing

Tested on Win7 x64 with Version: 4.4.0.0.alpha0+
Build ID: c0190efe9e2e27bd60fbf7e35a698e1e3c4ef77c
TinderBox: Win-x86@39, Branch:master, Time: 2014-06-02_07:51:43

Tested with the document from testkit.zip.
Number of pages: (a) initially indicated in status bar upon opening document; (b) after waiting for pagination update; (c) after paging to document end:
(a) 19; (b) 19 (after scrolling down a bit: 49); (c) 49.
Page of the table of contents: 2-4
Images on page 1: 2
Page on which the yellow picture appears (page 1 in Word 2010): 1
https://bugs.freedesktop.org/attachment.cgi?id=100354 has 1 page in LibO and 32 in Word and looks totally different; it would be nice to have a separate bug for that.
(Opened #79639, marked attachment obsolete.)
Comment 8 QA Administrators 2015-06-08 14:42:40 UTC Comment hidden (obsolete)
Comment 9 Buovjaga 2015-08-02 10:19:24 UTC
Now the first page is ok, but the header image problem and the page splitting problem remain.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 902255645328efde34ddf62227c8278e8dd61ff0
TinderBox: Win-x86@39, Branch:master, Time: 2015-07-30_03:52:07
Locale: en-US (fi_FI)
Comment 10 QA Administrators 2016-09-20 10:21:32 UTC Comment hidden (obsolete)
Comment 11 Justin L 2017-11-02 11:36:52 UTC
Created attachment 137453 [details]
tdf61363_hiddenImage.docx: reduced example, modified in MSWord 2003

The main problem remaining in this document is that an "as character" image in the header shows up in LO, but is hidden in MSWord.

In Word, you can see the image by going into the header, and pressing delete once.  Good luck Tamas.
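
For anyone triaging similar files, here is a minimal sketch (not part of the original analysis) that lists inline ("as character") DrawingML images found in the header parts of a .docx, together with their declared size. It assumes headers are stored as word/header*.xml and that the image is a wp:inline drawing. A file last saved by an older Word may store the picture as VML (w:pict) instead, which this sketch does not detect. The function name header_inline_images is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used for wp:inline / wp:extent in WordprocessingML.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
EMU_PER_CM = 360000  # English Metric Units per centimetre


def header_inline_images(docx):
    """Return (part_name, width_cm, height_cm) for each inline drawing
    found in the header parts of a .docx (path or file-like object).

    Assumption: header parts are named word/header*.xml.
    """
    found = []
    with zipfile.ZipFile(docx) as zf:
        for name in sorted(zf.namelist()):
            if not (name.startswith("word/header") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            # wp:inline marks a drawing anchored "as character".
            for inline in root.iter(f"{{{WP_NS}}}inline"):
                extent = inline.find(f"{{{WP_NS}}}extent")
                if extent is None:
                    continue
                cx = int(extent.get("cx"))  # width in EMU
                cy = int(extent.get("cy"))  # height in EMU
                found.append((name, cx / EMU_PER_CM, cy / EMU_PER_CM))
    return found
```

Calling header_inline_images("tdf61363_hiddenImage.docx") on the attachment would show whether a large inline drawing is declared in a header part. It says nothing about why Word hides the image.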
Comment 12 QA Administrators 2019-02-10 03:43:59 UTC Comment hidden (obsolete, spam)
Comment 13 Justin L 2020-04-11 17:40:02 UTC
Confirmed: huge image in header still seen in LO 7.0 master.
Comment 14 Justin L 2021-03-26 18:56:39 UTC
Created attachment 170772 [details]
tdf61363_hiddenImage-Word2003.pdf: it is an empty-looking page in MS Word

Yes, the document from comment 11 just looks like a normal, empty page in Word. (I added some text to show the relative size of the header/body.)
Comment 15 Justin L 2021-03-26 18:58:19 UTC
... and I forgot to say: repro 7.2+ (despite some fixes related to not showing images outside of document boundaries.)
Comment 16 QA Administrators 2023-03-27 03:19:53 UTC Comment hidden (obsolete, spam)
Comment 17 Justin L 2024-05-31 17:38:07 UTC
repro 24.8+