Bug 31914 - FILEOPEN: incorrectly formatting 2007 DOCX; missing first page text shape content with Content Controls
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: low normal
Assignee: Not Assigned
URL:
Whiteboard: interoperability
Keywords: filter:docx
Depends on:
Blocks: DOCX-Textbox
Reported: 2010-11-25 03:21 UTC by Marwan Gedeon
Modified: 2023-03-15 21:01 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
ReferenceDocument.docx (32.31 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2010-11-25 03:21 UTC, Marwan Gedeon
Microsoft_vs_OO_vs_Symphony.png (176.76 KB, image/png)
2010-11-25 03:22 UTC, Marwan Gedeon
first page in daily 20160420 (8.08 KB, image/png)
2016-04-21 20:55 UTC, Cor Nouws
ReferenceDocument.docx saved in MSO 2013 (43.14 KB, application/vnd.openxmlformats-officedocument.wordprocessingml)
2018-01-09 12:01 UTC, Timur
screen print of first three pages in MsWord 2010 (129.73 KB, image/png)
2019-01-10 10:40 UTC, Cor Nouws
screen print of first three pages in LibreOffice master 63 20190106 (149.48 KB, image/png)
2019-01-10 10:41 UTC, Cor Nouws

Description Marwan Gedeon 2010-11-25 03:21:14 UTC
Created attachment 40567
ReferenceDocument.docx

The first page of the attached document is not read on import, and the column formatting is wrong, causing text to wrap in many places. This is a DOCX document from MS.
The same formatting issue shows up in Go-OO 3.2, but there the first page is imported. Also attached is a screenshot comparing MS vs. Go-OO 3.2 vs. IBM Lotus Symphony.
Comment 1 Marwan Gedeon 2010-11-25 03:22:46 UTC
Created attachment 40568
Microsoft_vs_OO_vs_Symphony.png

Comparison between the three document editors.
Comment 2 Don't use this account, use tml@iki.fi 2010-11-25 03:36:49 UTC Comment hidden (obsolete)
Comment 3 LeMoyne Castle 2011-06-12 18:04:41 UTC
Verified on 32-bit Ubuntu 10.04 in a recent build of 3.4 and in
LibreOffice 3.3.2 -- OOO330m19 (Build:202) -- tag libreoffice-3.3.2.2

The first page is missing and currency values wrap in the last column.
Comment 4 Björn Michaelsen 2011-12-23 11:34:42 UTC Comment hidden (obsolete)
Comment 5 Rainer Bielefeld Retired 2012-01-01 22:19:15 UTC
The problem has been confirmed by LeMoyne Castle.
The status of this bug report was wrongly modified by a bulk change.

@Cédric:
Please feel free to reassign (or reset Assignee to default) if it’s not your area or if provided information is not sufficient. Please set Status to ASSIGNED if you accept this Bug.
Comment 6 Rainer Bielefeld Retired 2012-01-01 22:30:41 UTC
Opening the sample document, I still see several problems with a parallel Dev-Installation of "LibreOffice 3.5.0 Beta2" - WIN7 Home Premium (64bit) English UI [Build-ID : 8589e48-760cc4d-f39cf3d-1b2857e-60db978].

The most visible problem is the badly damaged first page; the picture is missing.
Comment 7 Jorendc 2013-02-18 09:34:44 UTC
Still reproducible using Linux Mint 14 x64 with LibreOffice 4.0.0 and master Version 4.1.0.0.alpha0+ (Build ID: 07ee72672e6966dafccf21ca3349e428c2a9dd0).

I mark this bug following [1] as 'Major medium'.

Kind regards,
Joren

[1] https://wiki.documentfoundation.org/images/0/06/Prioritizing_Bugs_Flowchart.jpg

PS: please do not alter the version of LibreOffice of this bug report. The version mentioned in this bug should be the oldest version that can reproduce this behavior.
Comment 8 tommy27 2014-06-29 09:22:01 UTC Comment hidden (obsolete)
Comment 9 tommy27 2014-06-29 09:22:13 UTC Comment hidden (obsolete)
Comment 10 tommy27 2014-10-05 13:22:25 UTC
Retested under Win7x64.
Same issue: text is missing on the first page with LibO 4.3.2.2 and 4.4.0.0.alpha0+
Build ID: 268b9c10c9ff27c74678ace99762f28d58d33012
TinderBox: Win-x86@42, Branch:master, Time: 2014-10-02_23:35:24
Comment 11 QA Administrators 2014-10-23 17:31:36 UTC Comment hidden (obsolete)
Comment 12 QA Administrators 2015-12-20 16:09:44 UTC Comment hidden (obsolete)
Comment 13 Cor Nouws 2016-04-21 20:55:24 UTC
Created attachment 124558
first page in daily 20160420

Result for the first page in Version: 5.2.0.0.alpha0+
Build ID: e8425c48102321d4f5a8bd687c8ca1ac7bae797e
CPU Threads: 2; OS Version: Linux 4.2; UI Render: default; 
TinderBox: Linux-rpm_deb-x86@71-TDF, Branch:master, Time: 2016-04-20_16:49:44
Locale: nl-NL (nl_NL.UTF-8)
Comment 14 Telesto 2016-11-24 15:09:36 UTC
Confirming with:
Version: 5.3.0.0.alpha1+
Build ID: f965a629fba10ecba7bad938a0c1c9c3db1e510d
CPU Threads: 4; OS Version: Windows 6.2; UI Render: default; Layout Engine: new; 
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2016-11-23_00:13:10
Locale: nl-NL (nl_NL); Calc: CL
Comment 15 QA Administrators 2017-12-08 08:08:43 UTC Comment hidden (obsolete)
Comment 16 Timur 2018-01-09 12:01:11 UTC
Created attachment 138994
ReferenceDocument.docx saved in MSO 2013

In the original, older DOCX, the first-page content is a text shape with Content Controls.
If it is saved as a newer DOCX, the Content Controls are not available as such in LO, but the text is there.
So we can confirm the bug, but I set it to Low and I don't really expect it to be solved soon.
https://nikpatel.net/2010/04/11/add-content-controls-in-the-word-documents-for-the-open-xml-automation/
Comment 17 QA Administrators 2019-01-10 03:52:53 UTC Comment hidden (obsolete)
Comment 18 Cor Nouws 2019-01-10 10:18:47 UTC
Still an issue in Version: 6.3.0.0.alpha0+
Build ID: 3bf82348bc73797fec61997dc4268a322299b3ff
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2019-01-06_08:29:45
Locale: nl-NL (nl_NL.UTF-8); UI-Language: en-US
Calc: threaded
Comment 19 Cor Nouws 2019-01-10 10:40:55 UTC
Created attachment 148202
screen print of first three pages in MsWord 2010
Comment 20 Cor Nouws 2019-01-10 10:41:29 UTC
Created attachment 148203
screen print of first three pages in LibreOffice master 63 20190106
Comment 21 QA Administrators 2021-01-10 03:48:28 UTC Comment hidden (obsolete)
Comment 22 Jan Švanda 2022-05-28 20:45:25 UTC
Still present in:
Version: 7.4.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: bbec710bd25fc5da27636cde73fe4ab23c76904f
CPU threads: 12; OS: Windows 10.0 Build 19043; UI render: Skia/Vulkan; VCL: win
Locale: cs-CZ (cs_CZ); UI: en-US
Calc: CL

The column formatting issues seem mostly gone, but the text on the first page of the original reference document is still missing. The text in the resaved document is present, but without any formatting.
Comment 23 Justin L 2023-03-15 21:01:09 UTC
Yeah - we can't put content controls into an editeng textbox.
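
For triage, a minimal sketch of how one might check whether a given DOCX has the structure described in comments 16 and 23, that is, content controls nested inside a text box. It assumes the content controls are stored as w:sdt elements and the text box body as w:txbxContent in word/document.xml; the function name count_sdt_in_textboxes is hypothetical and not part of LibreOffice or any library. Files that carry both a DrawingML text box and a VML fallback may report the same control twice.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_sdt_in_textboxes(docx):
    """Count w:sdt elements found inside w:txbxContent elements.

    Hypothetical helper for illustration. `docx` may be a file path or a
    binary file-like object; only word/document.xml is inspected.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    total = 0
    # Every text box body, whether it sits under VML or DrawingML markup.
    for textbox in root.iter(f"{{{W_NS}}}txbxContent"):
        # Every content control anywhere below that text box body.
        total += sum(1 for _ in textbox.iter(f"{{{W_NS}}}sdt"))
    return total
```

Calling count_sdt_in_textboxes("ReferenceDocument.docx") on the attached file should give a non-zero count if the first-page text shape really holds content controls; a result of zero would suggest the missing text has a different cause.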