Bug 84462 - [filesave]: docx Filter, doc-Filter and LibreOffice
Summary: [filesave]: docx Filter, doc-Filter and LibreOffice
Status: RESOLVED DUPLICATE of bug 48741
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: DOC-Limitations DOCX-Limitations
Reported: 2014-09-29 13:13 UTC by Thomas Krumbein
Modified: 2016-12-09 10:45 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
the orginal OpenDocument File (421.65 KB, application/vnd.oasis.opendocument.text)
2014-09-29 13:13 UTC, Thomas Krumbein
Reopen doc-File in LibreOffice (103.92 KB, image/png)
2014-09-29 13:14 UTC, Thomas Krumbein
REopen docx file in LibreOffice (105.61 KB, image/png)
2014-09-29 13:14 UTC, Thomas Krumbein
Reopen odt File - and original View. (111.44 KB, image/png)
2014-09-29 13:15 UTC, Thomas Krumbein
PDF exports from MS Word 2007 & 2013 for both DOC and DOCX (189.99 KB, application/zip)
2014-09-30 01:18 UTC, Yousuf Philips (jay) (retired)
German Description on how the original Document was created (1.55 MB, application/pdf)
2014-09-30 05:32 UTC, Thomas Krumbein

Description Thomas Krumbein 2014-09-29 13:13:08 UTC
Created attachment 107064 [details]
the orginal OpenDocument File

Both docx filters (MS Office XML and Office Open XML) do not really work with a cleanly formatted LibO template.

One example is attached: a German letterhead with a first page and a following page (both realized with page styles).
The first page has several text frames, all anchored to the page, a small graphic, and different text styles.
The original odt file was saved as docx (both ways) and as doc, and later reopened in LibreOffice (version 4.2.4 and version 4.3.1).
Pictures of the results are attached.

Main topic: none of these export filters can handle two different page styles, and the docx filter cannot handle text frames or pictures.

I also tried to view the documents in a Word viewer (MS Viewer): the docx could not be opened at all, and the doc looks similar to the attached picture.
Comment 1 Thomas Krumbein 2014-09-29 13:14:04 UTC
Created attachment 107065 [details]
Reopen doc-File in LibreOffice
Comment 2 Thomas Krumbein 2014-09-29 13:14:50 UTC
Created attachment 107066 [details]
REopen docx file in LibreOffice
Comment 3 Thomas Krumbein 2014-09-29 13:15:33 UTC
Created attachment 107067 [details]
Reopen odt File - and original View.
Comment 4 Terrence Enger 2014-09-29 19:31:42 UTC
Thomas,

Thank you for taking the time to report your problem and for sharing
your file with us.


Jay,

I am adding you to the cc of this report because you are our expert on
compatibility with MS Word formats.

As I look at this report and the attachments, my first thought is that
it must be broken down into smaller reports before we can make
progress.  But, a report for each thing I see seems extreme.

(1) Comparing the attachment "Reopen odt File - and original View." to
    the attachment "Reopen doc-File in LibreOffice":

    (a) Three lines of page heading on page 1 changed from full
        justification to left justification.

    (b) The inside address lost vertical space before.

    (c) Page number at lower right of page 1 moved farther down and
        right.

    (d) Horizontal rule and remainder of footing on page 1 moved up.

    (e) Second page lost the logo.

    (f) Text on page 2 moved down about 8 cm.

    (g) On second page, the minimal footer is replaced by the bigger
        footer from page 1.

(2) Comparing the attachment "Reopen odt File - and original View." to
    the attachment "REopen docx file in LibreOffice":

    (a) On page 1, most of the header--except for the logo--moved
        down, becoming text boxes in the body of the page.

    (b) Those text boxes retain approximately their positions relative
        to each other, but they are moved right on the page, perhaps
        by the width of the page margin.

    (c) Page 2 loses the logo from the page heading.

    (d) The body of page 2 is moved down about 8 cm.

    (e) On second page, the minimal footer is replaced by the bigger
        footer from page 1.

(3) Comparing the attachment "Reopen odt File - and original View." to
    what I see opening the original .odt in the LibreOffice from
    the daily dbgutil bibisect repository, version 2014-09-29:

    (a) The inside address has acquired a blue background.

Advice welcome.

Terry.
Comment 5 Cor Nouws 2014-09-29 21:23:23 UTC
Hi Thomas, Terry,

Thanks for reporting and the good pictures Thomas.
It clearly is a problem with the filters.
Now, how to resolve this issue, is a different one.

Various aspects here:
1 - in designing templates for LibreOffice (OpenOffice previously) one would do better to think about what works when saving as an MS format.
Using frames at the top typically works worse for me than using tables to position elements.

2 - the issue that the page flow from OpenOffice/LibreOffice does not work in MS Office is well known and reported quite often. Adding a hard return at the end of page 1 before Save As does help...
Luckily, since 4.0 (IIRC), LibreOffice supports different headers and footers for the first page and the following pages.

3 - Still, of course, there are things to improve in the filters. Despite what I wrote in 1 and 2, some are definitely good to have.
Now, for the convenience of people working in QA - recognising reported bugs - and developers - working on one issue at a time - it is necessary to have one issue per bug. So the problems need to be split. Terry already did great work there. Thanks :)  This also makes it easier to find duplicate issues.

4 - each separate reported issue could benefit from a test document that focusses on that one single issue.

Maybe I forgot something?
Again: thanks a lot,
Cor
Comment 6 Yousuf Philips (jay) (retired) 2014-09-30 01:18:38 UTC
Created attachment 107093 [details]
PDF exports from MS Word 2007 & 2013 for both DOC and DOCX

Hi Terrence,

You humble me with your kind words. :)

Well there are a number of problems in the doc and docx exports, but the biggest one with this sample file is that headers and footers are not being preserved.

With the doc, the original document's page 2 header and footer appear on page 3, and page 2 has the header spacing of page 1 without any content and the footer has the content from page 1.

With the docx, the original document's page 2 header and footer are lost, so pages 2 and possibly 3 have page 1's header spacing without any content and page 1's footer with content.

I think this bug should be changed to cover solving this issue. My tests were done on master (2014-09-27) against Office 2007 and 2013.

I've attached PDFs of both doc and docx exports rendered in Word 2007 and 2013 as the screenshots provided only show 2 pages when 3 pages are displayed in MS Word. Another reason for attaching them is that exports from 2007 and 2013 are different (primarily the docx).
Comment 7 Thomas Krumbein 2014-09-30 05:32:58 UTC
Created attachment 107098 [details]
German Description on how the original Document was created

Hi all,

I have added a German description of how the original template (*.ott) of the sample file was created. This may help to understand the basic structure of the document.

You have mentioned a lot of topics which may work wrongly. From my point of view, the main topic is this:

The odt file uses two page styles (BK-1Seite, BK_Folgeseite), which are defined as followers (next style).

Both exports (doc and docx) could not handle this. In both cases the first page style is renamed to "Standard", and it is used for all pages. That is the reason for the 7 cm header space on page two.

The second page style is renamed to "Konvert1" - it is still present, but unused.

Maybe that helps with the analysis :)

Best regards
Thomas
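
For anyone who wants to check the page-style chain described in comment 7, here is a minimal sketch in Python. It assumes the usual ODF packaging, where `styles.xml` inside the .odt zip holds `style:master-page` elements with `style:name` and an optional `style:next-style-name`. The function names and the sample style names (`FirstPage`, `FollowingPage`) are hypothetical stand-ins, not taken from the attached file.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF "style" namespace, as used in styles.xml of an .odt package.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def page_style_chain(styles_xml):
    """Map each page style (master page) name to its next-style name, or None."""
    root = ET.fromstring(styles_xml)
    chain = {}
    for master in root.iter(f"{{{STYLE_NS}}}master-page"):
        name = master.get(f"{{{STYLE_NS}}}name")
        chain[name] = master.get(f"{{{STYLE_NS}}}next-style-name")
    return chain


def read_styles_xml(odt_path):
    """Return the raw styles.xml bytes from an .odt file (hypothetical helper)."""
    with zipfile.ZipFile(odt_path) as package:
        return package.read("styles.xml")


# Tiny hand-written sample; the style names are illustrative only.
SAMPLE = """<office:document-styles
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0">
  <office:master-styles>
    <style:master-page style:name="FirstPage"
                       style:next-style-name="FollowingPage"/>
    <style:master-page style:name="FollowingPage"/>
  </office:master-styles>
</office:document-styles>"""

if __name__ == "__main__":
    for name, next_name in page_style_chain(SAMPLE).items():
        print(name, "->", next_name)
```

Running `page_style_chain(read_styles_xml(path))` on the original .odt and on a re-saved copy would be one way to see whether the follower relationship survives a round trip. Note that ODF may store style names in an encoded form that differs from the display name.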
Comment 8 Cor Nouws 2014-09-30 07:25:34 UTC
(In reply to comment #7)

> Both exports (doc and docx) could not handle this. In both cases the first
> page will be renamed to "Standard", and it is used for all pages. That the
> reason for the 7 cm header-space on page two.

Am I wrong to think that we've seen about a hundred reports for this in OpenOffice and LibreOffice over the past years, or is it really new?
Comment 9 Owen Genat (retired) 2014-09-30 13:20:32 UTC
(In reply to comment #8)
> (In reply to comment #7)
> 
> > Both exports (doc and docx) could not handle this. In both cases the first
> > page will be renamed to "Standard", and it is used for all pages. That the
> > reason for the 7 cm header-space on page two.
> 
> Am I wrong to think that we've seen about a hundred reports for this in
> OpenOffice and LibreOffice over the past years, or is it really new?

You are not wrong, Cor. The MS Binary / OOXML specifications do not support page styles (or list styles), thus the export filter does what it can. This aspect of the report is not a bug. The remaining text box / frame issues are likely also duplicates of other reports.
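
As a rough illustration of the point above: in WordprocessingML, header and footer settings hang off section properties (`w:sectPr`) rather than named page styles, with `w:titlePg` marking a different first page. The sketch below, in Python, only inspects an exported file. It says nothing about how the LibreOffice filter works internally. The function name and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, as used in word/document.xml of a .docx.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def describe_sections(document_xml):
    """List, per w:sectPr, whether a title page is set and which header types exist."""
    root = ET.fromstring(document_xml)
    sections = []
    for sect in root.iter(f"{{{W_NS}}}sectPr"):
        title = sect.find(f"{{{W_NS}}}titlePg")
        # A present w:titlePg counts as "on" unless w:val explicitly disables it.
        title_on = title is not None and title.get(
            f"{{{W_NS}}}val", "true") not in ("0", "false", "off")
        header_types = sorted(
            ref.get(f"{{{W_NS}}}type", "")
            for ref in sect.findall(f"{{{W_NS}}}headerReference")
        )
        sections.append({"title_page": title_on, "header_types": header_types})
    return sections


# Hand-written sample with one section and a separate first-page header.
SAMPLE_DOC = """<w:document
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <w:body>
    <w:p/>
    <w:sectPr>
      <w:headerReference w:type="first" r:id="rId1"/>
      <w:headerReference w:type="default" r:id="rId2"/>
      <w:titlePg/>
    </w:sectPr>
  </w:body>
</w:document>"""

if __name__ == "__main__":
    print(describe_sections(SAMPLE_DOC))
```

Applied to `word/document.xml` from the exported docx, this would show how many sections the export produced and whether any of them carries a separate first-page header.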
Comment 10 Jacques Guilleron 2014-09-30 13:44:55 UTC
Hi all,

I have no valid explanation at this time, but a page break gives a better result.

regards,

Jacques
Comment 11 Robinson Tryon (qubit) 2014-12-29 22:10:50 UTC
(In reply to Owen Genat from comment #9)
> You are not wrong Cor. The MS Binary / OOXML specifications do not support
> page styles (or list styles), 

Blocks -> 87761

> The remaining text box / frame issues are
> likely also duplicates of other reports.

Should we match up the remaining issues with known bug reports, or just go ahead and resolve this bug as a dupe?
Comment 12 Cor Nouws 2014-12-30 09:09:01 UTC
(In reply to Robinson Tryon (qubit) from comment #11)

> Should we match up the remaining issues with known bug reports, or just go
> ahead and resolve this bug as a dupe?

Yes, resolve as dupe would be best.
Comment 13 Robinson Tryon (qubit) 2015-01-11 01:28:02 UTC
(In reply to Cor Nouws from comment #12)
> (In reply to Robinson Tryon (qubit) from comment #11)
> 
> > Should we match up the remaining issues with known bug reports, or just go
> > ahead and resolve this bug as a dupe?
> 
> Yes, resolve as dupe would be best.

Sure -- what's the bug #?
Comment 14 Cor Nouws 2015-01-11 10:00:35 UTC
(In reply to Robinson Tryon (qubit) from comment #13)
> 
> Sure -- what's the bug #?

I see bug 48741. 
Maybe that should be changed to component 'filter' and summary starting with [META] ?
Comment 15 Robinson Tryon (qubit) 2015-01-11 10:04:42 UTC
(In reply to Cor Nouws from comment #14)
> (In reply to Robinson Tryon (qubit) from comment #13)
> > 
> > Sure -- what's the bug #?
> 
> I see bug 48741. 

Status -> RESOLVED DUPLICATE of bug 48741

> Maybe that should be changed to component 'filter' and summary starting with
> [META] ?

Sure, I can do that.

*** This bug has been marked as a duplicate of bug 48741 ***