Bug 98000 - FILESAVE DOCX Problem with size of headers
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.1.0.4 release
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:7.1.0
Keywords: filter:docx, notBibisectable
Depends on:
Blocks: DOCX-Header-Footer
Reported: 2016-02-19 08:24 UTC by Kenny Moens <Cipal IT Solutions nv>
Modified: 2021-05-27 20:47 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
Conversion Examples (1.10 MB, application/x-zip-compressed)
2016-02-19 08:24 UTC, Kenny Moens <Cipal IT Solutions nv>

Description Kenny Moens <Cipal IT Solutions nv> 2016-02-19 08:24:25 UTC
Created attachment 122791
Conversion Examples

We have a number of test documents that we want to convert from ODT to DOCX format. The converted files show a number of issues (see the attachment for details):

Opmaakprofielen.odt > Opmaakprofielen.docx
* Page 1: Borders around the paragraph are not properly restored
* Page 2: Image is not restored in DOCX
* Page 2: The "adres" table, which is part of the header, is reproduced at the correct location, but the header is too small (compared to the ODT in LibreOffice), which causes the text "Eerste pagina profiel" to appear too high.

Ontvangsbewijs.odt > Ontvangstbewijs.docx
* Header: the text "Dienst ruimtelijke ordening" appears above the logo in the converted document (DOCX), while it is actually below the logo in the original document. This again appears to be caused by the sizing of the header.
Comment 1 Timur 2016-02-19 09:12:07 UTC
It's highly unlikely that any bug of the type "multiple problems/bad rendering" will be fixed, so this bug shouldn't be confirmed.
Each issue (section break, paragraph break, text box, picture...) should be reported separately, after a search for already reported bugs.
Only issues without an existing report should be filed, each as its own bug, even if they occur in the same file.
Comment 2 Kenny Moens <Cipal IT Solutions nv> 2016-02-19 09:20:55 UTC
Changed this report to cover only the header height issue. It appears to me to be caused by the height of the header.

So the problem statement of this issue is narrowed down to:

Opmaakprofielen.odt > Opmaakprofielen.docx
* Page 2: The "adres" table, which is part of the header, is reproduced at the correct location, but the header is too small (compared to the ODT in LibreOffice), which causes the text "Eerste pagina profiel" to appear too high.

Ontvangsbewijs.odt > Ontvangstbewijs.docx
* Header: the text "Dienst ruimtelijke ordening" appears above the logo in the converted document (DOCX), while it is actually below the logo in the original document. This seems to be caused by the height of the header, which appears to be converted incorrectly.

For the other items, I'll do a further search and create new issues (if needed).
Comment 3 Buovjaga 2016-02-26 18:16:38 UTC
Reproduced by saving the ODT files as DOCX.
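
For anyone who wants to repeat the reproduction without the UI, the conversion can probably be scripted. The sketch below assumes that a `soffice` binary is on the PATH and uses its `--headless`, `--convert-to` and `--outdir` options. The helper names are made up for illustration, and the thread does not say how the files were actually saved.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(odt_path, out_dir, soffice="soffice"):
    """Return the argument list for a headless ODT -> DOCX conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to", "docx",
        "--outdir", str(out_dir),
        str(odt_path),
    ]


def convert_to_docx(odt_path, out_dir, soffice="soffice"):
    """Run the conversion and return the path where the DOCX is expected."""
    # Assumption: the binary is called "soffice" and is reachable via PATH.
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} not found on PATH")
    subprocess.run(build_convert_command(odt_path, out_dir, soffice), check=True)
    return Path(out_dir) / (Path(odt_path).stem + ".docx")
```

Calling `convert_to_docx("Opmaakprofielen.odt", "out")` should leave `out/Opmaakprofielen.docx`, which can then be opened to compare the header layout with the original.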

Win 7 Pro 64-bit Version: 5.2.0.0.alpha0+
Build ID: ef02de2698d90fd874bddf3146165cbe85487bc5
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86@39, Branch:master, Time: 2016-02-19_23:40:50
Locale: fi-FI (fi_FI)
Comment 4 Justin L 2016-08-23 07:05:50 UTC
bibisected-43all indicates a regression in LO4.2
# good: [e33eaf662f84503c8de782d6677d9eb1b0b0d96b] source-hash-6c3d74e8b779b1eb2d9779ed84f1518e078113c4
# first bad commit: [f2554751603ad8537257b3cf52d6171056c76eeb] source-hash-f42768fe0b60ecbbe9c68d775329bf28c0690131

I don't have a good guess as to which of those commits caused the problem.
Comment 5 Justin L 2016-08-31 07:12:16 UTC
(In reply to Justin L from comment #4)
> I don't have a good guess as to which of those commits caused the problem.

The Linux 43all bibisect appears to be misleading, even though I double-checked the results. I made a clean compile of last41onmaster and it exhibited the same problem: the text appears above the logo.

last41onmaster: f160e4935c474a5293b3d3c11b3d538efb4767a0 CommitDate: Mon May 20 14:30:50 2013 +0300

I failed to compile last40onmaster (precise 12.04). Marking as non-bibisectable. Perhaps someone can run a MacOS bibisect to see if they get different results.
Comment 6 Justin L 2016-10-21 13:12:05 UTC
bibisected with the new 42max set.  Unfortunately it ran into a string of 16 commits where office wouldn't start.

The last good commit was 40ff64b93fc4c4c2e2710853e9f71e35811b9362
    Author:     Eike Rathke <erack@redhat.com>
    CommitDate: Fri Aug 30 13:54:48 2013 +0200
    and another one for fdo#68740

so the bad commit is somewhere between that one and the next time that LO could start again, which was commit 47924a46566352dd99a14163d98bd2b51cca6b0e
    Author:     Julien Nabet <serval2412@yahoo.fr>
    CommitDate: Fri Aug 30 20:58:07 2013 +0200
    Revert "-Werror=unused-but-set-variable bCategoriesApplied"

My guess would be that it was caused by
commit f8307e5ae11e8235fa1fb88ed52625bf9c650dc2
    Author:     Michael Stahl <mstahl@redhat.com>
    Date:       2013-08-30 17:13:40 (GMT)
    fdo#41068: writerfilter: fix image wrap polygon import

but things have really moved around a lot since then:
commit  003434f1e2f4bd7ec08d2428fe2b90c11e680cef
Author: Jan Holesovsky <kendy@collabora.com>
Date:   Tue Jul 15 07:25:38 2014 +0200
fdo#76803: Kill resourcemodel::Fraction, and use Fraction from tools instead.
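
The reasoning above (last good commit, then a run of untestable builds, then the first build that starts again and shows the bug) can be written down as a small helper. This is only a linear scan over recorded results, not the bibisect tooling itself. The commit labels in the usage note are placeholders, and the status names `good`, `bad` and `skip` are assumptions chosen for illustration.

```python
def first_bad_candidates(history):
    """Return the commits that could have introduced the bug.

    history: list of (commit_id, status) in chronological order, where
    status is "good", "bad", or "skip" (the build could not be tested).
    Assumes the bug, once introduced, stays present in later commits.
    """
    last_good = -1
    for i, (_, status) in enumerate(history):
        if status == "good":
            last_good = i
        elif status == "bad":
            # Everything after the last good commit, up to and including
            # the first testable bad commit, remains a candidate.
            return [commit for commit, _ in history[last_good + 1 : i + 1]]
    return []
```

With a history such as `[("good_1", "good"), ("untestable_1", "skip"), ("untestable_2", "skip"), ("bad_1", "bad")]` the helper returns the two untestable commits plus `bad_1`, which mirrors why the range here could not be narrowed to a single commit.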
Comment 7 QA Administrators 2018-06-26 02:46:01 UTC Comment hidden (obsolete, spam)
Comment 8 Justin L 2019-08-02 08:08:54 UTC
I tried to take the original .ODT files and change the page-breaks into "page-break-with-style" and the program failed to do that properly. So I'm not going to bother looking into this further since the UI doesn't even work with this document - perhaps because of all of the shapes.

I expect OP's problem to be related to next-style pointing to a different style which has a very different header size, and the page-break just being a simple one instead of being explicit about what page style should be applied.
Comment 9 Justin L 2020-07-21 13:26:50 UTC
Yeah, so this one will require export logic something like:
if regular page break
   and page-style-before has a follow style
then
   change page break into page break with style

.DOC format already does this. So very likely there are common situations where NOT doing this is desirable. Thus, I like having doc and docx do things differently sometimes - that way at least one compatible format should fit the bill. [But let me look a bit further into DOCX to see if maybe the logic can be tweaked.]
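
The pseudocode in this comment can be made concrete as a small, self-contained sketch. The `PageStyle` and `PageBreak` types and the function name are hypothetical and do not correspond to LibreOffice classes. The extra check that the follow style differs from the current style is an assumption added here, on the basis that a page style commonly names itself as its own follow style. This illustrates the idea only, not the fix that was eventually committed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageStyle:
    name: str
    follow_style: Optional[str] = None  # name of the "next style", if any


@dataclass
class PageBreak:
    explicit_style: Optional[str] = None  # None means a regular page break


def resolve_break_for_export(brk, style_before):
    """Turn a regular page break into a page break with style when the
    page style in effect before the break has a different follow style."""
    follow = style_before.follow_style
    # Assumption: a follow style equal to the current style changes nothing.
    if brk.explicit_style is None and follow not in (None, style_before.name):
        return PageBreak(explicit_style=follow)
    return brk
```

For a hypothetical style `PageStyle("First Page", follow_style="Default")`, a regular break comes back as a break with style "Default", while a break that already names a style is returned unchanged.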
Comment 10 Justin L 2020-07-21 15:50:42 UTC
So, these two synerzip commits look really, umm, interesting - like a "fix my one document so that it works" kind of fix.

4.3 commit a31fbb53dba76736b37213b98b64937f05929a67
Author: Pallavi Jadhav on Thu Feb 6 13:58:03 2014 +0530
    tdf#74566:DOCX: Preservation <w:br> tag for Break to Next Page
    
            Issue :
            'Break to Next Page' gets converted to 'Page Break Before'
             in RT.
    
            XML difference :
            - LO exports <w:br> as <w:pageBreakBefore /> in document.xml
            - The page break is written into wrong paragraph.
    
            Implementation :
            1] Removed implementation to export <w:pageBreakBefore />.
            2] Added a check to write <w:br> in correct paragraph.
            3] Modified code to handle SectionBreak() even if Text node
               has no string.
               It is required when DOCX contains a PageBreak with footer.
            4] Written Export Unit Test case.

and 4.3 commit e9b2787c2ece4c8260fbac6359257e1829c917d4
Author: umeshkadam on Fri May 2 13:25:15 2014 +0530
    tdf#77890: page break exported as section break if different 1st page is set
    
    - Page break was getting exported as section break in case if the different
      first page was set.
    - Fixed this issue and added a UT.
    - For additional details regarding the issue please check the following
       https://www.libreoffice.org/bugzilla/show_bug.cgi?id=77890#c2
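
To make the three break forms discussed in these commit messages easier to tell apart, the sketch below builds each one as a WordprocessingML paragraph using Python's standard `xml.etree.ElementTree`. The function names are made up for illustration, and the `nextPage` section type is just one plausible value. The output is a bare paragraph fragment, not a complete `document.xml`, and it is not taken from LibreOffice's exporter.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def paragraph_with_inline_page_break():
    # <w:p><w:r><w:br w:type="page"/></w:r></w:p>
    p = ET.Element(w("p"))
    r = ET.SubElement(p, w("r"))
    ET.SubElement(r, w("br"), {w("type"): "page"})
    return p


def paragraph_with_page_break_before():
    # <w:p><w:pPr><w:pageBreakBefore/></w:pPr></w:p>
    p = ET.Element(w("p"))
    ppr = ET.SubElement(p, w("pPr"))
    ET.SubElement(ppr, w("pageBreakBefore"))
    return p


def paragraph_ending_section():
    # <w:p><w:pPr><w:sectPr><w:type w:val="nextPage"/></w:sectPr></w:pPr></w:p>
    p = ET.Element(w("p"))
    ppr = ET.SubElement(p, w("pPr"))
    sect = ET.SubElement(ppr, w("sectPr"))
    ET.SubElement(sect, w("type"), {w("val"): "nextPage"})
    return p
```

Passing any of the returned elements to `ET.tostring(element, encoding="unicode")` prints the fragment, which can be compared against the `document.xml` inside an exported DOCX to see which form the export actually chose.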
Comment 11 Justin L 2020-07-21 18:35:41 UTC
proposed fix at http://gerrit.libreoffice.org/c/core/+/99171
Comment 12 Commit Notification 2020-07-22 07:38:15 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/140e8861566afcd1c51ede4bafd9ac2c6192cd68

tdf#98000 docx export: blank paragraphs don't affect page breaks

It will be available in 7.1.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 Commit Notification 2020-07-24 14:05:56 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/4b2f0758c99e7b50293d965699f2125ff79bee17

tdf#98000 docx export: cleanup obsolete clause

It will be available in 7.1.0.

Comment 14 Commit Notification 2020-07-24 14:08:07 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/f19dd5a6d3b65556f6dfb8ee0e52333e422733f0

NFC tdf#98000 docx export: more general code cleanup

It will be available in 7.1.0.

Comment 15 Timur 2020-07-24 15:53:03 UTC
Real detective work done here by Justin. 
I put mentioned bugs to See Also.