Bug 99501 - FILESAVE: corrupt docx produced for paragraph with page-break-before following table
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.3.0.4 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bisected, filter:docx, regression
Depends on:
Blocks:
 
Reported: 2016-04-26 01:19 UTC by Luke Deller
Modified: 2016-08-23 06:06 UTC

See Also:
Crash report or crash signature:
Regression By:


Attachments
Test case: page-break-before on paragraph after table (4.31 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-04-26 01:19 UTC, Luke Deller
.odt with table and paragraph with page-break-before (8.60 KB, application/vnd.oasis.opendocument.text)
2016-04-29 13:37 UTC, Terrence Enger

Description Luke Deller 2016-04-26 01:19:54 UTC
Created attachment 124634 [details]
Test case: page-break-before on paragraph after table

The attached document has a table, immediately followed by a paragraph which is set to have a page break before it.
(Right click -> "Paragraph" -> "Text Flow" tab -> "Breaks")

When saving this file to docx format, a file is produced which Microsoft Word cannot open.  I think this is because a text run <w:r> containing a page break is inserted as a child of the body element <w:body>, but text run elements are not allowed there.
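
The suspected cause can be checked without Word by looking for <w:r> elements that sit directly under <w:body> in word/document.xml. The following is a minimal sketch in Python. The function names and the two XML fragments are hypothetical and hand-written for illustration; they are not the actual exporter output.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def misplaced_runs(document_xml: bytes) -> int:
    """Count <w:r> elements that are direct children of <w:body>."""
    root = ET.fromstring(document_xml)
    body = root.find(f"{{{W_NS}}}body")
    if body is None:
        return 0
    # findall() with a bare tag name matches direct children only,
    # so runs nested inside <w:p> are not counted.
    return len(body.findall(f"{{{W_NS}}}r"))


def misplaced_runs_in_docx(path_or_file) -> int:
    """Read word/document.xml from a .docx package and check it."""
    with zipfile.ZipFile(path_or_file) as package:
        return misplaced_runs(package.read("word/document.xml"))


# Hand-written fragments for illustration only (assumed shapes, heavily
# simplified): a page-break run placed directly in the body after a table...
BAD = f'''<w:document xmlns:w="{W_NS}"><w:body>
<w:tbl/>
<w:r><w:br w:type="page"/></w:r>
<w:p><w:r><w:t>after table</w:t></w:r></w:p>
</w:body></w:document>'''.encode()

# ...versus the same break carried by a run inside the paragraph.
GOOD = f'''<w:document xmlns:w="{W_NS}"><w:body>
<w:tbl/>
<w:p><w:r><w:br w:type="page"/><w:t>after table</w:t></w:r></w:p>
</w:body></w:document>'''.encode()
```

Here misplaced_runs(BAD) returns 1 and misplaced_runs(GOOD) returns 0. Calling misplaced_runs_in_docx() on the saved .docx would show whether the exported file has the same shape as the BAD fragment.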

Steps to reproduce:
1) in LibreOffice open the attached pagebreakaftertable.odt
2) save as .docx, close
3) try to open the docx file in Microsoft Word - fails with "Unspecified error" in /word/document.xml
4) try to open the docx file in LibreOffice: it actually opens but note that the page break has been lost
Comment 1 Terrence Enger 2016-04-29 13:37:37 UTC
Created attachment 124727 [details]
.odt with table and paragraph with page-break-before

I am setting status NEW and keywords regression, bisected.

As Luke's attachment is a .docx, I created the attached
dbgutil_20160427.odt.  In each version under test, I did...
(1) Open dbgutil_20160427.odt from the command line.
(2) Save as Microsoft Word 2007-2013 XML (.docx)
(3) Close the document.
(4) In Start Center, take menu options File > "Recent Documents" and
    open the just-saved .docx.
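
Steps (1) to (3) above could probably be scripted for repeated runs during a bibisect. The sketch below is not part of the original procedure: it assumes that soffice is on the PATH and that a headless --convert-to docx run exercises the same DOCX export filter as File > Save As. The helper names are hypothetical. Step (4), reopening the result, still has to be done by hand.

```python
import subprocess
from pathlib import Path
from typing import List


def convert_command(source: str, outdir: str, soffice: str = "soffice") -> List[str]:
    """Build the command line for a headless .odt -> .docx conversion."""
    return [soffice, "--headless", "--convert-to", "docx",
            "--outdir", outdir, source]


def convert_to_docx(source: str, outdir: str, soffice: str = "soffice") -> Path:
    """Run the conversion and return the expected output path.

    Assumption: the output file keeps the source's base name with a
    .docx extension inside outdir.
    """
    subprocess.run(convert_command(source, outdir, soffice), check=True)
    return Path(outdir) / (Path(source).stem + ".docx")
```

For example, convert_to_docx("dbgutil_20160427.odt", "out") would run the soffice binary of the version under test if its path is passed as the third argument.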


Working in the 43max bibisect repository, I have reached ...

    7c61549622652f6e098fd66456c2d98efeff27fa is the first bad commit
    commit 7c61549622652f6e098fd66456c2d98efeff27fa
    Author: Matthew Francis <mjay.francis@gmail.com>
    Date:   Thu May 28 19:31:39 2015 +0800

        source-hash-a31fbb53dba76736b37213b98b64937f05929a67
    
        commit a31fbb53dba76736b37213b98b64937f05929a67
        Author:     Pallavi Jadhav <pallavi.jadhav@synerzip.com>
        AuthorDate: Thu Feb 6 13:58:03 2014 +0530
        Commit:     Miklos Vajna <vmiklos@collabora.co.uk>
        CommitDate: Wed Feb 26 10:50:08 2014 +0100
    
            fdo#74566:DOCX: Preservation <w:br> tag for Break to Next Page
    
                    Issue :
                    'Break to Next Page' gets converted to 'Page Break Before'
                     in RT.
    
                    XML diffrenece :
                    - LO exports <w:br> as <w:pageBreakBefore /> in document.xml
                    - The page break is written into wrong paragraph.

                    Implementation :
                    1] Removed implementation to export <w:pageBreakBefore />.
                    2] Added a check to write <w:br> in correct paragraph.
                    3] Modified code to handle SectionBreak() even if Text node
                       has no string.
                       It is required when DOCX contains a PageBreak with footer.
                    4] Written Export Unit Test case.

            Conflicts:
                    sw/qa/extras/ooxmlexport/ooxmlexport.cxx
            Reviewed on:
                    https://gerrit.libreoffice.org/7891

            Change-Id: I237b9c5fdd3083b441f6e81cd8442f458eccf1a0


Bugzilla search shows several possible dups or near misses.  I am
arbitrarily choosing to confirm this report.
Comment 2 Cor Nouws 2016-05-20 15:03:21 UTC
The commit dates from early 2014, so it makes sense to set 4.3 as the first affected version.
Comment 3 Justin L 2016-08-23 06:06:18 UTC
Broken in 5.1.4.
Works for me in 5.3 and 5.1.5.
Almost certainly fixed by bug 99090, since a bibisect of when the bug was fixed falls into that range.