Bug 136952

Summary: DOCX: missing w:br page break before section break containing pageBreakBefore=false
Product: LibreOffice
Reporter: Justin L <jluth>
Component: Writer
Assignee: Justin L <jluth>
Status: RESOLVED FIXED
Severity: normal
CC: jluth, telesto
Priority: medium
Keywords: bibisected, bisected, filter:docx, regression
Version: 4.4.0.3 release   
Hardware: All   
OS: All   
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=132149
Whiteboard: target:7.1.0
Crash report or crash signature:
Regression By:
Attachments: tdf132149_pgBreakBRT.docx: even when correctly exported, the import misses the page break

Description Justin L 2020-09-22 12:09:51 UTC
When exporting this document, the page-break-after is lost. Even if it is added back in, it is lost on import. So both import and export have problems.


This is a follow-up to bug 132149, so look there for export code pointers.
Comment 1 Justin L 2020-09-22 12:42:21 UTC
Created attachment 165771
tdf132149_pgBreakBRT.docx: even when correctly exported, the import misses the page break

It is probably best to start with this import document. MS Word 2003 and 2016 both insert a page break between the yellow and white paragraphs.

The import worked until LO 4.4, with the regression falling in the range
https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=4fd65ac3292a219162a19d8cf1d06842a4c4d498..66da64c74829a68a8dc55c9380ecd6c84d0fc331

with the likely candidate being https://cgit.freedesktop.org/libreoffice/core/commit/?id=382bab9412b87f82da82276332496eb28b28d4f3
DOCX import: fix <w:pageBreakBefore> wrt. inherited styles
We used to ignore this element with a "false" logical attribute, but
that causes a problem when an inherited style wants to explicitly
disable this element from a parent style.
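
To make the quoted commit message concrete, here is a minimal sketch of why an explicit "false" has to be kept when styles inherit. It is only an illustration, not the writerfilter code: the Style class and both resolver functions are hypothetical names, and the property is modelled as True, False, or None (not specified).

```python
from typing import Optional


class Style:
    """Hypothetical paragraph style with a tri-state pageBreakBefore."""

    def __init__(self, name: str,
                 page_break_before: Optional[bool] = None,
                 parent: Optional["Style"] = None) -> None:
        self.name = name
        self.page_break_before = page_break_before  # None = not specified
        self.parent = parent


def resolve_page_break_before(style: Optional[Style]) -> bool:
    """Return the first explicitly set value up the parent chain."""
    while style is not None:
        if style.page_break_before is not None:
            return style.page_break_before
        style = style.parent
    return False  # nothing set anywhere: no page break


def resolve_ignoring_false(style: Optional[Style]) -> bool:
    """Behaviour when an explicit False is treated as 'not specified'."""
    while style is not None:
        if style.page_break_before:
            return True
        style = style.parent
    return False


# The parent style asks for a page break, the child explicitly disables it.
heading = Style("Heading", page_break_before=True)
child = Style("HeadingNoBreak", page_break_before=False, parent=heading)

print(resolve_page_break_before(child))  # the child's explicit False wins
print(resolve_ignoring_false(child))     # the parent's True leaks through
```

The second resolver shows the problem the commit message describes: if "false" is dropped, the child style cannot switch off what the parent turned on.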
Comment 2 Justin L 2020-09-22 12:46:30 UTC
This is focusing on docx import and export.
DOC is fine in terms of that specific page break for both export and import.
Comment 3 Justin L 2020-09-23 05:35:38 UTC
There seems to be a slight difference between MS Word versions in how they import a PageBreakBefore=true that is preceded by a w:br. In Word 2003, you get two page breaks. In Word 2010 and 2016, you only get one. (So this is probably just a Word 2003 + compatibility pack problem.)
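
A toy model of that difference, just to pin down what is being counted. Everything here is an assumption for illustration: paragraphs are plain dicts, "br_page_at_end" stands for a w:br page break as the last thing before the next paragraph, and the collapse flag stands in for the newer Word behaviour. It is not how Word or LO actually lays out pages.

```python
def count_page_breaks(paragraphs, collapse):
    """Count page breaks in a simplified paragraph list.

    collapse=False: every w:br and every pageBreakBefore counts (Word 2003-like).
    collapse=True: a pageBreakBefore directly after a w:br page break is
    absorbed into it (Word 2010/2016-like).
    """
    breaks = 0
    prev_ended_with_br = False
    for para in paragraphs:
        if para.get("page_break_before"):
            if not (collapse and prev_ended_with_br):
                breaks += 1
        if para.get("br_page_at_end"):
            breaks += 1
        prev_ended_with_br = bool(para.get("br_page_at_end"))
    return breaks


# First paragraph ends with a w:br page break,
# second paragraph has PageBreakBefore=true.
doc = [{"br_page_at_end": True}, {"page_break_before": True}]

print(count_page_breaks(doc, collapse=False))  # two breaks
print(count_page_breaks(doc, collapse=True))   # one break
```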

There are two examples in the unit tests. ooxmlexport9's tdf89377_tableWithBreakBeforeParaStyle.docx is a good example. When it is round-tripped in LO, it shows as 5 pages in Word 2003, but only 3 pages in Word 2016.

The other example is ooxmlexport13's internal_hyperlink_table.odt.


Proposed fix for the import regression is at https://gerrit.libreoffice.org/c/core/+/103223
Comment 4 Justin L 2020-09-23 19:38:52 UTC
Proposed patch for export is at https://gerrit.libreoffice.org/c/core/+/103275
Comment 5 Commit Notification 2020-11-04 07:17:51 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0b8c7a94c6c4b1aa10c7ae0a614d0f1f6cba1002

tdf#136952 writerfilter: PageBreakBefore - don't always overwrite

It will be available in 7.1.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 6 Commit Notification 2020-11-06 14:08:21 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/f0a495a56489b781177be8ff28c4660214c9bdf2

tdf#136952 ww8export: always check for breakAfter on last split

It will be available in 7.1.0.
