Bug 166544 - FILEOPEN DOCX: no top spacing after Field-with-page-break
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: compatibilityMode14 target:25.8.0 tar...
Keywords: filter:doc, filter:docx
Depends on: 166552
Blocks: DOCX-Paragraph DOC-Paragraph
Reported: 2025-05-12 14:38 UTC by Justin L
Modified: 2025-05-16 16:47 UTC (History)

Attachments
TopMarginAfterTOC.docx (18.15 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2025-05-12 14:38 UTC, Justin L
LegacyField_noTopMargin.docx: same issue seen when creating text fieldmark (12.79 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2025-05-15 19:21 UTC, Justin L
166544_topMargin_simple.doc: a super simple example for DOC format (21.50 KB, application/msword)
2025-05-15 21:42 UTC, Justin L

Description Justin L 2025-05-12 14:38:23 UTC
Created attachment 200754
TopMarginAfterTOC.docx

Normally, after an ordinary page break, the paragraph's top spacing is applied, and LO does that. However, Microsoft Word seems to make an exception when the page break comes after a Table of Contents field: in that particular case, no top margin is applied.

Already true in OOo 3.3, so Inherited.

Steps to reproduce:
1.) open TopMarginAfterTOC.docx

On page 2 is the ToC. Notice the big gap between the header and the heading 1 paragraph "Table of Contents" with 48pt of "above spacing". That's all good.

On page 3 is where the problem is seen. In LO we have the same large gap before the heading 1 paragraph "Chapter 1 48pt", while in MS Office there is no gap at all.

This oddity was found by looking at overlays from these documents:
abi5309-1.doc
moz1003146-1.doc
novell657806-1.doc
ooo119706-1.doc
ooo23187-1.doc
ooo53044-1.doc

Watch out, though, for a section page break, which is handled differently: see forum-mso-de-102317.docx and forum-mso-en-13234.docx.

Found by Collabora's mso-test
Comment 1 Justin L 2025-05-15 15:42:37 UTC
(In reply to Justin L from comment #0)
> Normally, after a normal page break, the top paragraph spacing is applied.
This statement always needs to be qualified... This bug report is primarily focusing on DOC format (although a DOCX sample has been provided). IIRC, if the page break comes before any text (runs), then compat14 normally applies the top margin after the page break (as I stated above). TopMarginAfterTOC.docx is compat14, so we would have expected a top margin - but following the TOC seems to be an exception...
Comment 2 Justin L 2025-05-15 19:14:31 UTC
In the docx file, I can see that the first w:r run is the end of the TOC fieldmark. So technically the page break is not at the beginning of the paragraph content. The TOC is in the same paragraph as the page break, and precedes the break.
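
To illustrate the structure described above, here is a minimal Python sketch that lists what precedes the first page break inside a paragraph. The paragraph XML is hand-written for illustration and is not extracted from the attachment, and the helper name `content_before_page_break` is hypothetical. It only looks at `w:t`, `w:fldChar` and `w:br` inside `w:r` runs.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical paragraph written by hand for illustration only:
# field end, then a page break, then the heading text.
PARAGRAPH_XML = f"""
<w:p xmlns:w="{W}">
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
  <w:r><w:br w:type="page"/></w:r>
  <w:r><w:t>Chapter 1 48pt</w:t></w:r>
</w:p>
"""


def content_before_page_break(paragraph_xml):
    """Return the kinds of run content seen before the first page break.

    Returns None if the paragraph contains no page break.
    """
    para = ET.fromstring(paragraph_xml.strip())
    seen = []
    for run in para.iter(f"{{{W}}}r"):
        for child in run:
            if child.tag == f"{{{W}}}br" and child.get(f"{{{W}}}type") == "page":
                return seen
            if child.tag == f"{{{W}}}fldChar":
                seen.append("field-" + child.get(f"{{{W}}}fldCharType", "?"))
            elif child.tag == f"{{{W}}}t":
                seen.append("text")
    return None


print(content_before_page_break(PARAGRAPH_XML))
```

For this sample the result contains a field end and no text, which matches the observation that the break is not the first thing in the paragraph even though no visible text precedes it.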

Thankfully, our TOC is not built on top of bookmarks - so our TOC can't end half-way through the next paragraph. But can our legacy form fields?
Comment 3 Justin L 2025-05-15 19:21:50 UTC
Created attachment 200835
LegacyField_noTopMargin.docx: same issue seen when creating text fieldmark

I was able to make another fieldmark example using legacy text form fields.

[Note: I used Word 2010 to create this example and had to check "Split apart page break and paragraph marks" in the advanced Layout Options. This compat setting seems to affect ONLY interactive UI use, not the layout itself, since the layout didn't change when I turned it off again. It is off in this example document.]
Comment 4 Justin L 2025-05-15 21:42:10 UTC
Created attachment 200838
166544_topMargin_simple.doc: a super simple example for DOC format

I have a fix for DOCX at https://gerrit.libreoffice.org/c/core/+/185375.

For DOC format the problem is a little worse because we don't have the building blocks in place for this, as shown by 166544_topMargin_simple.doc. (Oh, I created a bug report for that already - bug 166552).

In this example, we have a clear character run ("x"), followed by a w:br page break, followed by the remainder of the paragraph. Obviously the top margin has been applied before the "x" on page 1, so it should not be applied again after the page break - since this is the same paragraph (in MS Word anyway).
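
The rule discussed in this report can be modelled roughly as follows. This is only a sketch of the behaviour as described in the comments here (compat14 documents, ordinary page breaks), not the writerfilter implementation; the names `Item` and `top_margin_after_break` are hypothetical, and section breaks are deliberately left out.

```python
from enum import Enum


class Item(Enum):
    """Simplified kinds of paragraph content (hypothetical model)."""
    TEXT = "text"
    FIELD_END = "field-end"
    PAGE_BREAK = "page-break"


def top_margin_after_break(items, field_end_counts_as_run=True):
    """Decide whether top spacing is applied after the page break.

    The spacing is applied only when nothing that counts as a character
    run precedes the break within the same paragraph.
    """
    for item in items:
        if item is Item.PAGE_BREAK:
            return True  # break came first: spacing applies after it
        if item is Item.TEXT:
            return False  # spacing was already used before the text
        if item is Item.FIELD_END and field_end_counts_as_run:
            return False  # field end treated like a character run
    raise ValueError("paragraph contains no page break")


# "x", page break, rest of paragraph (the simple DOC example).
print(top_margin_after_break([Item.TEXT, Item.PAGE_BREAK, Item.TEXT]))
# TOC field end, page break, heading text (the DOCX example).
print(top_margin_after_break([Item.FIELD_END, Item.PAGE_BREAK, Item.TEXT]))
```

Passing `field_end_counts_as_run=False` reproduces the earlier behaviour in this model, where the gap appears after a field end followed by a page break.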
Comment 5 Commit Notification 2025-05-15 23:58:43 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5c7b3f5dc1f14081eed380999dc029a500784d55

tdf#166544 writerfilter page break: field-end counts as a character run

It will be available in 25.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 6 Commit Notification 2025-05-16 07:09:23 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-25-2":

https://git.libreoffice.org/core/commit/a993c7849f0cc43c05ea8a505e38b44badc7539c

tdf#166544 writerfilter page break: field-end counts as a character run

It will be available in 25.2.4.
Comment 7 Justin L 2025-05-16 16:46:21 UTC
Let us just ignore the DOC portion of this report. It can be handled when the underlying requirements are taken care of.