Bug 166503 - FILEOPEN DOC/X: need to honour "don't use HTML paragraph auto spacing" to always add full top margin at page break
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 25.2.3.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:25.8.0 target:25.2.4
Keywords: bibisected, filter:doc, filter:docx, regression
Depends on:
Blocks: DOCX-Paragraph DOC-Paragraph
Reported: 2025-05-08 20:11 UTC by Justin L
Modified: 2025-05-16 16:41 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
ooo19989-1.doc: the example file (96.50 KB, application/msword)
2025-05-08 20:11 UTC, Justin L
tdf130558-1.docx_import-compare-2.png: overlay where RED=LO, grayscale=MSO (78.03 KB, image/png)
2025-05-09 19:54 UTC, Justin L
166503_import-compare.zip: overlay where RED=this patch, BLUE=before patch, GRAYSCALE=Word2010 (7.92 MB, application/zip)
2025-05-15 23:53 UTC, Justin L

Description Justin L 2025-05-08 20:11:06 UTC
Created attachment 200707
ooo19989-1.doc: the example file

Although there are many examples of this regression, ooo19989-1.doc is an excellent example. In this case, on page 2, there should be a large gap between "Sommario" and the header, but currently there is no gap at all.

That's because of my 25.8 commit 9a8eb2dc17435d290f482070dd5f20f538b5b68b
    tdf#165047 sw mso-compat layout: always consolidate top margin
which was backported to 25.2.3.

It is not so much a regression as a failure to import/use the compat flag "Don't use HTML paragraph auto spacing" (Options - Advanced - Layout Options). The top/bottom spacing consolidation only happens when this flag is turned off.
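
As a rough illustration of the rule described above, here is a minimal sketch. It is not LibreOffice code: the struct, the function name, and the exact consolidation formula (only the part of the top spacing that exceeds the previous paragraph's bottom spacing survives) are assumptions made for this example. The one thing taken from the report is that the full top margin must always be applied when the flag is on.

```cpp
#include <algorithm>

// Hypothetical model for illustration only. All values are in points.
struct ParaSpacing
{
    double prevLower; // bottom spacing of the last paragraph before the page break
    double upper;     // top spacing of the first paragraph on the new page
};

// bParaSpaceMax mirrors "Don't use HTML paragraph auto spacing" being enabled.
double topGapAfterPageBreak(const ParaSpacing& rSpacing, bool bParaSpaceMax)
{
    if (bParaSpaceMax)
        return rSpacing.upper; // flag on: always the full top margin

    // Flag off: ASSUMED consolidation model, where the previous paragraph's
    // bottom spacing absorbs part (or all) of the top spacing.
    return std::max(0.0, rSpacing.upper - rSpacing.prevLower);
}
```

Under these assumptions, a paragraph with 52pt top spacing that follows one with 52pt bottom spacing gets no gap when the flag is ignored, and the full 52pt when it is honoured.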

Steps to reproduce
1.) Open ooo19989-1.doc.
On page 2 there should be a 52pt / 1.83 cm gap after the header.
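
For anyone checking the expected gap against a ruler or a layout dump, the unit conversion can be sketched as follows. The helper names are made up for this example; the constants (72 points and 2.54 cm per inch, 20 twips per point) are the standard definitions.

```cpp
// Unit conversions only; helper names are hypothetical.
constexpr double kCmPerInch = 2.54;
constexpr double kPointsPerInch = 72.0;
constexpr int kTwipsPerPoint = 20;

// Convert typographic points to centimetres.
constexpr double pointsToCm(double fPoints)
{
    return fPoints * kCmPerInch / kPointsPerInch;
}

// Convert whole points to twips (1/20 of a point).
constexpr int pointsToTwips(int nPoints)
{
    return nPoints * kTwipsPerPoint;
}
```

pointsToCm(52.0) gives roughly 1.83, matching the figure above, and pointsToTwips(52) gives 1040.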

Also seen with fdo73738-1.doc, forum-en-16602.doc, forum-en-728.doc, forum-mso-de-11906.doc, forum-mso-de-9779.doc, ooo9959-2.doc, abi5872-1.doc, fdo81736-3.doc, forum-en-5864.doc, forum-fr-10016.doc, forum-mso-de-7827.doc, gnome668491-1.doc, fdo45751-1.doc, forum-en-14254.doc, forum-en-5888.doc, forum-fr-30645.doc, forum-mso-de-9007.doc, etc.

Found by Collabora's mso-test
Comment 1 Justin L 2025-05-09 19:54:09 UTC
Created attachment 200717
tdf130558-1.docx_import-compare-2.png: overlay where RED=LO, grayscale=MSO

A DOCX example is ePrivacy text pro CRP 2019_11_22 st14068.en19 (2).docx (attachment 157771 from bug 130558). Almost every page exhibits this shift.

DOCX already imports this flag, but doesn't set any internal compat flag.
    m_pImpl->GetSettingsTable()->GetDoNotUseHTMLParagraphAutoSpacing()
Comment 2 Justin L 2025-05-12 19:33:04 UTC
(In reply to Justin L from comment #1)
> DOCX already imports this flag, but doesn't set any internal compat flag.
Actually, it does set DocumentSettingId::PARA_SPACE_MAX via
xSettings->setPropertyValue(u"AddParaTableSpacing"_ustr, uno::Any(m_pSettingsTable->GetDoNotUseHTMLParagraphAutoSpacing()));

and for DOC in sw/source/filter/ww8/ww8par.cxx:
    m_rDoc.getIDocumentSettingAccess().set(DocumentSettingId::PARA_SPACE_MAX, m_xWDop->fDontUseHTMLAutoSpacing);
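
So both importers already store the flag in the same document setting, and what was missing is the layout side consulting it. A self-contained sketch of that data flow, using stand-in types rather than the real sw classes (MockSettings, SettingId, and both helper functions are hypothetical names for this illustration):

```cpp
#include <map>

// Stand-in types for illustration only; not the real sw/ classes.
enum class SettingId { ParaSpaceMax };

class MockSettings
{
    std::map<SettingId, bool> m_aValues;

public:
    void set(SettingId eId, bool bValue) { m_aValues[eId] = bValue; }

    // Unset settings read as false.
    bool get(SettingId eId) const
    {
        auto it = m_aValues.find(eId);
        return it != m_aValues.end() && it->second;
    }
};

// Hypothetical helper: the DOC and DOCX paths both funnel Word's
// "don't use HTML paragraph auto spacing" flag into one setting.
void importDontUseHTMLAutoSpacing(MockSettings& rSettings, bool bFlagFromFile)
{
    rSettings.set(SettingId::ParaSpaceMax, bFlagFromFile);
}

// Layout side: consolidate top spacing only when the setting is off.
bool shouldConsolidateTopMargin(const MockSettings& rSettings)
{
    return !rSettings.get(SettingId::ParaSpaceMax);
}
```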
Comment 3 Commit Notification 2025-05-15 21:27:28 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/e43929ce56843ffc91341d5897883e752e5b49ec

tdf#166503 sw mso-compat layout: don't consolidate if PAGE_SPACE_MAX

It will be available in 25.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 4 Justin L 2025-05-15 23:53:04 UTC
Created attachment 200840
166503_import-compare.zip: overlay where RED=this patch, BLUE=before patch, GRAYSCALE=Word2010

While creating the patch, the DOCX overlays showed improvement relative to my "authoritative" Word 2019 PDFs. However, many of the DOC PDFs suggested that it got worse.

But there were lots of clues that Word 2019's PDFs had likely just captured a state where the layout was still in motion. So I interactively used Word 2010/XP (which likely better matches the font versions available to me on Linux anyway) to re-create the "authoritative" PDFs. Since it was interactive, the layout had finalized before these PDFs were created, and then I could see the improvement I expected. (I could see huge differences between the two "authoritative" PDFs.)

Because the results take up so many megabytes, I redid them with a very low 40 dpi resolution to attach here. The overall trend is that the current state (RED) is now overlaid by the grayscale, while the previous state (BLUE) is offset higher on the page. So this is good confirmation that my patch is correct.

My main purpose in documenting this is to show that I did do fairly exhaustive testing. Plus, it helps to demonstrate the limitation of doing PDF comparisons.
Comment 5 Commit Notification 2025-05-16 08:40:38 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-25-2":

https://git.libreoffice.org/core/commit/8be81fdaa7b7313d083457c64c8acddee2489991

tdf#166503 sw mso-compat layout: don't consolidate if PAGE_SPACE_MAX

It will be available in 25.2.4.
