Bug 132149 - MAILMERGE Saving to single DOCX eats last paragraph
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.1.0.3 release
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:7.1.0
Keywords: bibisected, bisected, filter:docx, regression
Depends on:
Blocks:
 
Reported: 2020-04-16 13:08 UTC by NISZ LibreOffice Team
Modified: 2020-11-06 14:13 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Example base document with closing empty paragraph (29.81 KB, application/vnd.oasis.opendocument.text)
2020-04-16 13:08 UTC, NISZ LibreOffice Team
Example base document without closing empty paragraph (29.20 KB, application/vnd.oasis.opendocument.text)
2020-04-16 13:09 UTC, NISZ LibreOffice Team
Very simple data source for the base documents (11.01 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2020-04-16 13:09 UTC, NISZ LibreOffice Team
ODT merged from the base document with closing empty paragraph (29.70 KB, application/vnd.oasis.opendocument.text)
2020-04-16 13:11 UTC, NISZ LibreOffice Team
DOCX merged from the base document with closing empty paragraph (41.02 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-04-16 13:12 UTC, NISZ LibreOffice Team
Screenshot of the two documents merged from the base document with closing empty paragraph (77.11 KB, image/png)
2020-04-16 13:12 UTC, NISZ LibreOffice Team
ODT merged from the base document without closing empty paragraph (29.18 KB, application/vnd.oasis.opendocument.text)
2020-04-16 13:12 UTC, NISZ LibreOffice Team
DOCX merged from the base document without closing empty paragraph (37.75 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-04-16 13:13 UTC, NISZ LibreOffice Team
Screenshot of the two documents merged from the base document without closing empty paragraph (89.66 KB, image/png)
2020-04-16 13:13 UTC, NISZ LibreOffice Team
MM.odt: 4 letters combined into one document. Tweaked endings to prove point. (30.32 KB, application/vnd.oasis.opendocument.text)
2020-08-25 12:28 UTC, Justin L
132149_pgBreak3.odt: a stress test against Tamas' patch (13.97 KB, application/vnd.oasis.opendocument.text)
2020-08-26 11:53 UTC, Justin L
tdf132149_pgBreakB.odt: a second stress test -focusing on Page::BreakAfter (14.02 KB, application/vnd.oasis.opendocument.text)
2020-09-22 11:56 UTC, Justin L

Description NISZ LibreOffice Team 2020-04-16 13:08:48 UTC
Created attachment 159625 [details]
Example base document with closing empty paragraph

When a multi-page mail merge document is saved as a single DOCX, the last empty paragraph of the source document disappears. If there is no such paragraph, the page break that separates the individual documents is misplaced.

Steps to reproduce:
    1. Open the attached mail merge document MM-basedoc.odt and set up the attached data source MM-datasource.xlsx.
    2. Use the Save Merged Documents button on the MM toolbar to save as ODT, then do it again as DOCX.
    3. Open the other attached mail merge document, MM-basedoc-N.odt. It is the same as MM-basedoc.odt except that it lacks the closing empty paragraph.
    4. Use the Save Merged Documents button on the MM toolbar to save as ODT, then do it again as DOCX.

Actual results:
The merged DOCX documents do not have the last empty paragraph of the base document, while the merged ODT documents do.
The DOCX document merged from MM-basedoc-N.odt is badly broken: the page break between individual documents is misplaced, and every page has the first-page header and footer.

Expected results:
The merged DOCX files should look like the merged ODT files.
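
To compare the merged DOCX outputs without opening them, a minimal Python sketch could look like the following. It is not part of the original report: the function name docx_break_stats is hypothetical, and it assumes the main document part sits at the conventional path word/document.xml. It counts paragraphs, explicit page breaks, "page break before" paragraph properties, and section properties, so a dropped trailing paragraph or a swallowed page break should show up as a changed count. Paragraphs nested in tables or text boxes are counted too, so the numbers are only useful for comparing one export against another.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _is_on(elem):
    """An on/off property is enabled unless w:val explicitly disables it."""
    return elem.get(f"{{{W}}}val", "true") not in ("0", "false", "off")


def docx_break_stats(docx):
    """Count paragraphs and page/section break markers in a DOCX.

    `docx` is a file path or a binary file-like object.
    Assumption: the main part is stored at word/document.xml.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    body = root.find(f"{{{W}}}body")

    # <w:br w:type="page"/> is an explicit page break inside a run
    page_breaks = [
        br for br in body.iter(f"{{{W}}}br")
        if br.get(f"{{{W}}}type") == "page"
    ]
    # <w:pageBreakBefore/> is a paragraph property
    break_before = [
        pb for pb in body.iter(f"{{{W}}}pageBreakBefore") if _is_on(pb)
    ]
    return {
        "paragraphs": sum(1 for _ in body.iter(f"{{{W}}}p")),
        "page_breaks": len(page_breaks),
        "page_break_before": len(break_before),
        # includes the final <w:sectPr> that is a direct child of <w:body>
        "section_properties": sum(1 for _ in body.iter(f"{{{W}}}sectPr")),
    }
```

Calling docx_break_stats on the DOCX files produced in steps 2 and 4 (whatever names they were saved under) and comparing the results between 6.0.0.3 and 6.1.0.3 would give a quick, screenshot-free check of the regression.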

LibreOffice details:
Version: 7.0.0.0.alpha0+ (x64)
Build ID: 94a7ceae287a7967e8f013d012673e26637c6bb5
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win; 
Locale: hu-HU (hu_HU); UI-Language: en-US
Calc: CL

Also happens in:
Version: 6.1.0.3
Build ID: efb621ed25068d70781dc026f7e9c5187a4decd1
CPU threads: 4; OS: Windows 6.3; UI render: GL;
Locale: hu-HU (hu_HU); Calc: CL

Does not happen in: 
Version: 6.0.0.3
Build ID: 64a0f66915f38c6217de274f0aa8e15618924765
CPU threads: 4; OS: Windows 6.3; UI render: GL;
Locale: hu-HU (hu_HU); Calc: CL

(In 6.0.0.3 the page breaks are correct; the only problem is that the first-page header also appears on the second pages.)
Comment 1 NISZ LibreOffice Team 2020-04-16 13:09:33 UTC
Created attachment 159626 [details]
Example base document without closing empty paragraph
Comment 2 NISZ LibreOffice Team 2020-04-16 13:09:51 UTC
Created attachment 159627 [details]
Very simple data source for the base documents
Comment 3 NISZ LibreOffice Team 2020-04-16 13:11:51 UTC
Created attachment 159628 [details]
ODT merged from the base document with closing empty paragraph
Comment 4 NISZ LibreOffice Team 2020-04-16 13:12:12 UTC
Created attachment 159629 [details]
DOCX merged from the base document with closing empty paragraph
Comment 5 NISZ LibreOffice Team 2020-04-16 13:12:37 UTC
Created attachment 159630 [details]
Screenshot of the two documents merged from the base document with closing empty paragraph
Comment 6 NISZ LibreOffice Team 2020-04-16 13:12:58 UTC
Created attachment 159631 [details]
ODT merged from the base document without closing empty paragraph
Comment 7 NISZ LibreOffice Team 2020-04-16 13:13:14 UTC
Created attachment 159632 [details]
DOCX merged from the base document without closing empty paragraph
Comment 8 NISZ LibreOffice Team 2020-04-16 13:13:33 UTC
Created attachment 159633 [details]
Screenshot of the two documents merged from the base document without closing empty paragraph
Comment 9 Aron Budea 2020-04-16 20:46:55 UTC
Bug report says bisected, but I'm not seeing any commit references.
Comment 10 NISZ LibreOffice Team 2020-04-30 12:43:45 UTC
Bibisected using bibisect-win32-6.1 to:
URL: https://cgit.freedesktop.org/libreoffice/core/commit/?id=c1d58c46eec5081576979f584151c7e9a4f67fe0

author: Tamas Bunth <tamas.bunth@collabora.co.uk> Fri Dec 01 14:58:17 2017 +0100
committer: Tamás Bunth <btomi96@gmail.com> Fri Dec 08 16:14:59 2017 +0100

tdf#41650 DOCX export: split para on section break
Comment 11 Justin L 2020-08-25 12:28:27 UTC
Created attachment 164672 [details]
MM.odt: 4 letters combined into one document. Tweaked endings to prove point.

This doesn't really have anything to do with mail merges. That was just the way this particular document was generated.  Attached here is a modified output from the mail-merge. On the first two letters I removed the extra paragraph at the end.

There are several very different parts to this bug report. The "loss" of the very last, empty paragraph marker is NOT connected to Tamas' patch. It already happens in 5.2 at least. I'm sure that is connected to bRemove. I wouldn't consider that a bug, and it is basically irrelevant to this bug report.

The bibisect tracked the loss of the page break between the four letters. However, it is worth noting that before Tamas' commit, all DOCX pages also contained the first page header. So Tamas' fix solved the problem of MM-basedoc.odt (which now properly excludes the header on the second page). However, for MM-basedoc-N.odt it looks like it swallowed the page break on the paragraph that spanned the two pages.
Comment 12 Justin L 2020-08-26 10:38:57 UTC
A draft fix for this can be found at http://gerrit.libreoffice.org/c/core/+/101344.

I'd like to understand this area a bit better before proposing it though.
Comment 13 Justin L 2020-08-26 11:53:48 UTC
Created attachment 164711 [details]
132149_pgBreak3.odt: a stress test against Tamas' patch
Comment 14 Commit Notification 2020-08-27 18:55:21 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/e538c63c0d55b581332f4146dab26e26eb611dce

related tdf#132149 ww8 export: unit test to prevent bad fix

It will be available in 7.1.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 Commit Notification 2020-08-31 12:00:21 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/a0da393cac0a44c648238b815970245684173c99

tdf#132149 ww8export: nextNode has nothing to do with pageDesc

It will be available in 7.1.0.

Comment 16 Commit Notification 2020-09-21 16:42:38 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d08bbf4a1b1a62ef1f52665f52ed8880792c64ef

tdf#132149 ww8export: always check for break at end of paragraph

It will be available in 7.1.0.

Comment 17 Justin L 2020-09-22 11:56:41 UTC
Created attachment 165766 [details]
tdf132149_pgBreakB.odt: a second stress test -focusing on Page::BreakAfter
Comment 18 Justin L 2020-09-22 12:56:27 UTC
(In reply to Justin L from comment #17)
> tdf132149_pgBreakB.odt: a second stress test -focusing on Page::BreakAfter

I spun this off into bug 136952.
Comment 19 Commit Notification 2020-09-24 17:19:40 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5a234ba7f02660ab770f2744d0b936e5607ddafe

tdf#132149 ww8export: use left page properties for left style

It will be available in 7.1.0.

Comment 20 Commit Notification 2020-10-28 09:38:32 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/28dddd4f7e255c74c17c0c6b263303f4567b5678

tdf#132149 ww8export: respect ginormous paragraphs

It will be available in 7.1.0.
