Bug 148309 - Mail-merging a complex document with many data records significantly slower after fix for bug 144565
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 7.4.0.0 alpha0+ (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Michael Stahl (allotropia)
URL:
Whiteboard: target:7.4.0 target:7.3.5
Keywords: bibisected, bisected, perf, regression
Depends on:
Blocks: Mail-Merge redlinehide-regressions
Reported: 2022-04-01 15:26 UTC by Michael Weghorn
Modified: 2022-08-11 10:26 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
sample doc for mail merge (42.62 KB, application/vnd.oasis.opendocument.text)
2022-04-01 15:26 UTC, Michael Weghorn
sample database containing 1000 dummy entries (65.59 KB, application/vnd.oasis.opendocument.spreadsheet)
2022-04-01 15:27 UTC, Michael Weghorn
Flamegraph (430.74 KB, application/x-bzip)
2022-04-02 09:10 UTC, Julien Nabet
Generated merge file with 1000 letters (521.18 KB, application/vnd.oasis.opendocument.text)
2022-08-11 10:26 UTC, Gabor Kelemen (allotropia)

Description Michael Weghorn 2022-04-01 15:26:45 UTC
Created attachment 179256 [details]
sample doc for mail merge

With the fix for tdf#144565 in place, performing a mail merge of the attached document with many data records has become significantly slower.

# Steps to reproduce

1) open attached sample document "sample_mail_merge.odt"
2) run mail merge wizard ("Tools" -> "Mail merge wizard")
3) use the attached ODS file "1000ds.ods" that contains 1000 records as database ("Exchange Database", then select the file)
4) finish mail merge wizard
5) select "Save Merged Documents" in the mail merge toolbar
6) leave default selection ("Save as a single large document") and press "Save Documents"
7) wait for mail merge to finish

# Actual behavior

With the fix for tdf#144565 in place, this takes a long time.

Using commit dfaa8725a4762de874fb144f8a370b9f42f3920f (source b8f68233b8dc5a009396141fba6e47867e70f342) from the 7.4 bibisect repository in my Windows VM, the first 200-300 records went pretty fast, but then it slowed down significantly. After 20 minutes, it was at 775 out of 1000, advancing by 1 every few seconds. (I didn't wait until the end.)

# Expected behavior

Mail merge should be reasonably fast, as used to be the case earlier.
With commit 77d0c49a8b9ee59493696438d51cff11e107c3b2 (source 42448f48bb48a13d6618a181b12840db6d85c574) of the bibisect repository, the dialog asking where to save the file appeared after ~1.5 min.

# Additional information

Initially observed with mail merge from the WollMux extension [1] in a 6.4-based LO version that contains a backport of the fix for tdf#144565. The user reported that merging the original document took ~10 min without the fix for tdf#144565 in place, and 90-100 min with it.

[1] https://github.com/WollMux/WollMux/
Comment 1 Michael Weghorn 2022-04-01 15:27:12 UTC
Created attachment 179257 [details]
sample database containing 1000 dummy entries
Comment 2 Michael Weghorn 2022-04-01 15:29:18 UTC
Win bibisect repo for 7.4 shows that this started with


commit 42448f48bb48a13d6618a181b12840db6d85c574
Author: Michael Stahl
Date:   Thu Dec 16 13:36:46 2021 +0100

    tdf#144565 sw_redlinehide: fix mailmerge when flys anchored at last node
    
    The InsertPageBreak() calls SplitNode() which is not ideal as the flys
    anchored at the last node of the document may end up anchored to the
    newly inserted node and this one will be removed again a bit further on:
      GetNodes().Delete( aDelIdx, iDelNodes );
    
    ... which is what crashes, when the SwNodeIndex of the anchor is moved
    hard to a different node, which causes inconsistencies such as:
    
    sw/source/core/text/txtfrm.cxx:1263: TextFrameIndex SwTextFrame::MapModelToView(const SwTextNode*, sal_Int32) const: Assertion `static_cast<SwTextNode*>(const_cast<sw::BroadcastingModify*>(SwFrame::GetDep())) == pNode' failed.
    
    Instead, always use AppendTextNode() and then set the break item
    directly, which even simplifies the code.
    
    (reportedly a regression from 166b5010b402a41b192b1659093a25acf9065fd9
     although i wasn't able to find an earlier version that didn't crash
     in some way)
    
    Change-Id: I4cac74fc86fc505f62b14cf0d7a7f9689c7402ba
    Reviewed-on: https://gerrit.libreoffice.org/c/core/+/126921
    Tested-by: Jenkins
    Reviewed-by: Michael Stahl



Adding CC: to Michael Stahl
Comment 3 Julien Nabet 2022-04-02 09:10:02 UTC
Created attachment 179271 [details]
Flamegraph

On a Debian x86-64 PC with master sources updated today (built without enable-dbgutil, using gen rendering), I captured a Flamegraph.
I hope it helps.

It is very fast at the beginning, starts to slow down at around record 100, slows down further at 200, and becomes quite slow from 300 onward.
Comment 4 Commit Notification 2022-06-03 13:09:53 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ff525d0d70ea9d189a430bde944b56d048b03e55

tdf#148309 sw_redlinehide: fix mail merge performance regression

It will be available in 7.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 5 Commit Notification 2022-06-05 15:14:02 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-3":

https://git.libreoffice.org/core/commit/57cd4735a7312174e63d2a1a3dd3831443169530

tdf#148309 sw_redlinehide: fix mail merge performance regression

It will be available in 7.3.5.

Comment 6 Gabor Kelemen (allotropia) 2022-08-11 09:41:27 UTC
Verified in

Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 4e2ce2a460458f17ee4360c45a2da2fc4b4d753e
CPU threads: 14; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded

Generating a single merged document takes about 3 minutes now on a PC, with the following intervals:
0-200: 18s
200-400: 22s
400-600: 27s
600-800: 37s
800-1000: 47s

So it still gets gradually slower as it progresses, but it's a lot better than the reported values.
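
For reference, the interval timings above can be turned into a per-record cost to make the remaining slowdown easier to see. This is only a minimal sketch over the five measurements listed in this comment; the variable names are made up for illustration and nothing here comes from the LibreOffice code base.

```python
# Interval timings as reported above: seconds spent per block of 200 records.
intervals = {
    (0, 200): 18,
    (200, 400): 22,
    (400, 600): 27,
    (600, 800): 37,
    (800, 1000): 47,
}

# Total merge time covered by the listed intervals.
total_seconds = sum(intervals.values())

# Average cost of a single record within each interval.
per_record = {
    f"{start}-{end}": seconds / (end - start)
    for (start, end), seconds in intervals.items()
}

# How much slower the last interval is compared to the first one.
slowdown = intervals[(800, 1000)] / intervals[(0, 200)]

print(f"total: {total_seconds} s")
for label, cost in per_record.items():
    print(f"{label}: {cost:.3f} s/record")
print(f"last vs. first interval: {slowdown:.2f}x")
```

The listed intervals add up to 151 s, and the last 200 records cost roughly 2.6 times as much as the first 200, so the per-record cost still grows with the document size, just far less steeply than before the fix.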

However, saving the generated file afterwards takes about 5:30 (it has a 16 MB uncompressed content.xml), and opening the generated file takes another 4 minutes, with constant 100% CPU use while just displaying it.
Comment 7 Gabor Kelemen (allotropia) 2022-08-11 10:26:26 UTC
Created attachment 181726 [details]
Generated merge file with 1000 letters

Opening this file is actually getting faster in recent versions; just leaving some measurements (version: min:sec) here for the record:

4.0: 5:15
5.0: 9:45
6.0: 8:25
7.0: 5:20
7.5: 4:00