Description: Sanitize documents using "Find & Replace" getting slower and slower Steps to Reproduce: 1. Open the attached file 2. Disable the automatic spell checking if enabled In the Search for field put "." (a single period) In the Replace with field put "x" Make sure Whole words only is unchecked Click Other Options to expand the dialog, and make sure Regular expressions is checked Click Replace All Actual Results: 5 seconds with LibO 4.4.7.2 12 seconds with LibO 5.2.5.0 20 seconds with LibO 6.2.0.0 (and dialog acts as if LibO won't react) Difference between 5.2.5. and 6.2.0.0 might be Harfbuzz common layout Expected Results: Something like 4.4.7.2 would be ideal. At least some improvement Reproducible: Always User Profile Reset: No Additional Info: Found in Version: 6.2.0.0.alpha0+ Build ID: c0fdcece6b886912618deee9656cb2d169a9b999 CPU threads: 4; OS: Windows 6.3; UI render: default; TinderBox: Win-x86@42, Branch:master, Time: 2018-08-12_00:35:45 Locale: en-US (nl_NL); Calc: CL and in Version: 5.2.5.0.0+ Build ID: 78223678b7513ffe46804cb08f2dc5bc899b2bab CPU Threads: 4; OS Version: Windows 6.29; UI Render: default; Locale: nl-NL (nl_NL); Calc: CL but not in Versie: 4.4.7.2 Build ID: f3153a8b245191196a4b6b9abd1d0da16eead600 Locale: nl_NL
Created attachment 144181 [details] Example file
Created attachment 144182 [details] Screenshot 'hang'
Created attachment 144200 [details] Bibisect log Bisected to following range https://cgit.freedesktop.org/libreoffice/core/log/?id=97b643ac862d0d6d0a305a7fd82fd20dfb3dbf72&qt=range&q=3828a18f5f4134daea616121b6df5f06b463e3bb Maybe https://cgit.freedesktop.org/libreoffice/core/commit/?id=995ecbdc903223aca7c4198c6acecbba0500a955
Repro slowness 6.2 vs. 3.6. Adding Cc: to Miklos Vajna so he can comment on the blamed commit
At some point I tried to improve search so it finds strings not only in Writer text, but also in Writer shape texts. I guess it's probable that the slowness comes from that. Seems the current situation is the worst of both worlds: replace all still doesn't replace strings from shape text, but already slows down replace all. Getting back the old performance or being more correct (include shape text when you do replace-all) would be good.
I had a look at this: just reverting the above mentioned commit doesn't change anything. I get why it doesn't help: the commit changes how documents containing shapes behave and the bugdoc doesn't have shapes. I also profiled this: SvxSearchDialog::CommandHdl_Impl() in svx is the UI code, you can measure the cost of ExecuteSynchron(): - 97% is spent in SwFindParaText::DoFind() (OK) - 55% of that is sw::ReplaceImpl - 40% of that is sw::FindTextImpl - here 20% is the SvxSearchItem ctor, which is already moved out of the loop for perf reasons - 18% is DoSearch So in short, nothing obviously stupid. According to the above numbers, it indeed smells like a regression, but I doubt it's from the above commit.
@Buovjaga A flame graph would nice. And maybe retry of the bibisect on Linux; the range isn't to helpful, I guess Looks bit like a Noel type of perf bug.. or I'm a bit to optimistic :-)
For me, it is 5,62 seconds with the last commit in Linux 50max and 10,54 in the first commit. 44max: last: 10,25 first: 10,27 I don't know what I should bibisect :( I don't have newer repos than 50max on Linux. Miklos already profiled this in comment 6, so no need for me to do anything.
Created attachment 151247 [details] Flame Graph Win Version: 6.3.0.0.alpha0+ Build ID: 128145e227ef91fb2f23893e73d38ae72cf074e5 CPU threads: 2; OS: Windows 6.3; UI render: default; VCL: win; Locale: nl-NL (nl_NL); UI-Language: en-US Calc: threaded
Another suggestion to make find & replace even worse: Enabled change tracking and turn show changes on before running find & replace (bug 121618 comment 6)
Forgot to change from regression to implementation error when I commented that the problem did not start at the bisected commit after all.
*** Bug 136436 has been marked as a duplicate of this bug. ***
@Noel In the case you're doing some perf magic again; a suggestion in advance
I have fixed one part of this https://gerrit.libreoffice.org/c/core/+/102231 There are two more (which I don't intend to tackle) (1) We repeatedly revalidate the paragraph signature. Since we're making changes to individual words/letters, this is effectively O(n^2). A smarter thing to do would be to only do this validation after the replace operation. (2) We create a ton of small UNDO objects. Again, a smarter thing to do would be to create a single UNDO operation. The first one is tricky because it uses the UNO API to inspect the document, while the rest of the code is using the internal raw API. The second one is tricky because there are loops at multiple levels of the call stack, and it's not clear how the responsibilities are shared across the different calls.
I would call a 50% reduction sounds already quite an improvement
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/1159f58831c69680e9f10767d5358e13b66579dd tdf#119286 speed up find/replace It will be available in 7.1.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
in my case, times after the patch is 23 seconds, before the patch, it is 42 seconds. Nicee!! @Noel, thanks for fixing this issue. I'll create a follow-up enhancement with the info from comment 14
(In reply to Xisco Faulí from comment #17) > in my case, times after the patch is 23 seconds, before the patch, it is 42 > seconds. Nicee!! > > @Noel, thanks for fixing this issue. I'll create a follow-up enhancement > with the info from comment 14 I already created them and added to See also earlier.
Noel Grandin committed a patch related to this issue. It has been pushed to "libreoffice-7-0": https://git.libreoffice.org/core/commit/ee05abb5a41ed3fc898963dda06675ddbef8dbf5 tdf#119286 speed up find/replace It will be available in 7.0.2. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.