Sanitizing documents using "Find & Replace" is getting slower and slower
Steps to Reproduce:
1. Open the attached file
2. Disable the automatic spell checking if enabled
3. In the Search for field put "." (a single period)
4. In the Replace with field put "x"
5. Make sure Whole words only is unchecked
6. Click Other Options to expand the dialog, and make sure Regular expressions is checked
7. Click Replace All
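For reference, the replacement the steps above perform can be sketched outside LibreOffice. This is a minimal Python illustration, not LibreOffice code: the function name `sanitize` and the sample string are made up, and it assumes the dialog's regular expression "." behaves like Python's default ".", i.e. it matches any single character except a line break.

```python
import re


def sanitize(text: str) -> str:
    """Replace every character except newlines with "x".

    Hypothetical helper: mirrors Search for "." / Replace with "x"
    with Regular expressions enabled.
    """
    # Without re.DOTALL, "." does not match "\n", so line breaks survive.
    return re.sub(r".", "x", text)


if __name__ == "__main__":
    sample = "Secret text, line 1.\nLine 2!"
    print(sanitize(sample))
```

The amount of work grows with the number of characters in the document, since every character is a separate match.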
5 seconds with LibO 22.214.171.124
12 seconds with LibO 126.96.36.199
20 seconds with LibO 188.8.131.52 (and the dialog behaves as if LibO is not responding)
The difference between 5.2.5 and 184.108.40.206 might be the HarfBuzz common layout
Something like 220.127.116.11 would be ideal, but at least some improvement would be welcome
User Profile Reset: No
Build ID: c0fdcece6b886912618deee9656cb2d169a9b999
CPU threads: 4; OS: Windows 6.3; UI render: default;
TinderBox: Win-x86@42, Branch:master, Time: 2018-08-12_00:35:45
Locale: en-US (nl_NL); Calc: CL
Build ID: 78223678b7513ffe46804cb08f2dc5bc899b2bab
CPU Threads: 4; OS Version: Windows 6.29; UI Render: default;
Locale: nl-NL (nl_NL); Calc: CL
but not in
Build ID: f3153a8b245191196a4b6b9abd1d0da16eead600
Created attachment 144181 [details]
Created attachment 144182 [details]
Created attachment 144200 [details]
Bisected to following range
Repro slowness 6.2 vs. 3.6.
Adding Cc: to Miklos Vajna
so he can comment on the blamed commit
At some point I tried to improve search so it finds strings not only in Writer text, but also in Writer shape texts. I guess it's probable that the slowness comes from that.
Seems the current situation is the worst of both worlds: replace all still doesn't replace strings from shape text, but already slows down replace all.
Getting back the old performance or being more correct (include shape text when you do replace-all) would be good.
I had a look at this: just reverting the above mentioned commit doesn't change anything. I get why it doesn't help: the commit changes how documents containing shapes behave and the bugdoc doesn't have shapes.
I also profiled this. SvxSearchDialog::CommandHdl_Impl() in svx is the UI code, where you can measure the cost of ExecuteSynchron():
- 97% is spent in SwFindParaText::DoFind() (OK)
- 55% of that is sw::ReplaceImpl
- 40% of that is sw::FindTextImpl
- here 20% is the SvxSearchItem ctor, which is already moved out of the loop for perf reasons
- 18% is DoSearch
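To put the percentages above on one scale, they can be multiplied down the call tree. The sketch below is only one reading of the list and is an assumption, not something the profile confirms: it takes ReplaceImpl and FindTextImpl as shares of DoFind, and the SvxSearchItem ctor and DoSearch as shares of FindTextImpl. The names `PROFILE` and `absolute_share` are made up for the illustration.

```python
# Shares from the list above, each as a fraction of its parent frame.
# ASSUMPTION: this nesting is one interpretation of the list, not a
# confirmed call tree.
PROFILE = {
    "SwFindParaText::DoFind": (0.97, None),
    "sw::ReplaceImpl": (0.55, "SwFindParaText::DoFind"),
    "sw::FindTextImpl": (0.40, "SwFindParaText::DoFind"),
    "SvxSearchItem ctor": (0.20, "sw::FindTextImpl"),
    "DoSearch": (0.18, "sw::FindTextImpl"),
}


def absolute_share(name: str) -> float:
    """Return the share of total ExecuteSynchron() time for one frame."""
    share, parent = PROFILE[name]
    if parent is None:
        return share
    # A child's absolute share is its relative share times its parent's.
    return share * absolute_share(parent)


if __name__ == "__main__":
    for frame in PROFILE:
        print(f"{frame}: {absolute_share(frame):.1%} of total")
```

Under that reading, replacing and finding together account for most of the time, and no single callee stands out.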
So in short, nothing obviously stupid. According to the above numbers, it indeed smells like a regression, but I doubt it's from the above commit.
A flame graph would be nice. And maybe retry the bibisect on Linux; the range isn't too helpful, I guess
Looks a bit like a Noel type of perf bug... or I'm a bit too optimistic :-)
For me, it takes 5.62 seconds with the last commit in the Linux 50max repo and 10.54 seconds with the first commit.
I don't know what I should bibisect :( I don't have newer repos than 50max on Linux.
Miklos already profiled this in comment 6, so no need for me to do anything.
Created attachment 151247 [details]
Flame Graph Win
Build ID: 128145e227ef91fb2f23893e73d38ae72cf074e5
CPU threads: 2; OS: Windows 6.3; UI render: default; VCL: win;
Locale: nl-NL (nl_NL); UI-Language: en-US
Another suggestion to make find & replace even worse:
Enable change tracking and turn on show changes before running find & replace (bug 121618 comment 6)
Forgot to change from regression to implementation error when I commented that the problem did not start at the bisected commit after all.