Bug 79761 - autocorrect residual slow-down
Summary: autocorrect residual slow-down
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic (show other bugs)
Version:
(earliest affected)
4.4.0.0.alpha0+ Master
Hardware: All All
: medium normal
Assignee: d.sikeler94
URL: https://www.dropbox.com/s/v24vifucf1x...
Whiteboard: target:4.4.0
Keywords: regression
Depends on:
Blocks:
 
Reported: 2014-06-07 05:35 UTC by tommy27
Modified: 2017-07-20 04:28 UTC (History)
7 users (show)

See Also:
Crash report or crash signature:


Attachments
large autocorrect lists (1.93 MB, application/zip)
2014-06-07 05:35 UTC, tommy27
Details
SimpleProgramDebugger results 4.3beta2 vs 4.4alpha (20.76 KB, application/zip)
2014-06-09 22:34 UTC, tommy27
Details
Very Sleepy profiling results (139.06 KB, application/zip)
2014-06-15 13:42 UTC, tommy27
Details
callgrind log file (1.58 KB, application/gzip)
2014-06-23 06:14 UTC, Marina Latini (SUSE)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description tommy27 2014-06-07 05:35:05 UTC
Created attachment 100586 [details]
large autocorrect lists

.................
BACKGROUND
.................

Large autocorrect lists are known to cause transient LibO freeze while you type the first word of a document. Performance has been improved during years (see Bug 46805 and Bug 55570) and was stable till 4.2.4.2.

now I've noticed a consisted performance regression in 4.4.0.0.alpha0+
Build ID: dacab84536f61f79a5422e1d8d394019ec62f3ec
TinderBox: Win-x86@39, Branch:master, Time: 2014-06-04_04:27:18

...................
STEPS TO REPRODUCE
...................

- unzip attached file and extract the acor_und.dat and acor_it-IT.dat files (very large lists containing 191034 and 83027 entries) under the autocorr subfolder of the user profile (see: https://wiki.documentfoundation.org/UserProfile#Default_location)

- open a Writer new document

- set language as Italian (or rename the acor_it-IT.dat to match you default language localization i.e acor_en-GB.dat or acor_de-DE.dat)

- type test and hit space

...................
CURRENT BEHAVIOUR
...................

LibO freezes for a few seconds before you can actually see the cursor move 1 space forward and you are able to digit again.

LibO 4.2.4.2 --> 4.3 seconds

LibO 4.4.0.0alpha0+ --> 10.3 seconds

tests were done on a 5 years old laptop Win7x64 sp1 - Intel Core 2 Duo - 2.27 GHz - 4GB RAM - HDD

so time will probably be shorter with more powerful equipment, nevertheless this is a sensible performance regression in respect to 4.2.x
Comment 1 Michael Meeks 2014-06-07 05:59:26 UTC
To make any progress on this bug, we need a callgrind trace of this happening. Please get a build with symbols and then run:

unset MALLOC_CHECK_
unset MALLOC_PERTURB_
unset G_SLICE
export SAL_DISABLE_FLOATGRAB=1
export OOO_DISABLE_RECOVERY=1
valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./soffice.bin --splash-pipe=0 >& /tmp/callgrind-log.txt

gzip callgrind-log.txt

do your typing, then exit attaching callgrind-log.txt.gz to this bug, or up-load it.

Without that, its hard to fix. :-)
Comment 2 tommy27 2014-06-07 07:32:38 UTC
for the record I retested against LibO 4.3.0.0.beta2
Build ID: a06aa316117a6ff0f05c697c82831c227812d810

in 4.3beta2 the performance is almost the same as in LibO 4.2.4.2 hence the regression is limited to the 4.4.x branch.

regarding the trace I'm on Windows so I have to grab a 4.4.x master build with symbols. I took a look at this link: http://dev-builds.libreoffice.org/daily/master/Win-x86@39/symbols/

but it looks confusing to me what should I download.
Comment 3 Michael Meeks 2014-06-09 09:27:10 UTC
Hi Tommy, you can't run callgrind on Windows AFAIK, which is a shame of course, but ... ;-) I strongly suspect that the performance is the same of this piece on both platforms, and that profiling on Linux with great tools will show what's up.
Comment 4 tommy27 2014-06-09 22:34:32 UTC
Created attachment 100773 [details]
SimpleProgramDebugger results 4.3beta2 vs 4.4alpha

I tried to use SimpleProgramDebugger ( http://www.nirsoft.net/utils/simple_program_debugger.html ) and attached to the soffice.bin process while reproducing the issue in 4.3beta2 and 4.4alpha

I don't know if those files can be useful to debug or not.
can you spot a difference between the 4.3 and 4.4 results?
Comment 5 tommy27 2014-06-15 13:42:45 UTC
Created attachment 101097 [details]
Very Sleepy profiling results

another try...
used this Windows profiler Very Sleepy 0.82 ( http://www.codersnotes.com/sleepy ) and saved it's results either in .sleepy extension or .csv format

test was done attaching Very Sleepy to soffice.bin and profiling the "4668" Thread, Location "GetMessageW"
Comment 6 Michael Meeks 2014-06-16 14:19:49 UTC
Sadly the 'very sleepy' output is impenetrable compared to callgrind -> kcachegrind visualisation that points directly at the problem =) I could make nothing of the results there I'm afraid.

Grabbing a VirtualBox and running Linux inside it so you can run callgrind would work well.
Comment 7 tommy27 2014-06-17 12:24:24 UTC
sorry that Very Sleepy was useless.
I revert status to UNCONFIRMED since it has not been yet reproduced by other users. hopefully a Linux user could try profiling with callgrind
Comment 8 Marina Latini (SUSE) 2014-06-23 06:14:24 UTC
Created attachment 101555 [details]
callgrind log file

Callgrind log file
Comment 9 Marina Latini (SUSE) 2014-06-23 06:16:33 UTC
Confirmed on:

* Version: 4.4.0.0.alpha0+
* Build ID: 113866583c5a34b9242aeb9040c4ce80e65da1ba
* OS: Ubuntu 14.04 x86_64
-------------

Set platform and OS to All --> Reported on Windows x86 and confirmed on Ubuntu 14.04 x86_64.

Attached a valgrind output with options and variables set like in comment 1.
Comment 10 tommy27 2014-06-23 06:55:04 UTC
@Marina Latina
thank you very much!!!


@Micheal Meeks
I hope that output will help you finding the cause of the issue.

I copy here a comment you wrote about Bug 55570

>Michael Meeks  2012-12-10 20:43:53 UTC 
>Fixed in master; will pick to -4-0 shortly - thanks for the report :-)
>~all the remaining time in the profile is XML parsing - for future work we >should prolly switch to use the fast-parser instead, to accelerate that, and >look for more simple cases that we can optimise there.

is this speed-up idea still valid?

moreover I'm curious to know - if it's possible - why these performance sometimes regresses during master developement...

the original autocorrect slowdown issue inherited from OOo was Bug 46805 that you fixed, then regression Bug 55570 appeared in 3.7/4.0 master era and you fixed that as well, but now it's back in 4.4 master (4.3.0 beta2 was still free of it like 4.2.x branch)
Comment 11 Michael Meeks 2014-06-23 09:21:57 UTC
> moreover I'm curious to know - if it's possible - why these performance
> sometimes regresses during master developement...

At a low level - it's hard to say why this specific regression happened until we look into it. From a high level, judging the performance impact of any given change on the rest of the 8 million lines of code is ~impossible. Speeding one thing up, can make another thing slower quite easily; adding an unrelated feature can cause a significant performance regression etc.

To try to catch this we have automated performance regression tests which generate callgrind data. These don't capture auto-correct performance though.

Marina ... thanks for running the profile; I'm afraid my instructions were wrong ... [ sorry ! ].

The callgrind results will be saved in a callgrind.11541.out file - can you gzip that and attach it ? - hopefully its still there. Since that has the real data to fix this with !.
Comment 12 Marina Latini (SUSE) 2014-06-23 10:01:06 UTC
(In reply to comment #11)
> 
> Marina ... thanks for running the profile; I'm afraid my instructions were
> wrong ... [ sorry ! ].
> 
> The callgrind results will be saved in a callgrind.11541.out file - can you
> gzip that and attach it ? - hopefully its still there. Since that has the
> real data to fix this with !.

It's my first attempt with valgrind and the man pages are very big to read in few minutes. I done a rapid copy&paste from your comment so, Michael, no problem! ;)

I recap my steps:

1) Used the sources https://bitbucket.org/vmiklos/lo-daily (are they ok also for bibisect?)
2) Started LibreOffice for creating the fresh user profile with:
 ./opt/program/soffice.bin -env:UserInstallation=file:///tmp/soffice-bisect-$(git rev-parse HEAD) 
3) Closed LibreOffice
4) Copied the .dat files in the autocorrect folder
5) Setted the variables:
unset MALLOC_CHECK_
unset MALLOC_PERTURB_
unset G_SLICE
export SAL_DISABLE_FLOATGRAB=1
export OOO_DISABLE_RECOVERY=1
6) started valgrind with:
valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./opt/program/soffice.bin -env:UserInstallation=file:///tmp/soffice-bisect-$(git rev-parse HEAD) --splash-pipe=0 >& /tmp/callgrind-log.txt


The out file is too big (18MB) so here you can find the gzip file uploaded on dropbox
https://www.dropbox.com/s/v24vifucf1x1ztq/callgrind.out.11541.gz
Comment 13 Michael Meeks 2014-06-23 11:00:57 UTC
Unfortunately the trace (thanks for transferring it) is done on a LibreOffice without debugging information [ though  I see you're helping out with bisects which is cool ! =], so it's not that useful.

Getting an older version of LibreOffice (or any that exhibit the bug) with debuginfo installed is really a pre-requisite here; that's what I mean by a  'build with symbols'

Thanks !
Comment 14 tommy27 2014-06-23 11:12:16 UTC
is the Linux debug version the one listed as Linux-rpm_deb-x86@45-TDF-dbg/ ?

at this iink: http://dev-builds.libreoffice.org/daily/master/Linux-rpm_deb-x86@45-TDF-dbg/
Comment 15 Michael Meeks 2014-06-23 11:16:18 UTC
Not certain; but sounds plausible =)
Comment 16 tommy27 2014-06-27 18:22:46 UTC
good news.

I've retested using : 4.4.0.0.alpha0+
Build ID: 95272e7e5b7e38753ab07dbd6503b7cfa2974842
TinderBox: Win-x86@42, Branch:master, Time: 2014-06-26_23:01:43

under the same PC and test environment the autocorrect slow-down of 4.4 master is back to the same lenght of the 4.2.x branch, which is 4.3 seconds.

so the performance regression I saw in early june is now gone in recent builds.

I set status as RESOLVED WORKSFORME however any effort (profiling, coding etc. etc.) to improve the current situation (which is annoying in older machines) would be appreciated as well
Comment 17 tommy27 2014-07-12 05:26:43 UTC
bad news... slow-down is back in recent build (*) and is even worse than before...

now the delay is 35 seconds and makes Writer experience unacceptable...
I'm adding this to mab4.4 list

(*) 4.4.0.0.alpha0+
Build ID: d5dd1216804afae35d7fe7dbb1d37b0ca1fcce88
TinderBox: Win-x86@39, Branch:master, Time: 2014-07-11_00:09:47
Comment 18 Michael Meeks 2014-07-12 10:15:43 UTC
So - again if someone has a build with debugging symbols (but -not- debugutil - ie. the linux tinderbox debug build is not useful) - then they can produce a callgrind trace that immediately points at the problem: which is prolly really silly =)
Comment 19 Michael Meeks 2014-07-12 11:20:42 UTC
Matus - I'd love to have some input from you on writing performance tests that are things that are not startup / load / save =)

Is there some set of unit tests we could create and/or a makefile rule that we could hook into to create a series of such tests we can run ?

The large auto-correct database is a perennial problem to keep at-speed =)

Thanks !
Comment 20 Matúš Kukan 2014-07-15 07:27:30 UTC
I can't reproduce this bug (commit e723ee8622447300a1aaeca24df3f35f3dcc9368).
For me it's almost the same as in 4.3, the time and calgrind profile too.
If you are interested, you can find it here:
https://owncloud.documentfoundation.org/public.php?service=files&t=1c09a1a653cad5da24f420d56f94e49a&download

Anyhow, working on a callgrind performance test for this.
Comment 21 tommy27 2014-07-15 07:48:18 UTC
thanks Matus.

your profiling will probably help improve the situation in 4.3.x which is still suboptimal (that delay is quite annyoing on older machines)

regarding the 4.4.x performance I'll later test a newer daily build since I've already experienced transient performance decrease in a build and subsequent return to normality in another (see comment 16)
Comment 22 tommy27 2014-07-15 20:50:42 UTC
retested with 4.4.0.0.alpha0+
Build ID: abc28ffc04067eb24840fbf564c311aaee10f84d
TinderBox: Win-x86@42, Branch:master, Time: 2014-07-15_07:20:05

performance is again back to 4.2.x level
so I mark again this as RESOLVED

however any improvement to current delay thanks to profiling results would be very appreciated
Comment 23 tommy27 2014-08-20 07:57:50 UTC
@Micheal Meeks
have you ever had time to inspect the profiling results that Matus obtained in comment 20. is there anything you can tweak to improve current performance?
Comment 24 Michael Meeks 2014-08-20 09:26:58 UTC
Hi Tommy - unfortunately, we need a good callgrind profile (with symbols) of when this issue really takes forages ;-) the 8bn pcycles that matus shows is quite a lot of time of course.

A chunk of the time there is raw / pure XML parsing - we have 2.5bn cycles inside the expat library - which we could almost entirely loose by using the XFastParser and doing that in its own thread ;-)

David Ostrovsky kindly agreed to take a look at doing something there I think - I forget what the bug# was though =) any news there David ? Of course re-using the XFastParser is always a good idea since that's the code path we optimize for document load too =)

Then again - looking at the data here, it seems to me like a big chunk of the problem is:

SwXMLTextBlocks::SwXMLTextBlocks

Which -seems- to init an SfxObjectShell each time - taking around 2bn cycles.

That looks massively expensive to do inside every start_Element for the XML file ;-)

Why does SwAutoCorrect::GetLongText do that I wonder ?

re-opening, since that really looks quite odd.
Comment 25 Commit Notification 2014-08-22 15:20:49 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8d6697587776136f3121733e1c29d4200720dbd9

fdo#79761 - avoid re-canonicalising each font name on clone repeatedly.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 26 tommy27 2014-08-23 07:10:12 UTC
Hi Micheal,
just retested with 4.4.0.0.alpha0+
Build ID: f4246fab77113147b36706a1f3d93e8724ff826b
TinderBox: Win-x86@42, Branch:master, Time: 2014-08-23_03:27:09

unfortunately the performance of master seems the same as before...
I still see the same 4.3 seconds delay that I see in LibO 4.2.x and 4.3.x

according to the build time your patch should be in the daily I tested, so it seems that your tweak sadly had no effect on it.

as said in previous comment 22 the performance of masters is the same of previous branches and the worsening I saw in comment 0 and comment 17 were only transient episodes that I did not see in following dailies.

so I'm wondering if it's better to edit summary notes with something like:
"autocorrect residual slow-down" rather than the current "autocorrect slow-down got worse"
Comment 27 tommy27 2014-08-25 14:25:50 UTC
(In reply to comment #26)
> ...
> 
> as said in previous comment 22 the performance of masters is the same of
> previous branches and the worsening I saw in comment 0 and comment 17 were
> only transient episodes that I did not see in following dailies.
> 
> so I'm wondering if it's better to edit summary notes with something like:
> "autocorrect residual slow-down" rather than the current "autocorrect
> slow-down got worse"

did that, and removed from mab4.4 list since this is not specific to master branch.
Comment 28 Michael Meeks 2014-10-31 11:44:40 UTC
Daniel did a great job of switching to the XFastParser - so now the XML parsing should come ~free here, cf. bug#80403 which is nearly closed for 4.4

Of course, there is still this odd performance problem I highlight in comment #24 around SwXMLTextBlocks that I suspect would be good to dig into to try to make this more efficient. Any cycles for that Daniel ?
Comment 29 d.sikeler94 2014-10-31 12:41:46 UTC
(In reply to Michael Meeks from comment #28)
> Of course, there is still this odd performance problem I highlight in
> comment #24 around SwXMLTextBlocks that I suspect would be good to dig into
> to try to make this more efficient. Any cycles for that Daniel ?

I'm working on that. Each time we init SwXMLTextBlocks, we parse BlockList.xml. Which is about 170 times with the attached autocorrect lists, although it isn't changed.
Comment 30 Commit Notification 2014-11-05 11:51:20 UTC
Daniel Sikeler committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4404b718bdb547cb9b7b17c73a53574724cdeeb7

fdo#79761: parse BlockList.xml only once

It will be available in 4.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 31 tommy27 2014-11-06 05:07:52 UTC
retested under Win7x64

I compared 4.3.2.2 and 4.4.0.0.alpha1+
Build ID: d59b9b4af36148e4d8df8f3e3492116d378642e2
TinderBox: Win-x86@51-TDF, Branch:MASTER, Time: 2014-11-05_22:58:24

starting autocorrect options
LO 4.3.2.2 --> 5 seconds
LO 4.0.0.0 --> 3 seconds

typing first word and press space

LO 4.3.2.2 --> 4.3 seconds
LO 4.0.0.0 --> 1.5 seconds

I would call this RESOLVED FIXED but I let the honour to do it to Daniel who did this amazing work and considerably improved LibO autocorrect performance.

remember that these times are obtained on an old laptop... if you work with new machines the delay will be even shorter... anyway I think it was good to fix and improve performance for people on older machines.

thumbs up Daniel!!! maybe there's still some room for improvement?
current result are however already. very satisfying
Comment 32 tommy27 2014-11-06 05:26:08 UTC
(In reply to tommy27 from comment #31)
> ....
> 
> starting autocorrect options
> LO 4.3.2.2 --> 5 seconds
> LO 4.0.0.0 --> 3 seconds
> 
> 

for the record, this was an old OOo issue at:
https://issues.apache.org/ooo/show_bug.cgi?id=101726

at that time OOo and LibO 3.5.0 took *15 minutes* to load a 65K entries autocorrect list... time was brought down to just 7 seconds by LibO volunteer developer Deszi Szabolcs in LibO 3.5.1

now after many fixes and code improvement trough the years, including the latest fix by Daniel, LibO can now load even larger lists in less time...

the attachment 100586 [details] contains 2 large list, respecitively:
- acor_it-IT -->  83027 entries --> loaded in 3.0 seconds
- acor_und   --> 191034 entries --> loaded in 5.4 seconds 

thank to all the developers who made these achievement possible.
I'm proud of being part of this free software community.
Comment 33 Mike §chinagl 2014-12-20 23:10:38 UTC
This bug fix comes with LibreOffice 4.4 (release notes https://wiki.documentfoundation.org/ReleaseNotes/4.4) 

Loading large autocorrect lists is at least twice as fast now, because the lists are parsed in a thread by the fastparser and repeated parsing of unchanged files is avoided.