This is somewhat related to https://bugs.freedesktop.org/show_bug.cgi?id=36013 There was a problem that LO crashed when working with documents which had > 65535 style switches. The crash was fixed but the performance greatly degraded: attached is the 2MB file which was taking ~2-3 min to open on LO 3.3 and it takes 9 min to open on LO 3.4. Saving this document takes way more than an hour (!) (LO 3.3 was saving it in less than 10min). Just to close this document in LO 3.4 takes ~8 min (it's a Core 2 Duo 2.1GHz and I was using 64bit rpms: libreoffice3.4-base-3.4.0-12.x86_64 ...). On comparison Word 2007 on similar hardware takes about 50sec to open and less than 2 min to save. I must admit this document is not an average one the users work with but: a) it's a good test which should nicely show the slow parts of the code and which when optimized would bring speed benefits to smaller documents as well b) it's a regression so the problem should (hopefully) be easier to understand c) this document can be opened/saved withing reasonable time on other office software (on very similar hardware) so there should be more efficient algorithms available
@ Andriy Rysin: Sample document is missing. If your test kit axceeds Bugzilla’s file size limitation of 1 MB you can use our experimental temporary WIKi upload page <http://wiki.documentfoundation.org/QA/Bugzilla-Attachments> Please add additional information due to <http://wiki.documentfoundation.org/BugReport> (Operating System for tests only Linux?)!
I could not connect to wiki server so here's the test document: http://r2u.org.ua/data/tmp/big_file_test.odt
[Reproducible] with "LibreOffice 3.4.1RC1 - WIN7 Home Premium (64bit) German UI [OOO340m1 (Build:103)]" Opening the document takes several minutes (OOo 3.1.1 does it within 1/2 minute). Closing the document takes endless time, I terminated after 20 Minutes with app. 2% progress bar and "no response" message (OOo 3.1.1 crashed) A 3000 pages document is not normal use, but performance should be improved, and the crashes might be a serious problem. @Cédric Please feel free to reassign if it’s not your area or if provided information is not sufficient.
Created attachment 61367 [details] Add original test document in case it gets deleted upstream
Please find a (uncompressed 26mb) callgrind of importing perhaps the first 5% (before I killed it) http://sspaeth.de/uploads/tmp/callgrind.out.6154.bz2 It shows that 80% of the time is spent in SfxItemSet:Put (43k calls) implying 221 million calls to the != operator and 443 million calls to the iterator. It's interesting to see that 43k put calls lead to this immense amount of comparisons. I did not dare to run this callgrind to its full completion.
I was wandering if I could provide any help in solving this problem. I was working with a odt file recently (5Mb compressed) and LO took >8 hours to save it on Core(TM)2 Duo 2.10GHz.
Hi Matus; this one might be a good test document for profiling writer ? :-)
Probably not really a problem in Writer. I've run callgrind on this document (more than 5 hours) in my build with --enable-symbols - if anybody is interested in the output, let me know. But the point is that we spend ~all of the time in SfxItemPool::Put to be precise in: for (; itr != pItemArr->end(); ++itr) http://opengrok.libreoffice.org/xref/core/svl/source/items/itempool.cxx#774 SfxItemPool::Put is called only ~ 1 million times but probably pItemArr is quite big.
This bug lost its owner - back to NEW.
a quick try at loading and storing the file on Linux x86_64 gives these CPU times: 26:43 /data/inst/LO_3.4.6/opt/libreoffice3.4/program/soffice.bin 17:25 /data/inst/LO_3.5.7/opt/libreoffice3.5/program/soffice.bin 4:51 /data/inst/LO_4.1.4/opt/libreoffice4.1/program/soffice.bin --splash-pipe=5 i know what caused the improvement from 3.4 to 3.5: 7c7d5c0eec4efb95d18b735fb9df4754ba9d8b1f replaced SfxItemPool deque with vector, and should fix the performance regression vs. the 3.3 release (which is unable to load the bugdoc due to 16bit int overflows). deque iterators are surprisingly slow, so for such use-case vector is much better. why it's even faster in 4.1 is a mystery to me.
Created attachment 100598 [details] prototype speedup patch So - this is an interesting problem; there are several horribly intermittent performance problems here. It seems that the SfxItemPool management is extremely performance fragile. In particular for non-poolable items - if we happen to add and remove them in the right order (which sadly we do per element in response to SAX callbacks [!] ) - actually just for one type 'SwFmtCharFmt' - I assume because it's an SwClient sub-class then we can either be 'lucky' or not ;-) The basic problem is that if when we add our temporary element there is a hole at the front of the SfxPoolItemArra_Impl vector, then we will add it there, append the new element to the end of the array, and then remove the first element again [ thus forcing a re-scan of the whole array to find the next hole - which is the expensive bit ]. The attached patch uses an unordered_set for the SfxItemPool - that should have a linear cost for add/removes of non-poolable items - and indeed it significantly helps with performance. With this patch I get a 61 second load instead of a 290 second load; which is a win. Of course, inevitably it breaks a unit test as well so not to master. Perhaps there is a simpler / less invasive way.
Created attachment 100605 [details] a simpler attempt - that doesn't work This is a simpler attempt; unfortunately it doesn't work at all well - I gave up waiting. We try to solve half of the problem by speeding up inserting tracking spare space at beginning and/or end of the vector - but it doesn't work. Also - this can never accelerate 'remove' sufficiently so can only solve half the problem.
wonder if it would work to put in not just 2 free-indexes but a whole stack of them? could also shrink the array when the size of the stack is > half the array size. but really some sort of map is probably more appropriate for the problem. well some things like the RTF export will iterate over all items but probably that should still be fast enough. for the non-poolable items using the address as key ought to be sufficient; for the poolable ones it would require virtual hash function or operator< which would be quite some work to add everywhere.
> wonder if it would work to put in not just 2 free-indexes but a whole > stack of them? could also shrink the array when the size of the stack > is half the array size. Sure - my second patch or something like it would swallow the insertion time but do nothing for the removal time (sadly). My first patch axes that too. > but really some sort of map is probably more appropriate for the problem. Agreed. > well some things like the RTF export will iterate over all items but > probably that should still be fast enough. That should all be linear cost with a hash. > for the non-poolable items using the address as key ought to be sufficient; > for the poolable ones it would require virtual hash function or operator< > which would be quite some work to add everywhere. Right - it'd be nice to have a single container in the deeper future to reduce compares, but it is really not that much of the time in the poolable ones - often there is a tiny to small set of options; not several thousand like in the non-poolable case. Anyhow - I have a better, cleaner, updated patch I'm testing here based on patch #1, and an even better one re-factored into a prettier form [ SfxItemPool badly needs translating and the docs updating - hopefully this will help clarify it a little ] that's on the way.
Created attachment 100770 [details] a prettier layer on top Working onwards and up-wards, a prettier, virtualized pair of containers for either non-poolable or poolable types to suit the perf. characteristics of each. Lovely, etc. apart from the fact that it means updating the horrible binary file format stuff that is (nominally) used for cut/paste in some place or other cf. bug#79851 for dunging that out.
Michael Meeks committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=1ca7ac128dcdf03e3749af7458beac8d2da8a708 fdo#38513 - Accelerate non-poolable item add / remove. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Unfortunately until bug#79815 is fixed we have to waste a bit of space and time here to accomodate it; nevertheless I've pushed a patch to master - and pending seeing what the tinderboxen do before back-porting to 4.3 - that gives a 5x speedup; for me the document loads in 34s instead of 176s after/before. Michael - any chance of some review for 4.3 ? =)
Michael Meeks committed a patch related to this issue. It has been pushed to "libreoffice-4-3": http://cgit.freedesktop.org/libreoffice/core/commit/?id=8aff83b95fa5969edfc48022ddaae05031b178cf&h=libreoffice-4-3 fdo#38513 - Accelerate non-poolable item add / remove. It will be available in LibreOffice 4.3. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Created attachment 101912 [details] backtrace of the crash When trying to open the testfile with LO 4.3 with the commit, I get a crash by segmentation fault. Version: 4.3.0.1.0+ Build ID: 8aff83b95fa5969edfc48022ddaae05031b178cf Clean build on Ubuntu 14.04 x86-64 (make disclean > ./autogen.sh > make) Attached a bt but my build was done without symbols. Hope this helps. Best regards. JBF
I just tried LibreOffice 4.3 on many of my big documents with large number of style switches and it's flying both for saving and for closing the document (opening seesm to be faster as well). I would like to thank you guys for fixing this bug. Much appreciated!!!
Migrating Whiteboard tags to Keywords: (perf) [NinjaEdit]