Created attachment 49025 [details] shorter 50 page docx document When opening a 50 page docx file it takes approximately 2 minutes to open and a 119 page docx file takes approximately 6 minutes to open. Also, the cpu is functioning at 100% for the full duration of the file opening process. Steps to reproduce: Just open the attached 50 page file. After the time lapse the file does open without any problems. I first encountered the bug in LO 3.3.2 Build:202) on Pinguy OS. I have tested this bug in LO 3.3.3; 3.4.0 and 3.4.1 in Windows XP as well as in Ubuntu. Furthermore, I have also tested this bug on 2 other computers using LO 3.3.3 and 3.4.1. Lastly, this bug does not seem to be dependent on the size of the document since a docx document with lots of pictures in it and with a size of 1.5 mb opened up very quickly. It therefore seems that this problem relates to the length of the document.
Created attachment 49026 [details] 119 page docx file This file should take around 6 minutes to open
Created attachment 49027 [details] 1.5 mb docx file which opens quickly
Reproduced on LibreOffice 3.4 340m1(Build:103) for OpenSuse Linux. Took me 2 minutes to open the first attached document. Other two documents are saved as text/plain files. I cannot seem to open them.
Created attachment 49032 [details] long 119 page docx file
Created attachment 49033 [details] Large 1.5mb docx file
(In reply to comment #3) > Reproduced on LibreOffice 3.4 340m1(Build:103) for OpenSuse Linux. > > Took me 2 minutes to open the first attached document. Other two documents are > saved as text/plain files. I cannot seem to open them. Thank you Jeffrey. I re-attached the long and large files. Sorry, first ever bug I reported.
Writer only takes so long to open a docx file, if the file has footnotes in it.
Problem tested and confirmed with LibreOffice 3.4.3 (OOO340m1 (Build:302)) running on MacOS X 10.6.8, German, on a MacBook Pro with 2.53 GHz Intel Core i5, 4 GB RAM and Solid State Hard Drive. I mention the hardware data because the times LibreOffice needs to open your documents are a bit shorter for me, but they are still far too long. In detail: 1) The "large 1.5 mb docx file which opens quickly" (AJ Marais Research Proposal.docx) opens in less than 5 seconds for me -- fine! (But note that when closing the document again, just so, without any changes etc., LibreOffice crashes -- this is Bug 39510, I think.) 2) The "shorter 50 page docx document" (Chapter 3 draft 1.docx) needs about 60 seconds to open, while LibreOffice uses > 95% of the CPU all the time (measured with the top command in the shell). Far too long. 3) The "long 119 page docx file" (Volledige dokument.docx) needs about 3 minutes and 5 seconds to open, while LibreOffice uses > 97% of the CPU all the time (again measured with the top command in the shell). Far too long again.
(In reply to comment #8) > Problem tested and confirmed with LibreOffice 3.4.3 (OOO340m1 (Build:302)) > running on MacOS X 10.6.8, German, on a MacBook Pro with 2.53 GHz Intel Core > i5, 4 GB RAM and Solid State Hard Drive. I mention the hardware data because > the times LibreOffice needs to open your documents are a bit shorter for me, > but they are still far too long. In detail: > > 1) The "large 1.5 mb docx file which opens quickly" (AJ Marais Research > Proposal.docx) opens in less than 5 seconds for me -- fine! (But note that when > closing the document again, just so, without any changes etc., LibreOffice > crashes -- this is Bug 39510, I think.) > > 2) The "shorter 50 page docx document" (Chapter 3 draft 1.docx) needs about 60 > seconds to open, while LibreOffice uses > 95% of the CPU all the time (measured > with the top command in the shell). Far too long. > > 3) The "long 119 page docx file" (Volledige dokument.docx) needs about 3 > minutes and 5 seconds to open, while LibreOffice uses > 97% of the CPU all the > time (again measured with the top command in the shell). Far too long again. Thank you, I have commented at bug 39510 that these two bugs could be related to that one another.
[This is an automated message.] This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it started right out as NEW without ever being explicitly confirmed. The bug is changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases. Details on how to test the 3.5.0 beta1 can be found at: http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1 more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
reproduced on LibO 3.5.0 beta 1 on Fedora 64 bit and on LibO 3.4.3 on Windows XP 32 bit
Acknowledgement: Problem is still present in LibreOffice 3.4.4 and 3.5beta2 running on MacOS X 10.6.8 German.
On pc Debian x86-64 with master sources updated today, I reproduced the problem. There are a lot of logs like these: warn:writerfilter:22852:1:/home/julien/compile-libreoffice/libo/writerfilter/source/dmapper/DomainMapper.cxx:1451: DomainMapper::attribute() - Id: 92505 / 0x16959 value: 299 / 0x12b warn:writerfilter:22852:1:/home/julien/compile-libreoffice/libo/writerfilter/source/dmapper/DomainMapper.cxx:1451: DomainMapper::attribute() - Id: 92505 / 0x16959 value: 300 / 0x12c warn:writerfilter:22852:1:/home/julien/compile-libreoffice/libo/writerfilter/source/dmapper/DomainMapper.cxx:1451: DomainMapper::attribute() - Id: 92505 / 0x16959 value: 300 / 0x12c etc. Miklos: one for you?
Julien, It makes little sense to reproduce a performance problem with a debug-enabled build, which has optimization disabled. :-) Anyway, those warnings are just about some ignored properties, once those are fixed, the speed will be a bit more slow. But sure, I'll try to take a look as time permits. Miklos
(In reply to comment #14) > It makes little sense to reproduce a performance problem with a > debug-enabled build, which has optimization disabled. :-) If the goal is to compare time, I agree with you of course :-) Now it's quite a small doc and after a minute, it wasn't opened (I'm on i5 with 6Gb and had only Iceweasel and Icedove opened in addition with console and LO) My goal was to retrieve maximum info. I'll try to retrieve bts at random tonight after my day time job
Nasty; is it a regression ? [ if so it'd be nice to have the keyword ;-] can we get a callgrind trace with debug symbols installed: export OOO_EXIT_POST_STARTUP=1 export OOO_DISABLE_RECOVERY=1 valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./soffice.bin --splash-pipe=0 /path/to/file and leave it overnight ;-) gzip / up-load the callgrind.12345.txt file and (hopefully) the answer will just drop out of that. Thanks !
Michael: tried to follow your comment but after the 2 exports you quoted I runned this: valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./soffice.bin --splash-pipe=0 ~/compile-libreoffice/bugs/39179/callgrind.txt then got this: 10% 20% 25% 30% 35% 40% 50% 55% 60% 75% 80% 100% end + a popup saying calgrind.txt doesn't exist then: ==28753== Events : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw ==28753== Collected : 4357516741 1343547732 904497078 53990330 10823051 1763992 305982 1830857 823690 ==28753== ==28753== I refs: 4,357,516,741 ==28753== I1 misses: 53,990,330 ==28753== LLi misses: 305,982 ==28753== I1 miss rate: 1.23% ==28753== LLi miss rate: 0.0% and other numbers. I made the test again this time, after having creating an empty "callgrind.txt" file but I got the same. What did I miss? (Valgrind Version: 1:3.7.0-6 on pc Debian testing x86-64 updated today)
Created attachment 69376 [details] some bts at random I attached some bts taken at random
Hi Julien, valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./soffice.bin --splash-pipe=0 ~/compile-libreoffice/bugs/39179/callgrind.txt So - the command-line you're running after ./soffice.bin is just the same as if you're launching the app itself; so you need to pass the path to the test file you want to load to provoke the slowness :-) so /path/to/docx/file is what I should have written. Any chance of another try ? :-)
(In reply to comment #19) > Hi Julien, > > valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes > ./soffice.bin --splash-pipe=0 ~/compile-libreoffice/bugs/39179/callgrind.txt > > So - the command-line you're running after ./soffice.bin is just the same as > if you're launching the app itself; so you need to pass the path to the test > file you want to load to provoke the slowness :-) > > so /path/to/docx/file is what I should have written. > > Any chance of another try ? :-) Sorry for this, I thought it was like memcheck process and it was the output file. Will give a try as soon as building of the day be finished
I put the file available here: http://serval2412.free.fr/callgrind.out.fdo39179.bz2 (I tried with http://wiki.documentfoundation.org/QA/Bugzilla-Attachments but it doesn't seem to accept more than 5MB, so gave up)
Thanks - firing it up in kcachegrind doesn't show any staggeringly obvious single bottleneck. I suspect that the 2+ million calls to: writerfilter::ooxml::OOXMLFactory::createFastChildContextFromFactory and friends are near the core of the issue - we're spending far too much time per attribute by a -biiig- margin as we parse (I think) - and (I guess) there are a ton of attributes here. OTOH - that will require a fair bit of work digging out ways to improve there I suspect, and there is the XSLT monster to kill there too :-)
still slow opening of 50 page test file. almost 2 minute with LibO 4.1.1.2 under Win7 64bit
I updated my local master repo and had the same. I noticed lots of logs like this: warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 394 / 0x18a warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 395 / 0x18b warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 396 / 0x18c warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 397 / 0x18d warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 398 / 0x18e warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 399 / 0x18f warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 400 / 0x190 warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 401 / 0x191 warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 402 / 0x192 warn:writerfilter:18467:1:writerfilter/source/dmapper/DomainMapper.cxx:1479: DomainMapper::attribute() - Id: 92521 / 0x16969 value: 402 / 0x192
That is about unhandled tokens, not related to performance. If we want to improve performance, we should first profile.
Regression sometime before the most recent Ubuntu LTS in 2012 -- the slow documents are still slow as ever, but the 'fast' document is *also* now very slow to open. Sigh. LibreOffice 3.5.7 on Ubuntu 12.04.3 LTS 25sec Large 1.5mb docx [was 5sec for previous commenter but crashed] 50sec shorter 50 page docx [was 60sec for previous commenter] 118sec long 119 page docx [was 185sec for previous commenter] Anyone like to test with ApacheOpenOffice 4.x please? And also on msftOffice, please specify Word version and Windows version? See also this bug, which is a much slower docx (~400 seconds) and has plenty of footnotes to go around. https://bugs.freedesktop.org/show_bug.cgi?id=62656
just tested the shorter 50 page .docx on Win7 64bit. same bad performance (almost 2 minutes to load) in LibO 3.3.3, LibO 4.1.2 and in OOo 3.3.0 and AOO 4.0.0. so it seems the bug is inherited from OOo. changing
> just tested the shorter 50 page .docx on Win7 64bit ... > so it seems the bug is inherited from OOo. changing @Tommy, can you please test the 1.5mb "fast" document? Allegedly it worked fine in 2011, but on my Lio357 from 2012 is is now five times slower than before. Might be a temporarily regression in the 3.5.x series, that is now fine in the 4.x series?
I opened in LibO 4.1.4.2 on XP/SP3: - 50 page file (123kB): in 46.9 sec; bug report: 2 min - 119 page file (312kB): in 147 sec, bug report: 6 min - 17 page file with 1.5 MB: in 4 sec; the bug reports gives no indication on opening time See also the AskLibO Q&A: http://ask.libreoffice.org/en/question/29282/openbenchmarking-results-of-libreoffice/?answer=29290#post-id-29290
Julien your callgrind trace is 404 now (urgh) =) in general running this stuff in callgrind (with symbols) and then looking at the results carefully, adding printf's to the slow pieces and then looking at what is churned on the console is the quickest and best way to identify what silly stuff is going on =) That's really something that users can help with ... annoying that we can't attach those traces to bugzilla though ...
Confirmed:4.2.0.4:OSX CPU spikes and 1.42min to open 50 page docx. Something is very off here.
a quick check on loading the attachment implicated in comment #28 gives: 1167072 112520 pts/15 Sl+ 22:20 0:01 OOo_3.2.1/opt/openoffice.org3/program/soffice.bin 1202460 120096 pts/6 Sl+ 22:20 0:01 OOo_3.3/opt/openoffice.org3/program/soffice.bin 1209924 122832 pts/14 Sl+ 22:20 0:01 OOo_3.4_beta/opt/ooo-dev3/program/soffice.bin 953528 131192 pts/9 Sl+ 22:20 0:01 LO_3.4.6/opt/libreoffice3.4/program/soffice.bin 1075360 121616 pts/10 Sl+ 22:21 0:01 LO_3.5.7/opt/libreoffice3.5/program/soffice.bin 996916 124556 pts/11 Sl+ 22:21 0:01 LO_3.6.7/opt/libreoffice3.6/program/soffice.bin 1594144 143480 pts/12 Sl+ 22:21 0:02 LO_4.0.6/opt/libreoffice4.0/program/soffice.bin 1581200 135336 pts/13 Sl+ 22:22 0:02 LO_4.1.4/opt/libreoffice4.1/program/soffice.bin --splash-pipe=5 a look at the CPU time column shows only 0:01/0:02 values across all releases => removing "regression" if you do find a document for which loading performance has regressed, _please_ open a separate bug, do not add "regression" here!
Comment on attachment 49033 [details] Large 1.5mb docx file since this document opens quickly it is sort of off-topic for this bug
(In reply to Astrix from comment #0) > Created attachment 49025 [details] > shorter 50 page docx document still takes 1 minute and 40 seconds to load this file with LibO 4.3.4.1 and recent 4.5.0.0 alpha tested under Win8.1 x64 AMD A8-6410 APU 2GHz 8GB RAM SSD
Might be worth building a new profile of a build with debug symbols on vs. master - a ton of stuff changed in the XFastParser recently - and that has a very helpful effect wrt. splitting the core implementation slowness in writer from the XML parsing - and making it more visible. Any takers ?
(In reply to tommy27 from comment #34) > (In reply to Astrix from comment #0) > > Created attachment 49025 [details] > > shorter 50 page docx document > > still takes 1 minute and 40 seconds to load this file with LibO 4.3.4.1 and > recent 4.5.0.0 alpha > > tested under Win8.1 x64 > AMD A8-6410 APU 2GHz 8GB RAM SSD retested in the same computer and I see an improvement. both LibO 4.4.3.2 and recent 5.1.0.0 alpha take 1 minute and 21 seconds
Please test again with current version. For me with LO 5.1.4.0.0+ built at home under Ubuntu 16.04 x86-64: short 50 pages docx: 18 seconds long 119 pages docx: 55 seconds Best regards. JBF
I put a new callgrind trace here: http://serval2412.free.fr/tdf_39179_callgrind.out.7206.bz2 I retrieved it with master sources updated yesterday (enable-dbgutil)
Hi Julien. Thanks so much for running that; but I guess: > I retrieved it with master sources updated yesterday (enable-dbgutil) dbgutil adds -so- much more debugging code and changes all the code-paths - such that the profile is no longer useful (normally) - although with some quick kcachegrinding perhaps we'll find that it doesn't distort it too much; I didn't look. Ideally an --enable-symbols build is the one to callgrind. Thanks !
Created attachment 125095 [details] kcachegrind screenshot I use dbg-util to try to retrieve the maximum of information (in bt and/or in console). I'm not accustomed to interpreting kcachegrind views but it seems these might be interesting: 1) http://opengrok.libreoffice.org/xref/core/sax/source/tools/fastattribs.cxx#121 121 sal_Bool FastAttributeList::hasAttribute( ::sal_Int32 Token ) throw (RuntimeException, std::exception) 122 { 123 for (sal_Int32 i : maAttributeTokens) 124 if (i == Token) 125 return true; 126 127 return false; 128 } I wonder if the loop could be replaced by something else. 2) http://opengrok.libreoffice.org/xref/core/sax/source/fastparser/fastparser.cxx#639 FastSaxParserImpl::GetTokenWithPrefix
Interesting: so this: 123 for (sal_Int32 i : maAttributeTokens) 124 if (i == Token) 125 return true; is I'm afraid reasonably optimal - you can use a hash which is supposedly constant time, but cache effects will probably hurt you. In general - I'm surprised to see that at the top of the profile; are there far too many attributes being used in the XML ? is it possible that the XML is parsed again and again in error (?) - or some meta-data ? The call-counts can be helpful for that as you look up the stack in kcachegrind - and those are ~independent of dbgutil.
Trying to find the different paths, I noticed this part 5240 void DomainMapper_Impl::substream(Id rName, 5241 ::writerfilter::Reference<Stream>::Pointer_t const& ref) 5242 { 5243 #ifndef NDEBUG 5244 size_t contextSize(m_aContextStack.size()); 5245 size_t propSize[NUMBER_OF_CONTEXTS]; 5246 for (int i = 0; i < NUMBER_OF_CONTEXTS; ++i) { 5247 propSize[i] = m_aPropertyStacks[i].size(); 5248 } (see http://opengrok.libreoffice.org/xref/core/writerfilter/source/dmapper/DomainMapper_Impl.cxx#5240) for use here: 5324 // check that stacks are the same as before substream 5325 assert(m_aContextStack.size() == contextSize); 5326 for (int i = 0; i < NUMBER_OF_CONTEXTS; ++i) { 5327 assert(m_aPropertyStacks[i].size() == propSize[i]); 5328 } But if it allows to check that everything is ok in debug mode, I suppose this part must be ok.
I tested with LO Debian package 5.1.3.2, I confirm what Jean-Baptiste indicated, it takes less than 20s to load the 50 pages version. Also, when I kept on going up the calling tree, I found this call: void LoggedStream::substream(Id name, writerfilter::Reference<Stream>::Pointer_t ref) from writerfilter/source/dmapper/LoggedResources.cxx 276 void LoggedStream::substream(Id name, writerfilter::Reference<Stream>::Pointer_t ref) 277 { 278 #ifdef DEBUG_WRITERFILTER 279 mHelper.startElement("substream"); 280 LoggedResourcesHelper::attribute("name", (*QNameToString::Instance())(name)); 281 #endif 282 283 lcl_substream(name, ref); 284 285 #ifdef DEBUG_WRITERFILTER 286 LoggedResourcesHelper::endElement("substream"); 287 #endif 288 } So it seems that debug build takes lots of time because of these DEBUG_WRITERFILTER (but it's normal since it's for debugging :-)). But these impact only debug build so for non debug release, it seems indeed ok now. May we put this one to WFM then?
Sure - performance measurements on dbgutil builds are -totally- unhelpful =) only a product build should ever be used to measure or assess performance; so I'm happy to close. But was the original reporter using a product or a dbgutil build ? =)
Still a repro. However not at the same extend as the initial report (+/-60 seconds) Version: 5.5.0.0.alpha0+ Build ID: ec79f3453471ee9b6ae32e71ff16ea99d9b7751c CPU threads: 4; OS: Windows 6.19; UI render: default; TinderBox: Win-x86@42, Branch:master, Time: 2017-05-28_23:21:44 Locale: nl-NL (nl_NL); Calc: CL
(In reply to Telesto from comment #45) > Still a repro. However not at the same extend as the initial report (+/-60 > seconds) > Version: 5.5.0.0.alpha0+ > Build ID: ec79f3453471ee9b6ae32e71ff16ea99d9b7751c > CPU threads: 4; OS: Windows 6.19; UI render: default; > TinderBox: Win-x86@42, Branch:master, Time: 2017-05-28_23:21:44 > Locale: nl-NL (nl_NL); Calc: CL 20 seconds here Version: 6.0.0.0.alpha0+ Build ID: 6e3e4cd38b56d432c48cd7217885974e3f0519fd CPU threads: 4; OS: Linux 4.10; UI render: default; VCL: gtk3; Locale: ca-ES (ca_ES.UTF-8); Calc: group Could you please try again with the same computer ?
(In reply to Xisco Faulí from comment #46) > 20 seconds here > > Version: 6.0.0.0.alpha0+ > Build ID: 6e3e4cd38b56d432c48cd7217885974e3f0519fd > CPU threads: 4; OS: Linux 4.10; UI render: default; VCL: gtk3; > Locale: ca-ES (ca_ES.UTF-8); Calc: group > > Could you please try again with the same computer ? 22 sec in LO 6.1.0.1 on Linux (Xubuntu 16.10) and I have 49 pages in the opened document @Xisco: WFM?
Created attachment 145116 [details] Callgrind output from master (fileopen 50 page docx) Under normal conditions: 50 page docx takes 13 secs to open 119 page docx takes 32 secs to open Arch Linux 64-bit Version: 6.2.0.0.alpha0+ Build ID: 0ffa7a733d834647dfd59b864c52a015028822b6 CPU threads: 8; OS: Linux 4.18; UI render: default; VCL: gtk3_kde5; Locale: fi-FI (fi_FI.UTF-8); Calc: threaded Built on September 21st 2018
50 pages document takes real 0m23,714s user 0m23,225s sys 0m0,359s in Version: 6.3.0.0.alpha0+ Build ID: 3b518953a8141b0d5043c2f3996a92956fdc3a47 CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US Calc: threaded while the 119 pages document takes real 1m7,827s user 1m7,196s sys 0m0,529s could be improved but it's not that bad
Created attachment 152517 [details] Flamegraph On pc Debian x86-64 with master sources updated + enable-symbols (not enable-dbgutil), I retrieved a Flamegraph during opening of 50 pages doc.
The entire footnote.xml file is opened and read for every single footnote reference. A huge speed-up could be attained by parsing once and storing the result in an array (at the cost of theoretically infinite memory). But I'm an inadequate programmer to handle that task... Some useful commentary from commit 330b860205c7ba69dd6603f65324d0f89ad9cd5f Author: Miklos Vajna Date: Wed Sep 4 16:08:49 2013 +0200 tdf#68787 DOCX import: handle when w:separator is missing for footnotes ... 2) The footnote stream isn't read by the importer, only when the main stream references the footnote one, the relevant part of it is parsed. At the moment we always parse the first (special, "separator") entry in the footnote stream, that may be optimized later if it becomes a bottleneck. That optimization of only parsing the separator once instead of 973 times yields only a minor improvement (<10%). gerrit.libreoffice.org/84104
bug 76260 has much better commentary and is definitely talking about the same problem. Marking this one as the duplicate. *** This bug has been marked as a duplicate of bug 76260 ***