Description: FILEOPEN DOCX: Slow opening of a document containing a 222-page table
Steps to Reproduce:
1. Open attachment 164209 (bug 135584)
2. Note the time until the page counter reaches 222 pages and CPU usage drops
3. Save the file as ODT
4. Reload the file and measure the time until CPU usage drops
Note: disable the automatic spell checker
Actual Results: Opening the ODT is much faster and smoother than opening the DOCX
Expected Results: Maybe some tweaking can be done
Reproducible: Always
User Profile Reset: No
Additional Info: Found in
Version: 7.1.0.0.alpha0+ (x64)
Build ID: <buildversion>
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: nl-NL
Calc: CL
and in 3.3.0
I confirm it with
Version: 7.0.3.1 (x64)
Build ID: d7547858d014d4cf69878db179d326fc3483e082
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: threaded
around 70 seconds with the DOCX file
around 12 seconds with the ODT file
*** Bug 136227 has been marked as a duplicate of this bug. ***
*** Bug 136748 has been marked as a duplicate of this bug. ***
With the current 7.2 bibisect repository on my old-ish machine, measuring with time OOO_EXIT_POST_STARTUP=1 isw ../test_file_tables.docx gives values of about 15-18 seconds. But when I measure the time until the page count reaches 222, that takes about 43-45 seconds, timed on my phone. So after the XML processing finishes, rendering seems to take another 30 seconds. The file has a huge table and some 180 tracked changes according to Word. If I accept all changes and open that version in Writer, the page counter reaches 222 in ~35 seconds instead of ~45.
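For anyone who wants to repeat the first measurement several times, here is a minimal sketch in Python. It only wraps the same idea as the time command above: launch a process with OOO_EXIT_POST_STARTUP=1 set and record the wall-clock time until it exits. The function name time_command is hypothetical, and the launcher command (isw above, presumably a local alias) has to be supplied by the tester. It does not measure the later layout phase (time until the page counter reaches 222).

```python
import os
import subprocess
import time


def time_command(cmd, extra_env=None, runs=3):
    """Run `cmd` `runs` times and return the wall-clock durations in seconds.

    `cmd` is a list such as ["/path/to/soffice", "--writer", "file.docx"];
    the exact launcher path is an assumption and depends on the local build.
    """
    env = dict(os.environ)
    env.update(extra_env or {})  # e.g. {"OOO_EXIT_POST_STARTUP": "1"}
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the process fails
        subprocess.run(cmd, env=env, check=True)
        durations.append(time.perf_counter() - start)
    return durations
```

Calling time_command(cmd, {"OOO_EXIT_POST_STARTUP": "1"}) and comparing the minimum and maximum of the returned list gives a rough idea of the run-to-run spread (the 15-18 seconds range quoted above).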
Created attachment 174588
Perf flamegraph
It's not very slow for me, about 10 secs to word count, but here is a flamegraph.
Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: 58a5bd793a2ed57077fc598281cc74e16373b877
CPU threads: 8; OS: Linux 5.13; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/69e0567e118f00f299b6aac645c249521eb0629f tdf#135683 speed up layout of large writer tables It will be available in 7.3.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/e3ea0e32657a41b48d9d9d28f6ad15af4c2a7abc tdf#135683 speed up large writer table load It will be available in 7.3.0.
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/bb5425ed3d8cc04e4242059a17912752d6b48c53 tdf#135683 speed up writer layout cache access It will be available in 7.3.0.
As explained, file opening itself is not slow, but full loading is, somewhat. Worse than this DOCX is bug 144373 for ODT. Tracked changes have an influence, but bug 144208 is worse. All of that is without measuring before and after the fix, just on 7.3+. I don't see the point of so many separate reports for the general problem of table performance in Writer. I think it is a long-known issue; maybe a meta bug should collect them all.
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/d467cd0dd9e9cf3b018859a592e2638527bc7add tdf#135683 speedup DocumentRedlineManager::GetRedlinePos It will be available in 7.3.0.
It takes around 20 sec from start of the file opening to end of all 222 pages loading and CPU usage decreasing in Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community Build ID: c5aef25352d20e052ec3a697f3cb979d3bbf9df6 CPU threads: 4; OS: Windows 10.0 Build 19043; UI render: default; VCL: win Locale: ru-RU (ru_RU); UI: en-US Calc: threaded
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/b62153753a9f21afb2a49110ef0459e427b0b01a tdf#135683 speedup SwAttrHandler It will be available in 7.3.0.
Noting that one reason this is still slow is that this document has invalid (overlapping) redlines, which force the code to fall back to a slower search algorithm.
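To illustrate why overlapping ranges can force a slower lookup, here is a minimal sketch in Python. It is not the DocumentRedlineManager code; the class RangeIndex, its half-open (start, end) integer ranges, and the method names are all assumptions made for illustration. When the sorted ranges do not overlap, a binary search on the start positions finds the containing range; when they do overlap, that shortcut can miss a match, so the sketch falls back to a linear scan.

```python
from bisect import bisect_right


class RangeIndex:
    """Illustrative lookup over half-open (start, end) ranges (hypothetical)."""

    def __init__(self, ranges):
        self.ranges = sorted(ranges)
        self.starts = [start for start, _ in self.ranges]
        # With ranges sorted by start, some overlap exists if and only if
        # an adjacent pair overlaps.
        self.overlapping = any(
            cur[0] < prev[1] for prev, cur in zip(self.ranges, self.ranges[1:])
        )

    def find(self, pos):
        """Return the index of the first range containing pos, or -1."""
        if self.overlapping:
            # Slow path: O(n) scan, needed because an earlier, longer range
            # may contain pos even if the nearest start does not.
            for i, (start, end) in enumerate(self.ranges):
                if start <= pos < end:
                    return i
            return -1
        # Fast path: O(log n) binary search on the start positions.
        i = bisect_right(self.starts, pos) - 1
        if i >= 0 and self.ranges[i][0] <= pos < self.ranges[i][1]:
            return i
        return -1
```

For example, with the ranges (0, 10), (3, 4) and (8, 12), position 5 lies inside (0, 10), but a binary search on the starts alone would land on (3, 4) and report no match. The overlapping flag computed in the constructor is also one simple way to detect such a document in a test.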
Noel, please explain how one can recognize those invalid (overlapping) redlines, so that we know in testing.
(In reply to Timur from comment #14)
> Noel, please explain how one can recognize those invalid (overlapping)
> redlines, so that we know in testing.
Likely the same as for bug 144995; see also https://gerrit.libreoffice.org/c/core/+/123458
(In reply to Commit Notification from comment #7)
> Noel Grandin committed a patch related to this issue.
> It has been pushed to "master":
>
> https://git.libreoffice.org/core/commit/e3ea0e32657a41b48d9d9d28f6ad15af4c2a7abc
>
> tdf#135683 speed up large writer table load
For the record, this has been reverted due to bug 144840.
(In reply to Aron Budea from comment #16)
> > tdf#135683 speed up large writer table load
> For the record, this has been reverted due to bug 144840.
A somewhat philosophical question (or maybe I lack information) that I ask myself: is the optimization fundamentally wrong, or is it simply uncovering some odd logic elsewhere?
I understand that someone working on optimizations works at a high level, is not necessarily interested in or aware of every implementation detail involved, and may have little interest in fixing what the change broke elsewhere in the code. The first reflex is then: let's revert, I'm not going to solve this specific problem (and maybe there are more). However, that feels like throwing the baby out with the bathwater. Bailing out at the first problem (headwind) is bad for progress, IMHO.
So I'm asking myself whether an assessment was made of why the problem occurs, or whether the revert was simply chosen as the easy course of action. There are already many unit tests, so I assume the optimization looked pretty sound at first sight and the bug is the exception. The assessment doesn't necessarily have to be made by the person who pushed the commit. Obviously it is better handled by someone with more knowledge of the code area involved [something called collaboration]. I know developer availability is a scarce commodity, and this might be seen as throwing work over the fence (in bad faith).
I do notice that developers are mostly left on their own: the fallout lands on their plate even though they can't really help it (the broken code is somewhere else and unfamiliar to them, so they have no intention of solving it), and nobody is interested in more, different, or yet unknown bugs.
Another issue with pulling the commit too soon is the lack of data: you don't get enough feedback to tell whether there are more problems or only one. I fully understand pulling a commit that has 3-5 bugs reported against it showing problems in different parts of the code. But bailing out too soon makes progress really hard. So I sometimes ask myself whether the current practice is efficient and effective. Is there no better way to handle this?
Thank you, Noel, for your work on this, but I'm wondering if this should really be included in the 7.3 release notes? I haven't noticed a particularly significant improvement between 7.2 and 7.3 for this particular document: 58 seconds and 53 seconds respectively until the number of pages shows as 222. Still a fair way away from the couple of seconds needed to open the same file saved as ODT.
Version: 7.2.4.1 / LibreOffice Community
Build ID: 27d75539669ac387bb498e35313b970b7fe9c4f9
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded
Version: 7.3.0.1 / LibreOffice Community
Build ID: 840fe2f57ae5ad80d62bfa6e25550cb10ddabd1d
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded
(In reply to stragu from comment #18)
> Thank you Noel for your work on this, but I'm wondering if this should
> really be included in the 7.3 release notes?
I don't think this should be mentioned. The core fix is no longer in place after the revert; see comment 16.
*** Bug 148936 has been marked as a duplicate of this bug. ***