Steps to reproduce:
1. Open attachment 125390 from bug 100139 using 'time OOO_EXIT_POST_STARTUP=1 instdir/program/soffice'

It takes

real 3m58,908s
user 3m57,239s
sys 0m1,646s

in

Version: 7.0.0.0.alpha0+
Build ID: fd1cd5522283f279a01d6d673f676a1346e9358b
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3;
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded

and

real 1m38,501s
user 1m37,409s
sys 0m1,048s

in

Version: 6.4.0.0.alpha1+
Build ID: 9bc848cf0d301aa57eabcffa101a1cf87bad6470
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3;
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded
Perf flamegraph submitted by Julien in attachment 158953 from bug 100139.
The opening time went from

real 1m38,501s
user 1m37,409s
sys 0m1,048s

to

real 2m34,595s
user 2m33,596s
sys 0m1,419s

after

https://cgit.freedesktop.org/libreoffice/core/commit/?id=2ab481b038b62b1ff576ac4d49d03c1798cd7f84

author László Németh <nemeth@numbertext.org> 2020-01-08 14:26:40 +0100
committer László Németh <nemeth@numbertext.org> 2020-01-09 18:00:16 +0100
commit 2ab481b038b62b1ff576ac4d49d03c1798cd7f84 (patch)
tree 9739e3b799bd06ba07d8cca7ad6c8b85de75dda8
parent 79084665f0e351a3f83fdee88071919f05ec9cc3 (diff)
tdf#90069 DOCX: fix character style of new table rows

Later on, it went from

real 2m34,595s
user 2m33,596s
sys 0m1,419s

to

real 3m58,908s
user 3m57,239s
sys 0m1,646s

after

author László Németh <nemeth@numbertext.org> 2020-02-17 14:34:11 +0100
committer László Németh <nemeth@numbertext.org> 2020-02-19 16:46:18 +0100
commit 4d5c0eaf3e0d3d3bcd9e691fffee19b75f3d6631 (patch)
tree 6ed8e4a013884c28db01b9175dfc933141b7c395
parent faa2e7b7227b6b87379e7e136ea9ab63f37c3fc4 (diff)
tdf#118812 DOCX import: fix table style preference – part 2

Adding Cc: to László Németh

Bisected with bibisect-linux64-6.5
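For anyone who wants to compare the steps numerically, here is a small sketch that converts the 'real' values quoted above into seconds and prints each one relative to the first measurement. The helper name parse_time and the labels are made up for illustration; only the three durations come from this comment.

```python
import re


def parse_time(text: str) -> float:
    """Convert a `time`-style duration such as '3m58,908s' to seconds.

    Accepts either ',' or '.' as the decimal separator, since the
    output above was produced in a locale that uses a comma.
    """
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


# 'real' timings quoted in this comment; the labels are illustrative only.
timings = {
    "before 2ab481b0": "1m38,501s",
    "after 2ab481b0 (tdf#90069)": "2m34,595s",
    "after 4d5c0eaf (tdf#118812 part 2)": "3m58,908s",
}

baseline = parse_time(timings["before 2ab481b0"])
for label, raw in timings.items():
    secs = parse_time(raw)
    print(f"{label}: {secs:.3f} s ({secs / baseline:.2f}x the first measurement)")
```

These are single runs on one machine, so the ratios are only a rough indication.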
According to measurements made by our intern (thx Balázs Sántha!) there are a few problematic areas around large docx tables:

- in 6.3 and before, opening performance of large docx tables such as attachment 125390 was somewhat slow: for this file a not great, not terrible ~40-45 seconds on my laptop. We can say bug 93660 is about this part of the problem.
- then it became slower in 6.4, at around 1:10 minutes, as bibisected in bug 136227 comment 3
- then it became even slower in 7.0, as bibisected here: around 2:45-2:50 minutes.

Other similar docx + large table bugs are:

bug 76385 (nested tables load fast now, but it seems to leak memory on Linux, though not on Windows)
bug 100139 (tracked changes made it slow to edit, not anymore)
bug 101149 (docx load became, interestingly, better; doc load is still bad, and rendering feels a bit janky: it renders some pages, stops, renders again, stops...)
bug 135683 (nothing special, duplicate of this one)
For https://cgit.freedesktop.org/libreoffice/core/commit/?id=2ab481b038b62b1ff576ac4d49d03c1798cd7f84, it would be enough to apply the original patch only for the last table row, so likely there is an “easy” fix for the regression.
Balazs Santha committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/498d2b82187ec3ff58f076e0d15741e64c0505ba

tdf#131546 DOCX import: fix performance regression at tables

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
tdf#131546 DOCX import: fix performance regression at tables

Commit 2ab481b038b62b1ff576ac4d49d03c1798cd7f84 "tdf#90069 DOCX: fix character style of new table rows" caused a ~20% slowdown in the loading time of documents with huge tables, due to the extra processing of the redundant w:rPr of table paragraph runs. (In DOCX tables, MSO exports the run properties into both the run and the paragraph sections, probably for compatibility or usability reasons.)

In theory, the run properties under the run section win in this case. However, when LO copies a row (e.g. upon inserting a new one), it copies only the properties applied at paragraph level, so the run properties had to be applied not only as direct character formatting but also as direct paragraph formatting. This made copying of rows work.

Unfortunately, this "double" application was done for every single paragraph, which noticeably slowed down opening.

This patch works around the problem by removing the double application from writerfilter entirely, reverting commit 2ab481b038b62b1ff576ac4d49d03c1798cd7f84 (except its unit test), and copying the run properties to paragraph level only when needed: upon inserting a new row before or after. This saves a lot of cycles, as most of the original applications served no real purpose.
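To illustrate the trade-off described in this commit message, here is a toy Python model. It is not LibreOffice code: Paragraph, Row, import_eager, insert_row_after and make_table are all hypothetical names. It only counts how many times run properties get copied to paragraph level when that is done for every paragraph at load time, versus only for the row being duplicated.

```python
from dataclasses import dataclass, field


@dataclass
class Paragraph:
    run_props: dict                                  # stands in for direct character formatting (w:rPr)
    para_props: dict = field(default_factory=dict)   # stands in for paragraph-level formatting


@dataclass
class Row:
    paragraphs: list


def import_eager(rows):
    """Model of the reverted approach: duplicate run props for every paragraph at load."""
    copies = 0
    for row in rows:
        for para in row.paragraphs:
            para.para_props.update(para.run_props)
            copies += 1
    return copies


def insert_row_after(rows, index):
    """Model of the new approach: copy run props only for the row being duplicated."""
    copies = 0
    new_paragraphs = []
    for para in rows[index].paragraphs:
        para.para_props.update(para.run_props)
        copies += 1
        # The row copy carries paragraph-level properties only.
        new_paragraphs.append(Paragraph(run_props={}, para_props=dict(para.para_props)))
    rows.insert(index + 1, Row(new_paragraphs))
    return copies


def make_table(n_rows, n_cols):
    """Build a table where every cell holds one bold paragraph."""
    return [Row([Paragraph({"bold": True}) for _ in range(n_cols)])
            for _ in range(n_rows)]


eager_copies = import_eager(make_table(1000, 5))
table = make_table(1000, 5)
lazy_copies = insert_row_after(table, 999)
print(eager_copies, lazy_copies, table[1000].paragraphs[0].para_props)
```

In this model the eager path does one copy per paragraph in the whole table, while the lazy path does one per paragraph of a single row, and the inserted row still ends up with the formatting at paragraph level.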