Created attachment 109115 [details] Spreadsheet demonstrating good save performance I have a spreadsheet which exhibits very different performance opening depending upon whether a certain data series is present in a chart or not. The two attached spreadsheets demonstrate the issue. They are identical except that one has the Row 2021 data series in the chart on the 5th page. The version that does not have that row takes 22 seconds of CPU time to save (Core i7-920XM). The version that does takes 163 seconds to save. In addition, when I exit editing the chart (by clicking outside of the chart area), the version with the row 2021 data series takes several minutes before returning to the spreadsheet (didn't measure it). The version without that data series takes maybe 10 seconds or so.
Created attachment 109116 [details] Spreadsheet demonstrating poor save performance
Note that the Row 2021 data set is the only one that has the data series in a row; the other data series are all columnar (for what it may be worth).
Ubuntu 14.04 LibreOffice 4.3.3.2 release @Rik - thanks for reporting! In the future please always include your operating system and version of libreoffice in your report :) So this one is quite interesting/complex - findings: Open time is essentially the same on both. Save time is dramatically different - I gave it 3+ minutes and finally quit the program. That being said, the spreadsheet is so complex that it is REALLY hard to figure out what is causing the problem. If a better and simpler test case could be provided, that would be fantastic. Also tested on 3.3 (inherited from OOo) Opening time is substantially slower in 3.3 Save time: (1) for first sheet - save time was 35 seconds; (2) for second sheet - save time was 106 seconds. Because it seems like it's gotten slower - I am confirming this as NEW. That being said, it might just be a result of an incredibly complex chart which takes time to save....and the test document is not terribly useful for determining what is going on because it's so complex. A bibisect might help (although will take some time) so asking for one to explicitly address the slowing down of save time for the second sheet. Setting as: New.
(rlk, not rik) I did include the OS and version, 4.1.6.2 on x64/Linux. Just not in the comments. I know that spreadsheet is hideously complicated. Believe it or not, it's only a fraction of the real thing (it's the full data set, but only a small number of pages). I'll see what I can do. The interesting thing is that the only difference between the two spreadsheets is that the slow one has one extra data set in the graph, and that extra data set has only about half a dozen points (the others have hundreds up to a few thousand). But the obvious difference is that the extra data set is in a pair of rows rather than columns.
Created attachment 109121 [details] Simpler case demonstrating the problem OK, here's a simpler case. Same spreadsheet, without the formulas and formats (everything is converted to its numerical representation). Saves in 131 seconds. Remove that "row 2021" data range and it saves in 3 seconds.
So awesome - thanks!
so just a quick question, row 2021 of which sheet? I just saved and it definitely was more than 3 seconds but maybe that's just because my system is slower than yours
Created attachment 109122 [details] Even simpler example This one removes the irrelevant columns and data. With the row 2021 dataset, it takes 100 seconds to save; without, it takes 2 seconds. The issue is not whether it's a regression or not; it's that adding an additional tiny data range to a chart drastically inflates save time.
row 2021 -- look in the chart; it's row 2021 of the Splits sheet (row 2019 is the X axis). You'll need to scroll over to the right (or ctrl+end) to see the data in that row.
I think the issue is that you are mixing data in columns with data in rows. All are column data except Row2021 just the source of the issue. Even it works I think it's not by design. In fact wizard or range editor ask for data in columns or rows. For me not a bug. But I hope someone with acknowledgement about charts can bring some light.
I do suspect that that mixing is at least part of the problem, but it's a problem nonetheless. If it's not supposed to work at all it should be documented and blocked, but it does appear to be allowed and it's far from clear why it should lead to such a large performance problem.
Working the the bibisect-43all repository, I see from `git bisect bad`: ea5723ff672d4a3ffb9ee123a96dda4681703a50 is the first bad commit commit ea5723ff672d4a3ffb9ee123a96dda4681703a50 Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com> Date: Wed May 21 08:57:55 2014 +0000 source-hash-1b8cfd5d1414da9127ff18e4701b40f5bceabb36 commit 1b8cfd5d1414da9127ff18e4701b40f5bceabb36 Author: Markus Mohrhard <markus.mohrhard@collabora.co.uk> AuthorDate: Tue May 20 23:59:55 2014 +0200 Commit: Markus Mohrhard <markus.mohrhard@collabora.co.uk> CommitDate: Wed May 21 01:50:44 2014 +0200 remove unnecessary whitespaces Change-Id: I4d93c0149aaf90e2477c1382aa51f8f08c967626 :100644 100644 501fbfa427b8a231fe035cd8c6c9861d0483e887 03ad3b133ed354d5ab45bdc712a5355013402c17 M ccache.log :100644 100644 94e1769dda5ee27e4952754e43c4d26fe19c2009 e900cb9aba87ebf78801bd60b03866aab0460fac M commitmsg :100644 100644 8c966b034e7c757c8922b3ba931bd93077f23e64 b8dc772fd79ceee78f8854fb2bdf577904e75049 M make.log :040000 040000 5f3414102b71bcb5a2c7f817d6502959169c4345 291f8ca8cacce30d162bc70628131b1a960519bb M opt and from `git bisect log`: # bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e # good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932 git bisect start 'latest' 'oldest' # good: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb git bisect good e02439a3d6297a1f5334fa558ddec5ef4212c574 # good: [4850941efe43ae800be5c76e1102ab80ac2c085d] source-hash-980a6e552502f02f12c15bfb1c9f8e6269499f4b git bisect good 4850941efe43ae800be5c76e1102ab80ac2c085d # skip: [a900e72b6357882284c5955bdf939bf14269f5fb] source-hash-dd1050b182260a26a1d0ba6d0ef3a6fecc3f4e07 git bisect skip a900e72b6357882284c5955bdf939bf14269f5fb # skip: [3dda83fc3a43afc6af7f5c0ffd029e610ec1b9a3] source-hash-c59b3d6c5c8096486730007d9b9b053793b90b1e git bisect skip 3dda83fc3a43afc6af7f5c0ffd029e610ec1b9a3 # good: [4f705a8cfb1998b09f2062510b207d35a33647d8] source-hash-1eeb20f3958666ec6ba6e0fcf52e92e5eb447a14 git bisect good 4f705a8cfb1998b09f2062510b207d35a33647d8 # good: [465e2be02951f9645beb3024506a5212907caf5f] source-hash-674801eb4af21c9ae83c122499f15fa4f4785b0f git bisect good 465e2be02951f9645beb3024506a5212907caf5f # good: [1de992ed60dad1fd65ff0383a8610b0468d9d772] source-hash-8bf0b9536cb33dfcce8a811b70c2ead285300f3f git bisect good 1de992ed60dad1fd65ff0383a8610b0468d9d772 # good: [882db5e268e28962bdf805c820a5e031b0df9936] source-hash-383dccc094f8c8c07b4298ce0b7406d18cd61cee git bisect good 882db5e268e28962bdf805c820a5e031b0df9936 # good: [69e91b0c890097cfb1781733f260644bb1ec9ee8] source-hash-66fce1f61e7e088bd92e19ecb0dd94553de6f824 git bisect good 69e91b0c890097cfb1781733f260644bb1ec9ee8 # good: [426fdcf0b13134ffbc7f92900d5862fad3723192] source-hash-c9f3c508bb1a1d94fd6172b9cdac30278559f31c git bisect good 426fdcf0b13134ffbc7f92900d5862fad3723192 # bad: [9f0fd4c327fa8765bef8a0e2a0e85a620ab6f835] source-hash-fe1e1d1292973211f0edfa64c05279ca59634652 git bisect bad 9f0fd4c327fa8765bef8a0e2a0e85a620ab6f835 # bad: [ea5723ff672d4a3ffb9ee123a96dda4681703a50] source-hash-1b8cfd5d1414da9127ff18e4701b40f5bceabb36 git bisect bad ea5723ff672d4a3ffb9ee123a96dda4681703a50 # first bad commit: [ea5723ff672d4a3ffb9ee123a96dda4681703a50] source-hash-1b8cfd5d1414da9127ff18e4701b40f5bceabb36
This seems to have been caused by commit 2fc4f5cd750cc3899ffc6b2f831edcb01e7773b4, though it's not clear from the commit message why that would be. Before the commit, saving attachment 109122 [details] is ~instant, while after it takes minutes (* it was necessary to cherry pick the fix from commit 498c314861f0913a5b31ee29efc38aad12c3a781 to avoid a crash opening the file) Adding Cc: to sbergman@redhat.com; Any idea what relevance this commit could have to a massive performance regression on save? Thanks commit 2fc4f5cd750cc3899ffc6b2f831edcb01e7773b4 Author: Kohei Yoshida <kohei.yoshida@collabora.com> Date: Mon May 19 16:00:39 2014 -0400 Register OpenGL windows to ScGridWindow upon file load. Change-Id: I6fdce232ef41079f20a983dc72da032196078858 commit 498c314861f0913a5b31ee29efc38aad12c3a781 Author: Stephan Bergmann <sbergman@redhat.com> Date: Tue May 20 09:41:59 2014 +0200 Hack to for now work around dpy == null crashing CppunitTest_chart2_* ...since 2fc4f5cd750cc3899ffc6b2f831edcb01e7773b4 "Register OpenGL windows to ScGridWindow upon file load." Should be revisited and fixed properly. Change-Id: I5a3d08cb750fd9ec7c98d5b85250eaabd1c7a8b2
(In reply to Matthew Francis from comment #13) > This seems to have been caused by commit > 2fc4f5cd750cc3899ffc6b2f831edcb01e7773b4, though it's not clear from the > commit message why that would be. Before the commit, saving attachment > 109122 [details] is ~instant, while after it takes minutes > (* it was necessary to cherry pick the fix from commit > 498c314861f0913a5b31ee29efc38aad12c3a781 to avoid a crash opening the file) > > Adding Cc: to sbergman@redhat.com; Any idea what relevance this commit could > have to a massive performance regression on save? Thanks No idea at all about 2fc4f5cd750cc3899ffc6b2f831edcb01e7773b4, and no idea whether 498c314861f0913a5b31ee29efc38aad12c3a781 has any (additional) performance impact. The latter commit was a completely uneducated attempt at making "make check" work again, without much insight into the former commit's content; it looks like I pinged Kohei and Moggi (now on cc) about the latter commit's "Should be revisited and fixed properly," back then, but don't remember anything further.
Migrating Whiteboard tags to Keywords: (bibisected) [NinjaEdit]
Adding Cc: to Kohei Yoshida
Dennis Francis committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=b2cb3196868deb499bb91fc9bdbe6c8f7d1e9033 tdf#86019: Use ChartDataRowSource_COLUMNS as the chart... It will be available in 6.0.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I am sorry to report that my results are disappointing. Working on debian-stretch with the daily Linux dbgutil bibisect repository, I see CPU times to open attachments rowing-time-r-{7,10}.ods: LO ...7... ...10... ------------------ ------- --------- dbgutil 2017-08-16 35.66 14.14 dbgutil 2017-08-18 37.62 14.27
(In reply to Terrence Enger from comment #18) > I am sorry to report that my results are disappointing. > > Working on debian-stretch with the daily Linux dbgutil bibisect > repository, I see CPU times to open attachments > rowing-time-r-{7,10}.ods: > > LO ...7... ...10... > ------------------ ------- --------- > dbgutil 2017-08-16 35.66 14.14 > dbgutil 2017-08-18 37.62 14.27 Don't ever report performance results with a dbgutil or a debug build. They are absolutely useless. Either a release build or a symbols build.
(In reply to Terrence Enger from comment #18) > I am sorry to report that my results are disappointing. > > Working on debian-stretch with the daily Linux dbgutil bibisect > repository, I see CPU times to open attachments > rowing-time-r-{7,10}.ods: > > LO ...7... ...10... > ------------------ ------- --------- > dbgutil 2017-08-16 35.66 14.14 > dbgutil 2017-08-18 37.62 14.27 Could you check the *save* performances of these documents in a release build as that is the main problem stated in the bug report content/comments/bibisection as opposed to what the title says. In my experiments with the last patch both "good" save perf document and "poor" save perf document have similar *save* performances in terms of running time. Thanks.
opening the spreadsheet poor save performance takes 19 seconds and saving takes 40 seconds Version: 5.4.2.0.0+ Build ID: ddba061b07e55dfd2c91320ca790ff88618b132c CPU threads: 8; OS: Linux 4.9; UI render: default; VCL: kde4; TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:libreoffice-5-4, Time: 2017-08-17_01:01:48 Locale: nl-BE (en_US.UTF-8); Calc: group
(In reply to Xavier Van Wijmeersch from comment #21) > opening the spreadsheet poor save performance takes 19 seconds > and saving takes 40 seconds > > Version: 5.4.2.0.0+ > Build ID: ddba061b07e55dfd2c91320ca790ff88618b132c The commit is only in the 6.0 (master) branch for now.
Dennis Francis committed a patch related to this issue. It has been pushed to "libreoffice-5-4": http://cgit.freedesktop.org/libreoffice/core/commit/?id=afd5b4b7eda6171817f5fa1a426d4c11a1a734c9&h=libreoffice-5-4 tdf#86019: Use ChartDataRowSource_COLUMNS as the chart... It will be available in 5.4.2. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
A polite ping to Dennis Francis: is this bug fixed? if so, please close it as RESOLVED FIXED