Bug 92802 - Large but simple ODS file slow on FILEOPEN
Summary: Large but simple ODS file slow on FILEOPEN
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.2.0.4 release
Hardware: x86-64 (AMD64) All
: high normal
Assignee: Markus Mohrhard
URL:
Whiteboard: target:5.2.0
Keywords: bibisected, bisected, filter:ods, perf, regression
Depends on:
Blocks:
 
Reported: 2015-07-17 13:02 UTC by Milos Sramek
Modified: 2021-09-13 12:49 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments
file used in tests (5.92 MB, application/vnd.oasis.opendocument.spreadsheet)
2015-07-17 13:02 UTC, Milos Sramek
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Milos Sramek 2015-07-17 13:02:35 UTC
Created attachment 117292 [details]
file used in tests

Loading and saving of large but simple (160000 lines, 7 columns, no equations) ods files has been since version 4.2.0.0 significantly slower.

version       load  save
3.6.7.2       20s    5s
4.1.6         14s    5s
4.2.0.0.beta1 33s    1:26 min
4.4.0.1       40s    3:24 min
5.0.0.3-3     45s    2:30 min


--
Milos
Comment 1 MM 2015-07-18 12:46:53 UTC
Tested with v4.1.6.2 / v4.2.6.3 under mint 16 and v5.0.0.3 under mint 17.2 x64.
And with v4.4.5.1 under windows 7 x64.

With 4.1 opening / saving is so much faster than any newer versions.

Set to ALL, regression.
Comment 2 Milos Sramek 2015-07-25 18:19:26 UTC
Hi,

you can find the bibisection results below. Bibisection has shown that slow
loading and slow saving are in fact two different problems. For example, the first
commit with slow loading d42e14f46fbe1bfc716dc90d28001a0e32afa3d5  saves the
file rapidly, so the save problem was introduced later.

I've used the  bibisect-43all, I append results for both slow loading and slow saving

Slow loading:

milos@trigan:/mnt/data/milos/LO/bibisect-43all$ git bisect good d42e14f46fbe1bfc716dc90d28001a0e32afa3d5 is the first bad commit
commit d42e14f46fbe1bfc716dc90d28001a0e32afa3d5
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Wed Oct 16 15:03:39 2013 +0000

    source-hash-82b6c4884d7b2fbb3d45980785cebba7a159fb10
    
    commit 82b6c4884d7b2fbb3d45980785cebba7a159fb10
    Author:     Fridrich Štrba <fridrich.strba@bluewin.ch>
    AuthorDate: Fri Mar 15 11:01:58 2013 +0100
    Commit:     Fridrich Štrba <fridrich.strba@bluewin.ch>
    CommitDate: Fri Mar 15 11:02:24 2013 +0100
    
        Add typedetection and configuration for MWAW based filters
    
        Change-Id: Icf489a49c547a93458aeef1c70ba699181a47c38

:100644 100644 a476eec2644ce3d840a3525e60685c10b546a0b1 adc2881acded521c47f2f68d855c72f67d727322 M	ccache.log
:100644 100644 989a6f8490d3dc9b6552faaa5d3338ef11d96ca2 318b0a1faf66be7808cf51370d0b620a0e032a24 M	commitmsg
:100644 100644 c3034c36e1045c28850073be1b5eeeab7948599e f63c89c1340345dbf43c35c96a3437548dd813b9 M	dev-install.log
:100644 100644 75d1590884ea160e3916623015bf2d8e8f3f4d63 34bd34bcfd21d5f4b0e4579cd8b1ced9f07fa675 M	make.log
:040000 040000 82cb1a6aa93c11182f124401ee3b61d154ac25e3 687601f861dbaec03323f793f3f80746fcf2a9c6 M	opt

milos@trigan:/mnt/data/milos/LO/bibisect-43all$ git bisect log
# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# bad: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect bad e02439a3d6297a1f5334fa558ddec5ef4212c574
# good: [8f4aeaad2f65d656328a451154142bb82efa4327] source-hash-1885266f274575327cdeee9852945a3e91f32f15
git bisect good 8f4aeaad2f65d656328a451154142bb82efa4327

# good: [9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02] source-hash-8600bc24bbc9029e92bea6102bff2921bc10b33e
git bisect good 9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02
# bad: [8ad82bc1416a07501651e8d96fe268e47d3931d3] source-hash-13821254f88d2c5488fba9fe6393dcf4ae810db4
git bisect bad 8ad82bc1416a07501651e8d96fe268e47d3931d3
# bad: [238338bc4111eb82429ea47384d4012bcd7cdc3e] source-hash-b6ba04639b9922f6717f79ac4be215e09691d7a9
git bisect bad 238338bc4111eb82429ea47384d4012bcd7cdc3e
# good: [89dc8a802d1625e0efd88ba0fb720b22be87f3f0] source-hash-da03bb1ee6a69d2f4fef4c3ca0adc0ba9588bd19
git bisect good 89dc8a802d1625e0efd88ba0fb720b22be87f3f0
# good: [4d8d18a8c871d6803af99b706f780eb6e65c7a5d] source-hash-d4779887636fa9ab5b477f3436bcd3728a3e30ba
git bisect good 4d8d18a8c871d6803af99b706f780eb6e65c7a5d
# bad: [ec6ba885d9d4142060cebdef210d26f0b2ef5045] source-hash-1ce6d6d4133865d9616e12228be2c04cbba1976c
git bisect bad ec6ba885d9d4142060cebdef210d26f0b2ef5045
# bad: [d42e14f46fbe1bfc716dc90d28001a0e32afa3d5] source-hash-82b6c4884d7b2fbb3d45980785cebba7a159fb10
git bisect bad d42e14f46fbe1bfc716dc90d28001a0e32afa3d5
# good: [56260020c5caba00fbd063db5554b452c06116dc] source-hash-34847f1cf7538c333e9b8700eb4012ae358644a6
git bisect good 56260020c5caba00fbd063db5554b452c06116dc
# first bad commit: [d42e14f46fbe1bfc716dc90d28001a0e32afa3d5] source-hash-82b6c4884d7b2fbb3d45980785cebba7a159fb10

------------------------------------------------------------------------------
Slow saving
(changed something, then save as...)

First I did:
git bisect reset
git bisect start latest oldest

milos@trigan:/mnt/data/milos/LO/bibisect-43all$ git bisect good 567bfa79fb5ad4f9dfa05f0dea7666208d6129b2 is the first bad commit
commit 567bfa79fb5ad4f9dfa05f0dea7666208d6129b2
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Thu Oct 17 14:41:08 2013 +0000

    source-hash-4d5fc661d37d03129b8054e494c03bed1933231d
    
    commit 4d5fc661d37d03129b8054e494c03bed1933231d
    Author:     Norbert Thiebaud <nthiebaud@gmail.com>
    AuthorDate: Wed Jun 26 02:19:45 2013 -0500
    Commit:     Michael Stahl <mstahl@redhat.com>
    CommitDate: Wed Jun 26 18:25:20 2013 +0000
    
        coverity#1027651 Dereference after null check
    
        Change-Id: Id90ef9f43578d952fd8d7872d4fd2fd622d07dbd
        Reviewed-on: https://gerrit.libreoffice.org/4529
        Reviewed-by: Michael Stahl <mstahl@redhat.com>
        Tested-by: Michael Stahl <mstahl@redhat.com>

:100644 100644 58d16898db3e2b53044ddbe24459ff162caef262 ed2a925978186d1163103a345e3e74891c1be264 M	autogen.log
:100644 100644 483b58760a06de929b32eafde25a67466c622502 1a1d693ee120f07e474fa8755e03fd495bd4482b M	ccache.log
:100644 100644 fcfaf4eddaf5f8c7a66f90a052cbf2c7473cdc9b 0007455e1b7830707392b6279a9b0f1e2ff8d295 M	commitmsg
:100644 100644 3e023e83e964fd4b90d7bdf45eab489c7382956c cefa06f626579d7f3a84ca802fdc0446756943a3 M	dev-install.log
:100644 100644 47ead046b9af75e2384d8d8f51767edfa54d5dc8 c104ef474e9ba3f3dcddc11cc9deb2d3460fbaa5 M	make.log
:040000 040000 82a20807f2d069e8294cfa6e30778214a869a341 9162a8fc025bf41fc52fd7c394e9fad269f0af0d M	opt


milos@trigan:/mnt/data/milos/LO/bibisect-43all$ git bisect log
# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# good: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect good e02439a3d6297a1f5334fa558ddec5ef4212c574
# bad: [4850941efe43ae800be5c76e1102ab80ac2c085d] source-hash-980a6e552502f02f12c15bfb1c9f8e6269499f4b
git bisect bad 4850941efe43ae800be5c76e1102ab80ac2c085d
# skip: [a043626b542eb8314218d7439534dce2fc325304] source-hash-9379a922c07df3cdb7d567cc88dfaaa39ead3681
git bisect skip a043626b542eb8314218d7439534dce2fc325304
# skip: [aba65c3e4c0df07e4909aeefb758cdb688242bf6] source-hash-827524abfb4b577d08276fde40929a9adfb7ff1a
git bisect skip aba65c3e4c0df07e4909aeefb758cdb688242bf6
# bad: [c81a8a0dcfc1ed095a80e4485c89dd0fcaf73f31] source-hash-c69ed33628ec0b7abf6296539cf280d6c4265930
git bisect bad c81a8a0dcfc1ed095a80e4485c89dd0fcaf73f31
# bad: [c81a8a0dcfc1ed095a80e4485c89dd0fcaf73f31] source-hash-c69ed33628ec0b7abf6296539cf280d6c4265930
git bisect bad c81a8a0dcfc1ed095a80e4485c89dd0fcaf73f31
# bad: [1d4980621741d3050a5fe61b247c157d769988f2] source-hash-89d01a7d8028ddb765e02c116d202a2435894217
git bisect bad 1d4980621741d3050a5fe61b247c157d769988f2
# bad: [ba096f438393091574da98fe7b8e6b05182a8971] source-hash-8499e78ca03c792f4fa2650e02b519094ba0baa8
git bisect bad ba096f438393091574da98fe7b8e6b05182a8971
# bad: [9daa289e178460daaafa4b3911031df5b8736218] source-hash-704292996a3731a61339b1a4a5c90c9403aa095f
git bisect bad 9daa289e178460daaafa4b3911031df5b8736218
# good: [69bf614869471f46413fe1d2af5976b2e6d85084] source-hash-76dea8b2db906156e77f78738a68f932a15afd4b
git bisect good 69bf614869471f46413fe1d2af5976b2e6d85084
# good: [502c05c771cd993b237febc2d8a20140fe589488] source-hash-462df4920ef50032c8f99a9db2ca34c9cc928657
git bisect good 502c05c771cd993b237febc2d8a20140fe589488
# bad: [567bfa79fb5ad4f9dfa05f0dea7666208d6129b2] source-hash-4d5fc661d37d03129b8054e494c03bed1933231d
git bisect bad 567bfa79fb5ad4f9dfa05f0dea7666208d6129b2
# good: [7d878017eaa2fc1d2eab72689a5e453622d474a2] source-hash-b139f6fedfcf3cbed0eadeb007e2155b576413d2
git bisect good 7d878017eaa2fc1d2eab72689a5e453622d474a2
# good: [83a62c1c1e8e259144e489d9a1f42611eba063c3] source-hash-022c54742e7997bf46a608f1ab0b500f2537f7f5
git bisect good 83a62c1c1e8e259144e489d9a1f42611eba063c3
# first bad commit: [567bfa79fb5ad4f9dfa05f0dea7666208d6129b2] source-hash-4d5fc661d37d03129b8054e494c03bed1933231d
milos@trigan:/mnt/data/milos/LO/bibisect-43all$
Comment 3 Matthew Francis 2015-08-13 06:23:00 UTC
If the loading and saving parts of the report were introduced separately, then this should be split into two bugs

The specific commit at which loading became slow is the below (part of a series of related commits)


commit a8da9a9e6b32a0479bf9fad3624ad10f329e86c1
Author: Kohei Yoshida <kohei.yoshida@gmail.com>
Date:   Tue Mar 12 21:12:22 2013 -0400

    Turns out I can't really use ScColumnImpl.
    
    Because the column source files are split; not all source files had
    access to the impl class definition.
    
    Change-Id: Ic8cff0eeec3504660304be9f020c40bcd98cee85
Comment 4 Matthew Francis 2015-08-13 07:41:40 UTC
Save speed regression split -> bug 93405
Comment 5 Robinson Tryon (qubit) 2015-12-10 10:57:48 UTC Comment hidden (obsolete)
Comment 6 Noel Grandin 2015-12-15 12:43:16 UTC
I have created an issue against libmdds to track what I think is the performance problem here.

https://gitlab.com/mdds/mdds/issues/7
Comment 7 Kohei Yoshida 2015-12-15 14:09:18 UTC
Can you post the callgrind profile, or at the very least do you know who calls get_type() repeatedly?
Comment 8 Kohei Yoshida 2015-12-15 15:43:44 UTC
(In reply to Kohei Yoshida from comment #7)
> Can you post the callgrind profile, or at the very least do you know who
> calls get_type() repeatedly?

Nevermind, we don't need that info.
Comment 9 Commit Notification 2015-12-16 14:58:55 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=97361478e64f11330f7f7aea819d60c7562fb0f6

don't call getOutputString when we are not using the result, tdf#92802

It will be available in 5.2.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 Noel Grandin 2015-12-17 06:52:45 UTC
Fixed by Markus' commit
Comment 11 Timur 2016-03-24 12:42:15 UTC
Can you please explain how this was fixed?
I have somewhat diff. results from Milos, but still I can't see an improvement with master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86: 

version       load  save
4.1.6         16s    12s
4.2.0         20s    25s
5.1.1         32s    1:20 min
5.2+          38s    3:30 min
Comment 12 Markus Mohrhard 2016-04-10 04:31:19 UTC
(In reply to Timur from comment #11)
> Can you please explain how this was fixed?
> I have somewhat diff. results from Milos, but still I can't see an
> improvement with
> master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86: 
> 
> version       load  save
> 4.1.6         16s    12s
> 4.2.0         20s    25s
> 5.1.1         32s    1:20 min
> 5.2+          38s    3:30 min

I just tested again and in current master the file saves in about 10 to 15 seconds.
Comment 13 MM 2016-06-04 20:32:15 UTC
(In reply to Timur from comment #11)
> Can you please explain how this was fixed?
> I have somewhat diff. results from Milos, but still I can't see an
> improvement with
> master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86: 
> 
> version       load  save
> 4.1.6         16s    12s
> 4.2.0         20s    25s
> 5.1.1         32s    1:20 min
> 5.2+          38s    3:30 min

Re-tested with v5.2.0.0a1 under ubuntu 16.04 x64.
Loads now in 15 secs, saves in 5 secs. That's definitely an improvement.
Comment 14 Timur 2016-06-06 11:56:31 UTC
Re-tested with master~2016-05-20_05.45.40_LibreOfficeDev_5.2.0.0.alpha1_Win_x86 in Windows 7 64bit and with LibreOfficeDev_5.3.0.0.alpha1 in Mint 17.3
Loads on first LO run is 65 sec, loads on second LO run in 38 secs, saves in 5 secs. 
So, filesave is OK, but fileopen is not for me.
Comment 15 Timur 2021-09-13 12:49:06 UTC
I rechecked this with time loexit and time seems OK, with higher cpu.

Fileopen:
OO 3.3
11,59s user 0,83s system 51% cpu 23,923 total
43all oldest/3.5
13,19s user 0,81s system 58% cpu 23,918 total
4.1 m
103,70s user 0,79s system 91% cpu 1:53,77 total
5.0 m
43,83s user 0,82s system 77% cpu 57,274 total
5.2 m
18,17s user 0,71s system 94% cpu 19,908 total
6.2 m
26,51s user 1,73s system 75% cpu 37,603 total
7.1 m
17,16s user 1,25s system 114% cpu 16,095 total
7.3+ m
11,44s user 0,76s system 112% cpu 10,832 total