Bug 66507 - CALC: serious performance problems in ods file, primarily in sheets with charts
Summary: CALC: serious performance problems in ods file, primarily in sheets with charts
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.1.0.1 rc
Hardware: Other All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: Confirmed:4.2.0.3:OSX target:4.3.0 ta...
Keywords:
: 75194 (view as bug list)
Depends on:
Blocks:
 
Reported: 2013-07-02 13:59 UTC by Ákos
Modified: 2016-04-12 20:53 UTC (History)
6 users (show)

See Also:
Crash report or crash signature:


Attachments
Problematic ods (847.52 KB, application/vnd.oasis.opendocument.spreadsheet)
2013-07-02 13:59 UTC, Ákos
Details
New, clear file, I copied only the content in it (550.28 KB, application/vnd.oasis.opendocument.spreadsheet)
2014-01-24 13:41 UTC, Ákos
Details
Same with original file, but without indirect functions (271.34 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-04-11 07:38 UTC, Ákos
Details
same file in ms excel (414.61 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2016-04-12 08:18 UTC, Ákos
Details
Deleting charts and adding a Pivot Table with a chart. (255.91 KB, application/vnd.oasis.opendocument.text)
2016-04-12 20:53 UTC, m.a.riosv
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ákos 2013-07-02 13:59:04 UTC
Created attachment 81884 [details]
Problematic ods

When I open or save the ods is very slow, but when I modify any value and I select a sheet with charts goes extremly slow, and the cpu useage is 100% for 20-40 seconds (3,5 GHz i5 CPU)
This is a very simple home use table, and it's almost imposibble to work with it.
Comment 1 m.a.riosv 2013-07-03 01:36:56 UTC
Reproducible with:
Win7x64Ultimate
Version 3.6.7.1 (Build ID: 9418c72)
Version 4.0.4.2 (Build ID: 9e9821abd0ffdbc09cd8c52eaa574fa09eb08f2)
Version 4.1.0.1 Build ID: 1b3956717a60d6ac35b133d7b0a0f5eb55e9155
Version 4.2.0.0.alpha0+Build ID: 9bd0a976017fadf4344cbf6746abd395dfcc3cb7
                TinderBox: Win-x86@6, Branch:master, Time: 2013-06-29_22:42:02

Version 3.5.7.2 can not open the file
Comment 2 Ákos 2013-07-17 09:18:45 UTC
I test with Apache OpenOffice 4.0 RC1 and LibreOffice 4.1 RC2 the open time, in the same computer:
AOO 4.0 RC1:
Open: 52 seconds
Change to Napok sheet: 26 seconds
Total: 1 minute 18 seconds

LO 4.1 RC2:
Open: 34 seconds
Change to Napok sheet: 41 seconds
Total: 1 minute 15 seconds
Comment 3 Ákos 2014-01-23 16:54:33 UTC
I try the file opening on:
CPU: AMD Kaveri A10-K7700 (8 GB DDR3-1333, SSD)
LibreOffice 4.2.0 RC2:
Open: 20 seconds
Change to Napok sheet: +30 seconds
Total: 50 seconds
Comment 4 Ákos 2014-01-24 11:38:11 UTC
Test on another computer:
CPU: Intel Core i5-3550 (8 GB DDR3-1333, SSD)
LibreOffice 4.2.0 RC2:
Open: 15 seconds
Change to Napok sheet: +20 seconds
Total: 35 seconds

Interesting, because calc theoretically is optimized for AMD Kaveri HSA architecture, and on same cpu clock (3,4 GHz, and 3,7 GHz turbo core) lost with 30%.
Comment 5 retired 2014-01-24 12:30:09 UTC
Confirmed:4.2.0.3:OSX

Trying to save that file I get CPU spikes and LO does not respond for a while.

http://opendocumentfellowship.com/validator shows several glitches with this document:

Output of validation test file 'slow.ods'
This file is not Valid OpenDocument 1.0!

    error

    in content.xml
    /www/TOM/_temp/Z446eLXRo1AcNv69midWrxYw9wJv1mmU/content.xml:0: Relax-NG validity error : Expecting element table-header-rows, got table-row
    /www/TOM/_temp/Z446eLXRo1AcNv69midWrxYw9wJv1mmU/content.xml:0: Relax-NG validity error : Element table failed to validate content
    /www/TOM/_temp/Z446eLXRo1AcNv69midWrxYw9wJv1mmU/content.xml:0: Relax-NG validity error : Invalid attribute value-type for element table-cell
    /www/TOM/_temp/Z446eLXRo1AcNv69midWrxYw9wJv1mmU/content.xml:0: Relax-NG validity error : Expecting element table-header-rows, got table-row

So if this document is flakey maybe this is expected behavior.

Akos: Hm, you could try creating a new spreadsheed and copy pasting data over. Then re-save. Does then the issue persist? That way we could make sure the document is not corrupted.
Comment 6 Ákos 2014-01-24 13:41:01 UTC
Created attachment 92727 [details]
New, clear file, I copied only the content in it
Comment 7 Ákos 2014-01-24 13:49:44 UTC
I create a new file, and copy the content from the old in it, I was recreate all graphs, but after validation I got same errors with new file.
And retest the opening times:

CPU: AMD Kaveri A10-K7700 (8 GB DDR3-1333, SSD)
LibreOffice 4.2.0 RC2 - windows 7 x64:
Open: 8 seconds
Change to Napok sheet: +42 seconds
Total: 50 seconds
Save the file: !!!!!65 seconds

CPU: Intel Core i5-3550 (8 GB DDR3-1333, SSD)
LibreOffice 4.2.0 RC2 - windows 7 x64:
Open: 5 seconds
Change to Napok sheet: +30 seconds
Total: 35 seconds
Save the file: 17 seconds


On AMD Kaveri arhitecture the file save time is increased dramatically, and when I move the mouse over the toolbar the CPU usage for 3 seconds is 100%.
I try to reproduce this performance bug under linux.
Comment 8 Ákos 2014-01-24 18:59:43 UTC
I make the same test on AMD Kaveri A10-7700K under Ubuntu 13.10 amd64 linux, native install on SSD:

CPU: AMD Kaveri A10-K7700 (8 GB DDR3-1600, SSD)
LibreOffice 4.2.0 RC3:
Open: 7 seconds
Change to Napok sheet: +35 seconds
Total: 42 seconds
Save the file: 50 seconds

I try with origilanlly installed LO: 4.1.3:
Open: 6 seconds
Change to Napok sheet: +37 seconds
Total: 43 seconds
Save the file: 60 seconds
Comment 9 Ákos 2014-02-04 08:24:43 UTC
Retest the newly created file, with LibreOffice 4.2.0.4 final release:

AMD Kaveri A10-7700K (8 GB DDR3-1600, SSD)
Windows 7 sp1 x64:
LibreOffice 4.2.0.4:
Open: 9 seconds
Change to Napok sheet: 42 seconds
Save and close the file: 64 seconds !!!!!

AMD Kaveri A10-7700K (8 GB DDR3-1600, SSD)
Linux - Ubuntu 13.10 x64:
Open: 7 seconds
Change to Napok sheet: 38 seconds
Save and close the file: 60 seconds !!!!!

Intel i5-3550 (8 GB DDR3-1333, SSD)
Windows 7 sp1 x64:
LibreOffice 4.2.0.4:
Open: 6 seconds
Change to Napok sheet: 28 seconds
Save and close the file: 44 seconds

Intel i5-3550 (8 GB DDR3-1333, SSD)
Linux - Ubuntu 13.10 x64:
Open: 5 seconds
Change to Napok sheet: 27 seconds
Save and close the file: 38 seconds
Comment 10 Michael Meeks 2014-02-04 13:00:10 UTC
Hi Ákos - first thanks for your review and data here; we're looking into this. A quick query - have you enabled the OpenCL support which takes advantage of OpenCL for calculation ? Tools->Options->LibreOffice Calc->Formula->Custom->Details - and enable OpenCL support.
It'd also be interesting to know which OpenCL backend you are using, the driver versions installed etc. =) Naturally GPU acceleration doesn't have a massive impact on load/save time - and recent optimisation there was focused on XLSX.
Anyhow - looking forward to your feedback.
Thanks for caring about performance & testing !
Comment 11 Ákos 2014-02-04 13:56:52 UTC
Hi,
I don't enable the OpenGL support.
Now I enable it, and retest the three steps, and I got the same results.

In my opinion this performance problem results from formula calculation speed or from chart draw/calculation speed, because the spreadsheet is relatively small and not very complex.
Comment 12 Rene Engelhard 2014-02-04 14:00:23 UTC
Akos: Open*C*L, not OpenGL :) (C instead of G)
Comment 13 Michael Meeks 2014-02-18 22:11:31 UTC
Up-loaded a callgrind profile of the 4-2-0 branch here:

http://users.freedesktop.org/~michael/fdo-66507-load-switch-save-37376041d4c30588fc577c20ec91a14e819d439a.txt.gz

For a load / switch to the difficult tab / and save as ODS; help targetting the identified functions appreciated =)
Comment 14 Michael Meeks 2014-02-19 15:54:46 UTC
It is curious that switching sheets to Napokra -appears- to trigger re-calculation of the chart's source ranges each time; no idea why that would be unless there is some always-re-calc setting being provoked there. The children of TabBar::Select look interesting in that regard; it eventually calls:  ScChart2DataSequence::BuildDataCache() which seems to trigger a lot of the calculation: though why it should do that repeatedly is not immediately clear to me: there are no macros in that sheet to dirty anything.

Ultimately, it seems that the sheer number of SUMPRODUCTS over large ranges containing unpleasantly branchy 'YEAR', 'WEEKNUM', 'WEEKDAY' etc. methods makes the calculation somewhat slow.
Comment 15 m.a.riosv 2014-02-19 16:03:10 UTC
I think there were some work involving this functions, about issues on calculations with empty cells.
https://bugs.freedesktop.org/show_bug.cgi?id=74479
Comment 16 m.a.riosv 2014-02-20 03:33:51 UTC
*** Bug 75194 has been marked as a duplicate of this bug. ***
Comment 17 Commit Notification 2014-02-20 21:28:17 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ae838de0a3840e0fd490db5447bd6d481616e9d7

fdo#66507 - accelerate date calculations as used vs. empty cells.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 18 Michael Meeks 2014-02-21 11:19:42 UTC
Just pushed a speedup to master; one of the problems is optimising sumproduct with these nested week / day functions; an initial attempt noticing that we do:

 747011 calls to getMonth for 0 and getYear for 0

ie. for all the blank cells (the sumproduct is over a big range with ~nothing in it) - I shortcut the Date += operator to avoid quite some work; ~2 lines of code.

That gives a 12% win on switching to the slow sheet (forcing calculate) and some real but smaller win on load time.

Clearly there is plenty more to do here still =)
Comment 19 Ákos 2014-03-20 14:39:44 UTC
Retest with LibreOffice 4.2.2.1

AMD Kaveri A10-7700K (8 GB DDR3-1600, SSD)
Windows 7 sp1 x64:
LibreOffice 4.2.0.4:
Open: 9 seconds
Change to Napok sheet: 42 seconds
Save and close the file: 26 seconds (before: 64s)

Intel i5-3550 (8 GB DDR3-1333, SSD)
Windows 7 sp1 x64:
LibreOffice 4.2.2.1:
Open: 6 seconds
Change to Napok sheet: 28 seconds
Save and close the file: 18 seconds (before: 44s)

The save and close time changed dramaticaly.
After change to Napok sheet and make any blank operation (select A1 -blank cell- and push Delete button, I need to wait anatother 42 vs 28 seconds)
Comment 20 Ákos 2014-07-08 09:22:01 UTC
Retest with LibreOffice 4.3.0.2

AMD Kaveri A10-7700K (8 GB DDR3-1600, SSD)
Windows 7 sp1 x64:
LibreOffice 4.3.0.2:
Open: 13 seconds
Change to Napok sheet: 40 seconds
Save and close the file: 26 seconds

The change to Napok sheet time, is still a problem, and if I want to edit anything in the sheet, I need to wait 30-40 seconds too.
Comment 21 Michael Meeks 2014-07-08 10:50:12 UTC
The DateToDays calls that take 10% (39bn cycles) are ~all of the same value (at least for me):

2365774 	DateToDays 30 12 1899
      1 	DateToDays 8 7 2014

I've added a shortcut case there to save that - not a stunning win, but something at least.
Comment 22 Commit Notification 2014-07-08 13:33:34 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a2b44216f1b1e8d7f4f293e13b257f49ae13de61

fdo#66507 - accelerate common datum date conversion to days.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 23 Ákos 2014-09-22 12:20:11 UTC
I retest with 4.4.0.0.alpha0 22-Sep-2014 02:48 build: 935e8fc98c033680029e4531747a2f680f50d5ca
The results on my Intel computer is aprox the same.

CPU: Intel Core i5-3550 (8 GB DDR3-1333, SSD)

Open: 5 seconds
Change to Napok sheet: 26 seconds
Change one cell in Napok Sheet: 25 seconds
Save and close the file: 16 seconds
Comment 24 Ákos 2014-11-10 10:42:21 UTC
I retest with 4.4.0.0.alpha2 10-Nov-2014 02:48 build: 24f0a5815f581dd9a7f09d30213a379edee6e9ac
The results on my Intel computer is:

CPU: Intel Core i5-3550 (8 GB DDR3-1333, SSD)

Open: 5 seconds
Change to Napok sheet: 22 seconds (-3s)
Change one cell in Napok Sheet: 9 seconds (-16s)
Save and close the file: 30 seconds (+14s)
Comment 25 Ákos 2015-02-02 15:08:26 UTC
I retest with 4.4.0.3 build: de093506bcdc5fafd9023ee680b8c60e3e0645d7
The results on my Intel computer is:

CPU: Intel Core i5-3550 (8 GB DDR3-1333, SSD)

Open: 33 seconds  (!!! +27s)
Change to Napok sheet: 22 seconds
Change one cell in Napok Sheet: 9 seconds
Save and close the file: 27 seconds (-3s)

The open time is increased dramatically.
Comment 26 Ákos 2015-06-04 07:26:36 UTC
I retest with 5.0.0.0.beta1, 32 bit
The results on my Intel computer is:

CPU: Intel Core i5-4590 - I CHANGE MY CPU - (8 GB DDR3-1333, SSD)

Open: 6 seconds
Change to Napok sheet: 22 seconds
Change one cell in Napok Sheet: 9 seconds
Save and close the file: 26 seconds

You fix the regression created between 4.4.0.0.alpha2 and 4.4.0.3, but the rest is the same.
Comment 27 Michael Meeks 2015-06-04 12:12:30 UTC
Hi Alex,

> Change to Napok sheet: 22 seconds
>
> You fix the regression created between 4.4.0.0.alpha2 and 4.4.0.3,
> but the rest is the same.

Great - it's good to be no worse. Ultimately this spreadsheet is a bit of an issue computationally ... and the implementation of the relevant date functions can be improved a lot if someone is interested in that.

For me; I don't really see doing more work on this now that the regression is solved. What is needed next is a good new kcachegrind profile of the sheet switching / re-calculation I guess - done on a build with debugging symbols, but no extraneous dbgutil code to get a view of where the next problems are.

Until we have that - this won't move forward - any takers ? =)
Comment 28 Ákos 2015-12-28 16:05:43 UTC
I retest with 5.1.0.1, 32 bit

CPU: Intel Core i5-4590

Open: 6 seconds
Change to Napok sheet: 22 seconds
Change one cell in Napok Sheet: 10 seconds
Save and close the file: 26 seconds

Same like in LO 5.0.0.
Comment 29 m.a.riosv 2016-04-09 01:02:56 UTC
Testing the file against 5.1, I think the problem is not related to the charts, but the lot of SUMPRODUCT() functions:
- more than ten thousand.
- with several ranges every one for conditions comparison.
- a lot of them with large ranges (five thousand cells).
- the majority of them with the INDIRECT() function for every range.

If I'm wrong INDIRECT() it's a volatile function, so it is recalculated every time a cell it's calculated, what makes even worse the situation.

So for me there it's not too much that can be done without change the design.

Maybe with all data in one sheet, so INDIRECT() can be eliminated, and perhaps using a pivot table to shorten the data.
Comment 30 Ákos 2016-04-11 07:38:22 UTC
Created attachment 124242 [details]
Same with original file, but without indirect functions

After I eliminate all the indirect functions (delete all sheets that use it):

Open: 23 seconds
Change to Napok sheet: 1 seconds
Change one cell in Napok Sheet: 15 seconds
Save and close the file: 13 seconds

Almost the same worst results.
Comment 31 Michael Meeks 2016-04-11 08:10:26 UTC
I guess it might be possible to accelerate the time & date functions but it is a ton of engineering work that I can't fund =)

I suppose getting another cachegrind profile of the latest re-calculate is a pre-requisite for doing anything here; which is something ~anyone can do:

valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./soffice.bin --splash-pipe=0 $*

etc. =) but the world + wife has: "make my spreadsheet calculate faster" style bugs.
Comment 32 Ákos 2016-04-12 08:18:01 UTC
Created attachment 124270 [details]
same file in ms excel
Comment 33 Ákos 2016-04-12 08:18:14 UTC
I make some tests compared with Microsoft Excel 2016 and you are right.
The problem is the huge number of calculations.
Excel is a little faster, because use all my cpu cores, but probably LO will use all cores too, in short time...

I don't imagine before, that a relatively simple financial analysis result this huge calculation time...

Time comparation with second, cleared table:
LibreOffice 5.1.2.2 32 bit, windows 10
Open: 23 seconds
Change to Napok sheet: instant
Change one cell in Napok Sheet: 9 seconds
Save and close the file: 16 seconds

Microsoft Excel 2016 64 bit, windows 10
Open: 6 seconds
Change to Napok sheet: instant
Change one cell in Napok Sheet: 5 seconds
Save and close the file: 6 seconds

In open time and save and close time, remain the huge difference between LibreOffice and Microsoft Excel, but without multi threading support, I thing, that is normal.

I close the bug, because is not a bug.
Comment 34 m.a.riosv 2016-04-12 20:53:30 UTC
Created attachment 124287 [details]
Deleting charts and adding a Pivot Table with a chart.

Introducing in your last sample a value in Napokra.A26 takes about 24 seconds, even that cell it's not part of any chart.

Doing the same after deleting the charts in the sheet takes about 2-3 seconds.

Moving horizontally through Napokra sheet with the charts in the window it's really slow, specially when a chart begins to be visible.

With my sample, where I deleted the charts and added three columns on m1 to have the values for insert a pivot table avoiding the formulas.

Introducing a value in the Pivot table sheet with a new chart, has no delay, but the problem of move horizontally remains.

It's like if charts update has high priority. Similar seems to happen with conditional formatting.

So I think there is a bug on the charts with ranges over heavy formulas.