Bug 128812 - LibreOffice - calc - specific xlsx document (almost) any operation (FILEOPEN, FORMATTING) takes about 1 min
Summary: LibreOffice - calc - specific xlsx document (almost) any operation (FILEOPEN,...
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
6.3.2.2 release
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:6.5.0 target:6.4.0.1 target:6.3.4
Keywords: bibisectRequest, perf, regression
Depends on:
Blocks:
 
Reported: 2019-11-15 04:05 UTC by Robert
Modified: 2019-12-18 12:42 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
MS Excel file that causes some trobules at Calc (186.52 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2019-11-17 18:44 UTC, Robert
Details
Flamegraph (when opening file) (141.09 KB, application/x-bzip)
2019-11-18 19:31 UTC, Julien Nabet
Details
2nd Flamegraph (when saving after having typed a string in a cell) (73.06 KB, application/x-bzip)
2019-11-18 19:38 UTC, Julien Nabet
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Robert 2019-11-15 04:05:19 UTC
Description:
Problem appears rather for one specific document (which I own - see "Additional Info"). Other documents are processed quite ok. I can deliver the file to you if this would help, just write me an e-mail.

Steps to Reproduce:
1. Just open document
2. any changes inside document such editing, filters customizastion etc. 
3.

Actual Results:
any operation takes about 1 min.

Expected Results:
imediate reaction


Reproducible: Always


User Profile Reset: No


OpenGL enabled: Yes

Additional Info:
Preconditions
there is a document that was created in MS Excel (xlsx)
It looks to be that there are special features of the document to be used to get the efect. The most likely one is "automatic color backgrund gradient based on values of set of cells"
Environment:
CPU: i5-8300H
Drive: SSD -M.2 - ~1TB/s
RAM: 12GB
PC: Lenovo IdeaPad 330-15
Comment 1 Julien Nabet 2019-11-17 10:14:43 UTC
2 things you can try:
1) upgrade to 6.3.3
2) Take a look at https://wiki.documentfoundation.org/QA/FirstSteps

If you got Excel, you can try to sanitize it (see https://wiki.documentfoundation.org/QA/Bugzilla/Sanitizing_Files_Before_Submission) so you can attach the file (which will be available for anyone).
Comment 2 Robert 2019-11-17 18:44:09 UTC
Created attachment 155900 [details]
MS Excel file that causes some trobules at Calc

Attached file includes own formulas that helps me to manage my runs in Diablo 2. In Column "AO" (orange cells) I enter number of runs which I own. Then I can see what sets of runs I can use for crafting unique items. Language is Polish as this is my native language. 
I tryied sanitize the file with Calc. After this I saved it with another name as *.xlsx file. It's weird but then file started opening as fast as expected. But there was still an issue with modyfing "AO" column's cells. 
Eventually I attach original file that I recognize as problematic.

Regarding hins:
1) latest version of LO (Calc) is installed at my PC - see below info about version
2) information from the page, have not found useful to me.

Wersja: 6.3.3.2 (x86)
Build ID: a64200df03143b798afd1ec74a12ab50359878ed
Wątki CPU: 8; OS:Windows 10.0; UI render:GL; VCL: win; 
Ustawienia regionalne: pl-PL (pl_PL); Język interfejsu: pl-PL
Calc: threaded
Comment 3 Julien Nabet 2019-11-17 19:53:07 UTC
On pc Debian x86-64 with master sources updated today, I don't even succeed in opening the file after 30s.
Comment 4 Robert 2019-11-17 20:07:58 UTC
The old PC that I used to create this file was with MS Office 2003 ( I guess. I can check later if this is a matter). What is weird that only that one document causes troubles. Others looks to be fine.
Beside an issue with (conditional) formationg, another issues what I suspec might be the problem is that I copied a lot of content of the file from the web. So it might be that there are some hyperlinks that were copied from Firefox to Excel document. No more ideas/clues on my mind.
Comment 5 Julien Nabet 2019-11-18 19:31:51 UTC
Created attachment 155930 [details]
Flamegraph (when opening file)

On pc Debian x86-64 with master sources updated today (without enable-dbgutil), I don't reproduce this but it's still a bit long.
So here's a first Flamegraph (tool to show a perf graph)
Comment 6 Julien Nabet 2019-11-18 19:38:09 UTC
Created attachment 155931 [details]
2nd Flamegraph (when saving after having typed a string in a cell)
Comment 7 Julien Nabet 2019-11-18 20:52:38 UTC
Eike: I took a look at the first Flamegraph and noticed that GetThis called rEntry.GetQueryItem()
(see https://opengrok.libreoffice.org/xref/core/sc/source/core/data/dociter.cxx?r=c2d8341e#1109).
This function resizes the vector maQueryItems to 1
(see https://opengrok.libreoffice.org/xref/core/sc/source/core/tool/queryentry.cxx?r=a6a5064c#118)

but line 1154 still in GetThis, we got:
...rEntry.GetQueryItems().size() == 1...
GetQueryItems returns also maQueryItems
(see https://opengrok.libreoffice.org/xref/core/sc/inc/queryentry.hxx?r=f8edef39#66)

A bit weird, isn't it?
Comment 8 Xisco Faulí 2019-11-19 12:24:09 UTC
it takes

real	2m39,881s
user	2m38,866s
sys	0m0,775s


in

Version: 6.4.0.0.beta1+
Build ID: 1987c98926a85a483a32ea78e460e563a6ea4705
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded
Comment 9 Xisco Faulí 2019-11-19 12:34:46 UTC
it takes in

real	0m13,288s
user	0m10,912s
sys	0m1,088s

Version: 5.2.0.0.alpha1+
Build ID: 5b168b3fa568e48e795234dc5fa454bf24c9805e
CPU Threads: 4; OS Version: Linux 4.15; UI Render: default; 
Locale: ca-ES (ca_ES.UTF-8)
Comment 10 Xisco Faulí 2019-11-19 12:35:23 UTC
@Noel, I thought you might be interested in this issue...
Comment 11 Eike Rathke 2019-11-19 14:32:45 UTC
(In reply to Julien Nabet from comment #7)
> A bit weird, isn't it?
Yeah looks a bit odd. A better approach probably would be to let places that expect a single ScQueryEntry::Item in an ScQueryEntry already force that in a strategic place (i.e. here in the ScQueryCellIterator ctor) and remove the check and forcing 1 element from ScQueryEntry::GetQueryItem(), which would need inspecting all places that use it first. Knowing that there is exactly one item anyway the if (rItem.mbMatchEmpty...) could omit the extra check of rEntry.GetQueryItems().size() == 1. Altogether eliminating two size() calls per call that could squeeze those 3% of ScQueryCellIterator::GetThis().
Comment 12 Julien Nabet 2019-11-20 09:20:53 UTC
(In reply to Eike Rathke from comment #11)
> (In reply to Julien Nabet from comment #7)
> > A bit weird, isn't it?
> Yeah looks a bit odd. A better approach probably would be to let places that
> expect a single ScQueryEntry::Item in an ScQueryEntry already force that in
> a strategic place (i.e. here in the ScQueryCellIterator ctor) and remove the
> check and forcing 1 element from ScQueryEntry::GetQueryItem(), which would
> need inspecting all places that use it first. Knowing that there is exactly
> one item anyway the if (rItem.mbMatchEmpty...) could omit the extra check of
> rEntry.GetQueryItems().size() == 1. Altogether eliminating two size() calls
> per call that could squeeze those 3% of ScQueryCellIterator::GetThis().

Thank you for your detailed feedback. It's not an easyhack (at least for me) then and it concerns only 3% anyway.
Can't help here => uncc myself.
Comment 13 Commit Notification 2019-11-20 14:04:33 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d468958331f36310d11265ba55d7c27366ab58ab

tdf#128812 speed up loading calc doc with lots of countif

It will be available in 6.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Xisco Faulí 2019-11-20 14:29:15 UTC
it takes

real	0m9,415s
user	0m9,206s
sys	0m0,210s

in

Version: 6.5.0.0.alpha0+
Build ID: d468958331f36310d11265ba55d7c27366ab58ab
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded

Fantastic!
@Noel, thanks for fixing this issue!!
Comment 15 Commit Notification 2019-11-20 22:49:13 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-6-4":

https://git.libreoffice.org/core/commit/e073f996c4ec2582b9560e2fac828c9a73358423

tdf#128812 speed up loading calc doc with lots of countif

It will be available in 6.4.0.1.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 16 Commit Notification 2019-11-21 17:28:38 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/commit/ad7dc47c3d604f5a130e36eedc8ce266fcd84669

tdf#128812 speed up loading calc doc with lots of countif

It will be available in 6.3.4.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 17 Robert 2019-12-18 11:17:50 UTC
(In reply to Commit Notification from comment #16)
> Noel Grandin committed a patch related to this issue.
> It has been pushed to "libreoffice-6-3":
> 
> https://git.libreoffice.org/core/commit/
> ad7dc47c3d604f5a130e36eedc8ce266fcd84669
> 
> tdf#128812 speed up loading calc doc with lots of countif
> 
> It will be available in 6.3.4.
> 
> The patch should be included in the daily builds available at
> https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
> information about daily builds can be found at:
> https://wiki.documentfoundation.org/Testing_Daily_Builds
> 
> Affected users are encouraged to test the fix and report feedback.

Hello,

I checked if fix works. There is huge improvment regardin opening document that I attached. Using the file is still slow (10-15 seconds after updating some cells or switching filters). It is better than in previous version but still usage if it is "uncomfortable" ;-) Thus I reopened the ticket.
Comment 18 Xisco Faulí 2019-12-18 12:42:00 UTC
(In reply to Robert from comment #17)
> (In reply to Commit Notification from comment #16)
> > Noel Grandin committed a patch related to this issue.
> > It has been pushed to "libreoffice-6-3":
> > 
> > https://git.libreoffice.org/core/commit/
> > ad7dc47c3d604f5a130e36eedc8ce266fcd84669
> > 
> > tdf#128812 speed up loading calc doc with lots of countif
> > 
> > It will be available in 6.3.4.
> > 
> > The patch should be included in the daily builds available at
> > https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
> > information about daily builds can be found at:
> > https://wiki.documentfoundation.org/Testing_Daily_Builds
> > 
> > Affected users are encouraged to test the fix and report feedback.
> 
> Hello,
> 
> I checked if fix works. There is huge improvment regardin opening document
> that I attached. Using the file is still slow (10-15 seconds after updating
> some cells or switching filters). It is better than in previous version but
> still usage if it is "uncomfortable" ;-) Thus I reopened the ticket.

Hi Robert,
Thanks for checking.
Please fill a follow-up bug.