Bug 96290 - Calc doesn't answer for several minutes when searching and replacing
Summary: Calc doesn't answer for several minutes when searching and replacing
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
(earliest affected) release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Keywords: perf
Depends on:
Blocks: Find-Search
Reported: 2015-12-06 14:39 UTC by jean-baptiste
Modified: 2021-08-28 15:21 UTC
CC List: 3 users

See Also:
Crash report or crash signature:
Regression By:

The csv file who kills search and replace (685.70 KB, text/csv)
2015-12-06 14:39 UTC, jean-baptiste

Description jean-baptiste 2015-12-06 14:39:19 UTC
Created attachment 121081 [details]
The csv file who kills search and replace


I created a CSV file with the Fedora translation status and had to replace "." with "," so that LibreOffice understands the values are numbers to use in a pivot table. I use the French locale.

I did the search and replace by selecting columns D to HO. It took several minutes to complete. GNOME told me LibreOffice was not responding anymore. If you ignore the warning, LibreOffice comes back.

Selecting only the range that contains data gives the same result: several minutes without a response.
Memory usage looks OK, but CPU usage is 100% on one core (I have 4 available).

I assume it's a bug.

I use the latest version from the Fedora repositories: https://bodhi.fedoraproject.org/updates/?packages=libreoffice

The document is attached to this bug.
I'd be glad to help with testing.
Comment 1 m.a.riosv 2015-12-07 01:15:58 UTC
Hi @jean, thanks for reporting.

With Win10x64
Version: (x64) Build ID: 2def61bcbb29a7a8611b833682fe1291910b11ad-GL

It takes about a couple of minutes, which seems too long. The strange thing is that the replace itself seems to be done quickly, but it then takes a long time to finish.

Anyway, you can import the data with the correct decimal separator in two ways:

- In the import dialog, select English (UK) as the language.
- Click the header of the first column with data, then hold the Shift key and click the header of the last column with data. After that, select English (US) under Column type.
Comment 2 QA Administrators 2017-01-03 19:41:50 UTC Comment hidden (obsolete)
Comment 3 jean-baptiste 2017-01-03 22:26:26 UTC
I confirm this bug still exists with Fedora 25 and LibreOffice 5.2:
Build ID:
CPU Threads: 8; OS Version: Linux 4.8; UI Render: default;
Locale: en-US (fr_FR.UTF-8); Calc: group

If you do not click anywhere, it's only a performance issue; it doesn't crash.
If you click around a bit on the screen or the content, the software crashes.
Comment 4 Thomas Woltjer 2017-02-13 14:53:07 UTC
Continues to be a problem on Manjaro.
Comment 5 QA Administrators 2018-02-14 03:37:11 UTC Comment hidden (obsolete)
Comment 6 Timur 2019-11-25 08:35:26 UTC
This was reported on Linux, but it also affects Windows.
I see Calc was very slow with LO 5.0, but it's better with current LO; I tested master 6.5+. Search Results appears very fast, but I still get some "not responding" when I try to close it.

Slow replace has been reported many times; I put some of those reports in See Also.
The general impression is that it was fast before (like LO 3.5), then very slow (like LO 4.2), and that it's somewhat slow now, as written in https://bugs.documentfoundation.org/show_bug.cgi?id=83141#c12
Since 6.3, Search Results can be disabled, but doing so doesn't help in this case. As explained, Search Results is shown quickly and the sluggishness comes afterwards.

Maybe we can keep this open, to find out where the "not responding" comes from.
Comment 7 Andreas Heinisch 2021-08-28 12:21:34 UTC
The problem arises in [1], where all the found cells are marked, which takes time for 6974 cells. Maybe we could limit the selection, which should improve performance?

[1] https://opengrok.libreoffice.org/xref/core/sc/source/ui/view/viewfun2.cxx?r=ef38b9af#2030
Comment 8 Andreas Heinisch 2021-08-28 15:21:27 UTC
Imho, in both cases (Bug 96290 and Bug 123461) the culprit is ScRangeList::Join, which tries to join the marked ranges. It holds a list of ranges and checks whether the found range can be joined with one of them; otherwise it adds the newly created range and continues to join the remaining ranges from the search result.

The internal data structure is a vector and the data is as follows:
[0] = {aStart={nRow=1 nCol=1 nTab=0 } aEnd={nRow=2 nCol=1 nTab=0 } }
[1] = {aStart={nRow=4 nCol=1 nTab=0 } aEnd={nRow=6 nCol=1 nTab=0 } }
[2] = {aStart={nRow=9 nCol=1 nTab=0 } aEnd={nRow=15 nCol=1 nTab=0 } }
[3] = {aStart={nRow=18 nCol=1 nTab=0 } aEnd={nRow=20 nCol=1 nTab=0 } }
[4] = {aStart={nRow=22 nCol=1 nTab=0 } aEnd={nRow=24 nCol=1 nTab=0 } }
[5] = {aStart={nRow=26 nCol=1 nTab=0 } aEnd={nRow=26 nCol=1 nTab=0 } }
[6] = {aStart={nRow=28 nCol=1 nTab=0 } aEnd={nRow=29 nCol=1 nTab=0 } }
[7] = {aStart={nRow=32 nCol=1 nTab=0 } aEnd={nRow=35 nCol=1 nTab=0 } }
[8] = {aStart={nRow=37 nCol=1 nTab=0 } aEnd={nRow=37 nCol=1 nTab=0 } }
[9] = {aStart={nRow=40 nCol=1 nTab=0 } aEnd={nRow=42 nCol=1 nTab=0 } }
[10] = {aStart={nRow=44 nCol=1 nTab=0 } aEnd={nRow=55 nCol=1 nTab=0 } }

The algorithm always loops over all ranges to check whether a range can be joined or not. A new range may look like the following:
rNewRange = {aStart={nRow=66 nCol=14 nTab=0 } aEnd={nRow=67 nCol=14 nTab=0 } }

In the end it gets even worse, because if a range can be joined, the function tries to join the newly created range as well. In this case, the list contains about 6974 ranges, which leads to this performance issue.
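
To illustrate why this scales badly, here is a minimal sketch of a join over a plain vector. It is not the actual ScRangeList::Join code: the Range struct, canJoin, naiveJoin and countComparisons are hypothetical names, and ranges are simplified to a row span within a single column. Every new range is compared against the whole vector, and after a successful merge the scan restarts with the merged range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a cell range: one column, one row span.
struct Range {
    int col;
    int startRow;
    int endRow;
};

// True if a and b are in the same column and overlap or touch vertically.
static bool canJoin(const Range& a, const Range& b) {
    return a.col == b.col
        && a.startRow <= b.endRow + 1
        && b.startRow <= a.endRow + 1;
}

// Naive join: scan the whole vector for a partner. On a hit, remove the
// partner, merge it into r, and rescan with the merged range.
// Returns the number of range comparisons performed.
std::size_t naiveJoin(std::vector<Range>& list, Range r) {
    assert(r.startRow <= r.endRow);
    std::size_t comparisons = 0;
    bool merged = true;
    while (merged) {
        merged = false;
        for (std::size_t i = 0; i < list.size(); ++i) {
            ++comparisons;
            if (canJoin(list[i], r)) {
                r.startRow = std::min(r.startRow, list[i].startRow);
                r.endRow = std::max(r.endRow, list[i].endRow);
                list.erase(list.begin() + i);
                merged = true;
                break;
            }
        }
    }
    list.push_back(r);
    return comparisons;
}

// Joins n isolated single-cell ranges (every second row of column 1) and
// returns the total number of comparisons, which is n * (n - 1) / 2 here.
std::size_t countComparisons(int n) {
    std::vector<Range> list;
    std::size_t total = 0;
    for (int i = 0; i < n; ++i)
        total += naiveJoin(list, Range{1, 2 * i, 2 * i});
    return total;
}
```

In this simplified model, countComparisons(6974) performs roughly 24 million comparisons when none of the ranges can be joined, and each successful merge adds another vector erase plus a rescan on top of that. The real function does more work per range, so this only shows the growth pattern.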

Imho, a vector of ranges may be the wrong data structure when fast access is needed.

So either we show only around 1000 marked ranges, or we have to think about a better algorithm which can join ranges faster :(
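
One possible direction for the second option, as a rough sketch only: keep the row spans of each column in an ordered map keyed by start row, so that the neighbours of a new range are found with a lookup instead of a full scan. ColumnRangeSet and its members are hypothetical names, not existing LibreOffice classes, and the sketch only merges vertically within a column, whereas the real join may also have to merge ranges across columns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical container: for each column, an ordered map startRow -> endRow
// of row spans that neither overlap nor touch each other.
class ColumnRangeSet {
public:
    // Adds rows [startRow, endRow] of column col and merges the new span with
    // every existing span it overlaps or touches. The lookup is logarithmic;
    // the loop only visits spans that are actually absorbed.
    void join(int col, int startRow, int endRow) {
        assert(startRow <= endRow);
        std::map<int, int>& spans = columns_[col];
        // First span that starts after startRow.
        auto it = spans.upper_bound(startRow);
        // The span before it may still reach up to (or next to) startRow.
        if (it != spans.begin()) {
            auto prev = std::prev(it);
            if (prev->second + 1 >= startRow)
                it = prev;
        }
        // Absorb all spans that overlap or touch the growing span.
        while (it != spans.end() && it->first <= endRow + 1) {
            startRow = std::min(startRow, it->first);
            endRow = std::max(endRow, it->second);
            it = spans.erase(it);
        }
        spans[startRow] = endRow;
    }

    // Number of disjoint spans currently stored for a column.
    std::size_t spanCount(int col) const {
        auto found = columns_.find(col);
        return found == columns_.end() ? 0 : found->second.size();
    }

    // End row of the span starting at startRow, or -1 if there is none.
    int endOf(int col, int startRow) const {
        auto found = columns_.find(col);
        if (found == columns_.end())
            return -1;
        auto span = found->second.find(startRow);
        return span == found->second.end() ? -1 : span->second;
    }

private:
    std::map<int, std::map<int, int>> columns_;
};
```

With the data from above, joining rows 1-2 and 4-6 of column 1 gives two spans, and joining row 3 afterwards collapses them into a single span 1-6 without touching any unrelated entry. Whether such a structure fits the other users of the range list is an open question.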