Bug 115490 - VLOOKUP unsorted is (too) slow in larger row sets, filling all cells in the column with the formula
Summary: VLOOKUP unsorted is (too) slow in larger row sets, filling all cells in the c...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
6.1.0.0.alpha0+
Hardware: All All
: medium normal
Assignee: Dennis Francis
URL:
Whiteboard: target:6.1.0
Keywords: perf
Depends on:
Blocks: Function-Vlookup
  Show dependency treegraph
 
Reported: 2018-02-06 15:24 UTC by Cor Nouws
Modified: 2018-10-12 11:31 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
test file for VLOOKUP performance unsorted (203.31 KB, application/vnd.oasis.opendocument.spreadsheet)
2018-02-06 15:25 UTC, Cor Nouws
Details
callgrind output (1.89 MB, text/plain)
2018-02-06 15:32 UTC, Cor Nouws
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Cor Nouws 2018-02-06 15:24:52 UTC
I'll attach a test file.

Take a lookup in 10.000 columns
Unsorted takes ~40 secs on my laptop. Sorted is done in < 1 sec.
(for reference, in 4.0.6.2, unsorted takes more then 4 minutes.)
One might decide that it is good to sort the lookup range.
Indeed.
However… if a value that is seached for, is not available, then there will still be a result returned (see Help on VLOOKUP, 4th argument, (un)sorted)
To prevent this, one could first check if a matching value is available, by use of VLOOKP unsorted  ..
So no joy in that approach.
Comment 1 Cor Nouws 2018-02-06 15:25:57 UTC
Created attachment 139639 [details]
test file for VLOOKUP performance unsorted
Comment 2 Cor Nouws 2018-02-06 15:27:13 UTC
found in Version: 6.1.0.0.alpha0+
Build ID: c7f74bbab4c666a8b3b865dbd58b3666f1f63052
CPU threads: 4; OS: Linux 4.13; UI render: default; VCL: gtk2; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2018-02-03_00:22:13
Locale: nl-NL (nl_NL.UTF-8); Calc: group
Comment 3 Cor Nouws 2018-02-06 15:32:36 UTC
Created attachment 139640 [details]
callgrind output

callgrind dump attached - stopped the process at roughly 1/5 of the full lookup
Comment 4 Cor Nouws 2018-02-06 15:36:14 UTC
  !
  !
  !
Apologies, I should have mentioned in the description that the time is needed to full the full 10,000 rows in column E with the formula.
  !
  !
Comment 5 Commit Notification 2018-04-06 13:58:58 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=76bc1a81d089d9f66639c118c4063d87e4422684

tdf#115490 - avoid transliteration by using SharedString.

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 6 Michael Meeks 2018-04-10 20:31:57 UTC
With a series of patches this comes down from 40seconds to 4.5seconds or thereabouts on Dennis' machine - lets mark this fixed for now =)

Thanks !
Comment 7 Cor Nouws 2018-10-12 11:31:52 UTC
works even faster now thanks to Luboš recent commit:
https://cgit.freedesktop.org/libreoffice/core/commit/?id=79449d73900d7a9bf061244d76f5f8eecc441198

(though difference between ~4 and ~3 seconds for the test file in this issue, is not that noticeable)