Bug 92749 - VLOOKUP not calculated correctly in the large file from Calc 4.4 (hard recalc helps)
Summary: VLOOKUP not calculated correctly in the large file from Calc 4.4 (hard recalc...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.4.0.2 rc
Hardware: All All
: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:5.1.0 target:4.4.6 target:5.0....
Keywords: regression
Depends on:
Blocks:
 
Reported: 2015-07-15 03:00 UTC by Laurie
Modified: 2016-10-25 19:20 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:


Attachments
Workbook for cost accounting (733.71 KB, application/vnd.oasis.opendocument.spreadsheet)
2015-07-15 03:00 UTC, Laurie
Details
Workbook open in LO and shown what's wrong (227.43 KB, image/jpeg)
2015-07-23 09:32 UTC, Timur
Details
Workbook for Calculating Photocopy/Print Charges (123.81 KB, application/vnd.oasis.opendocument.spreadsheet)
2015-07-27 05:09 UTC, Laurie
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Laurie 2015-07-15 03:00:40 UTC
Created attachment 117241 [details]
Workbook for cost accounting

When a Workbook gets larger than about 730,000 bytes when saved, chained lookups no longer complete in looking up new datasets when a search criteria changes. There is a workaround. Go to Tools/Options/Libre Office Calc/Calculate and change the number of steps. Any change. Even reduce to 2. Closing and re-opening the file brings the problem back. LO 4.1 still opens and calculates this file correctly without the workaround.
Comment 1 Timur 2015-07-15 09:58:41 UTC
Can you please be specific on the expected and obtained result? What exactly is wrong and where?
Can you prepare a minimal test case? Does this happen with a specific file or with all files?
Did you test with a recent versions (4.4.5) and current development versions (5.0)?
Comment 2 Laurie 2015-07-16 00:55:12 UTC
Response to the question from Timur.
On the file I uploaded, Worksheet “Translation” in Column “AB” the first line above the “#N/A” reports the number of user charges which ought to be listed above that line.
When first opened, the count corresponds with the number of lines of user charges and they are looked up correctly.
If I change the selection criteria in cell “AI5” from “1” to “0” the new count appears but the lines above do not report correctly.
All of the lines still appear and the cells within some lines are selecting data from different lines in the source table.
That is the problem.
With “AI5” = “0” these lines should report a subset of the data which was displayed when “AI5” = “1”.

Now alter the number of steps in “Tools – Options – Libre Office – Calculate. Any alteration; even downward.
From then on until the Workbook is closed it recalculates correctly.
That will demonstrate what this set of lookups is supposed to do.
I doubt that it is the number of iterations that is relevant because a downward change works too.
I suspect that something else gets “set” as the change is recorded and that window is closed.
Maybe some other change would have the same effect but this one works for me.

This was not noticed while the file was smaller; I don’t remember how much smaller.
The file grows as new data is added each month.

I have only one L.O. Calc file of anywhere near this size.
I have been using MS_EXCEL extensively but am trying to make the switch.

In L.O. 4.0 this file opens and re-calculates correctly.
The file was created in L.O. 4.0
In 4.4.4.1 and 4.4.4.3 we have the problem.
Where can I get 4.4.5 and/or 5.0? (Assuming that these are versions for Windows systems.)
Comment 3 Timur 2015-07-16 17:05:39 UTC
I see now. Looks like started from 4.4.0.2. Worked OK with 4.3.7.2. Then 4.4 beta 1 wasn't OK but 4.4.0.1 was. Also reproduced in LO 5.1+. Also in Linux.

Hard recalc Ctrl+Shift+F9 helps. 

I'll confirm the bug, but I kindly ask you to verify this is not a duplicate by searching for keywords and testing. Please check Bug 87142. 

Using Separate Install GUI tool from https://flosmind.wordpress.com/si-gui/ you can have all those versions in clean portable mode that will not affect your Windows. 
Personally, I'm using x.y.last for production which means I'll switch to 4.4 with 4.4.6.
Comment 4 Laurie 2015-07-17 07:28:20 UTC
Is the hard recalc ctrl+shift+F9 a toggle in that it remains in effect? It seems to have the same effect as my changing the number of steps did.

I looked at bug 87142. Might be related but I was unable to duplicate this one in a small file like that one shows in.

I am happy with the workarounds.
File submitted in the hope that it may help the development team or somebody else who is having problems.

Thanks.
Laurie.
Comment 5 Joel Madero 2015-07-18 04:41:20 UTC
I'm happy to bibisect this bug but I really hope to see clear and simple steps - not paragraphs (no offense just really hard to follow it). Something as simple as:

1) Change value AI5 from 1 to 0
2) Look at cell.....A#

Expected:
Observed:

I can't follow the comments well enough to feel comfortable bibisecting this as it stands now.
Comment 6 Laurie 2015-07-20 03:12:47 UTC
Cannot make it quite that simple but here goes.
I will use the same Sheet.Cell references used in the formulae. 
To make sure that we are looking at the same data Check that:-
   Print.V3 = 201505
   and Translation.AI3 = 1
If not, change as needed, save, close, and re-open.)

Expected: Translation.AB33 indicates the number of lines of user charges (32) above it. These lines are the "Charges_not_equal_zero lines of the table Summary.A118:N363
All as expected so far because the worksheet has been fully recalculated.

Change Translation.AI3 to 0 (zero).

Expected: Translation.AB28 indicates a lesser number of lines of user charges (27) above it. This is a subset of the previous (32) lines.

Observed: Translation.AB33 indicates the number of lines of user charges which should be above AB28 but there are still 32 lines and the data shown in those lines is not reliable.

Notes: 1) The data in the table in Worksheet = Summary changes when Translation.AI3 is changed.
       2) If Print.V3 is not = 201505 a lot of lines in the Summary and Translation sheets will be different. Print.V3 selects a (year and) month.
Comment 7 Joel Madero 2015-07-20 03:38:27 UTC
I'm still 100% confused so I suggest you do the bibisect (or anyone else who understands the report). To be quite frank - it's very unlikely that a developer will look at this because the sample sheet is so complicated...devs hate complicated examples because it becomes increasingly annoying to find where things go wrong in our 10 million lines of code when there's a bunch of extra garbage to worry about that is 100% irrelevant to the issue. I'm not sure how the sheets interact but the more you cut it down, the more likely a developer will look at it.

If you run Linux I suggest trying to bibisect: https://wiki.documentfoundation.org/QA/Bibisect/Linux
Comment 8 Laurie 2015-07-21 00:30:48 UTC
This will probably be my last comments on this.
1) I can live with the hard recalc.
2) I have found that file size is not important. A 270K file still behaves this way.
3) As a relative newbie to Libre Office Calc I was rather put off when the Workbook which worked fine in version 4.1 gave garbage in 4.5  
4) (for Joel)
   Does a change in the index column of a table referenced by VLOOKUPs trigger re-evaluation of those VLOOKUPs? I suspect the core of the problem is that the values in the index column of the underlying table Summary.A118:M363 are among those which change and that changes to an index column do not flag for re-evaluation of the lookups dependant on them. But this is remedied by changing the number of steps of calculation or by the Ctrl+Shift+F9 combination.
If this is a programming choice to speed re-calculation of the most commonly encountered Workbooks I agree with it. Somebody who is sufficiently technically literate to design a Workbook complex enough to need it is literate enough to apply the simple workaround.
5) Similar wrong results appear when a different month of data is selected by changing Print.V3  Again, the index column of the underlying table in the Summary Sheet is changing.
6) Thanks for the help.
Comment 9 Timur 2015-07-23 09:32:22 UTC
Created attachment 117387 [details]
Workbook open in LO and shown what's wrong

While normally it's expected to have minimal test case prepared, and of course it's better, I think that this report is clear enough. 
I thank Laurie for the report and the explanation. I attach a picture that shows what is wrong.
Comment 10 Timur 2015-07-23 15:27:08 UTC
Not fixed by nice work that Mr. Eike Rathke did in Bug 31577 or Bug 91411. 
I add him in CC. Please feel free to remove.
Comment 11 Eike Rathke 2015-07-24 10:20:08 UTC
Geez.. that will take a week alone to figure out the calculation chain and another 2 weeks to analyse what's going wrong... ;)
Comment 12 Timur 2015-07-24 12:35:16 UTC
Laurie, as I said
> I thank Laurie for the report and the explanation. 
but please try to prepare a minimal test case / file in which this still can be reporoduced.
Comment 13 Laurie 2015-07-27 05:09:48 UTC
Created attachment 117460 [details]
Workbook for Calculating Photocopy/Print Charges
Comment 14 Laurie 2015-07-27 07:21:42 UTC
Workbook reduced to Calculation of Photocopy/Print charges.
This is where the problem was most obvious.
Could probably reduce the file size further but this is probably minimal complexity.

The flow is;-
a) by lookups from the PCopy sheet to the Summary Sheet.
b) Calculations of charges on the Summary Sheet.
The results of these calculations including Column A are affected by the Value in I3 (formerly AI3) of the Translation worksheet.
c) Lookups to the Translation Worksheet eliminate the lines with charge = 0.
These Lookups reference column A of the Summary sheet to locate the next line with a non-zero charge.
If one of these lookups references from a value (say 10) in this column and that value (10) subsequently changes (say to a 9) without triggering a re-evaluation of that lookup that could explain what we are seeing.

So the calculations operate on the results of lookups and other lookups operate on the results from the calculations.
Comment 15 Commit Notification 2015-08-18 09:38:24 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=f7e493229bd949066b4d8984dce7678b8687d1ae

Resolves: tdf#92749 invalidate lookup caches after initial hard recalc

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 16 Eike Rathke 2015-08-18 10:21:11 UTC
Pending review
https://gerrit.libreoffice.org/17829 for 5-0
https://gerrit.libreoffice.org/17830 for 5-0-1
https://gerrit.libreoffice.org/17832 for 4-4
Comment 17 Commit Notification 2015-08-18 11:42:02 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=78c988f3ebbcd84832ca671dfed16ce1664f3bfe&h=libreoffice-4-4

Resolves: tdf#92749 invalidate lookup caches after initial hard recalc

It will be available in 4.4.6.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 18 Commit Notification 2015-08-18 11:42:08 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-5-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6855a8a9a7d44be5a020ef609536b76c2dfe05d2&h=libreoffice-5-0

Resolves: tdf#92749 invalidate lookup caches after initial hard recalc

It will be available in 5.0.2.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 19 Commit Notification 2015-08-18 12:01:49 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1bea8310747b65516f40f6457ab1d174ef7ddce4

introduce temporary hard-recalc state, tdf#92749 follow-up

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 20 Commit Notification 2015-08-20 14:59:56 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-5-0-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=3dee14a0386c67868748b21226ad871a95cef4a0&h=libreoffice-5-0-1

Resolves: tdf#92749 invalidate lookup caches after initial hard recalc

It will be available in 5.0.1.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 21 Commit Notification 2015-10-30 18:30:35 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a2c8358c99d465b8396931fb0bddec0a013031af

tdf#95419 fix performance fall-out, tdf#92749 follow-up

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.