Bug 161592 - MATCH() fails to AutoCalculate depending on formula location / size of lookup array (steps in comment 11)
Summary: MATCH() fails to AutoCalculate depending on formula location / size of lookup...
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
6.0.4.2 release
Hardware: All All
: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks: Cell-Formula
  Show dependency treegraph
 
Reported: 2024-06-15 22:07 UTC by PHolder
Modified: 2024-06-21 07:41 UTC (History)
9 users (show)

See Also:
Crash report or crash signature:


Attachments
Example spreadsheet (46.39 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-06-15 22:07 UTC, PHolder
Details
Sample file to test (55.99 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-06-16 16:01 UTC, m_a_riosv
Details
Same calculation but different amount of formulas (68.83 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-06-16 21:07 UTC, Regina Henschel
Details
XMATCH and XLOOKUP (78.09 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-06-16 23:48 UTC, Regina Henschel
Details
revert patch (1.25 KB, patch)
2024-06-21 05:12 UTC, Stéphane Guillou (stragu)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description PHolder 2024-06-15 22:07:42 UTC
Created attachment 194753 [details]
Example spreadsheet

I am trying to make a simple mortgage payment spreadsheet.  I want to use the MATCH() function to show the number of payment periods... by finding the last positive cell in the column containing the outstanding principle.  When I initially enter the match() formula, the value shows correct.  If I make changes to the payment amount, which changes the values in the column in which match() is doing the lookup, it doesn't update the lookup index.  If I change the formula to another column which has a related value, it updates to the expected value.

Reproducing using the supplied example.  The MATCH() is in cell G3.  When it loads, it should show the expected value of 305.

Change the value in cell C5 from $3000 to $5000.  The new value that should appear in G3 should be 116 or 117 (I switch between doing the lookup on column B or G to get it to update because those are related values.)  So if I switch the formula, I get the new value.  If I switch it back, I get the old value.  It's like it's cached or something silly.
Comment 1 m_a_riosv 2024-06-15 22:36:10 UTC Comment hidden (obsolete)
Comment 2 PHolder 2024-06-15 22:58:16 UTC Comment hidden (obsolete)
Comment 3 ady 2024-06-16 01:29:49 UTC
Apparently, either there is something wrong with the MATCH() function, or there is something wrong with AutoCalculate.

STR:
1. Open attachment 194753 [details].
2. Manually change cell C5 from 3000 to 5000
2.1. Note cell G3 does not change to 117
3. Manually change cell C5 from 5000 to 3000 (not using Undo).
3.1. Note cell G3 changes to 117, the value that corresponds to cell C5=5000, not 3000.

Depending on the numeric value used in C5, I can see the result in G3 updating correctly, or updating to a wrong result (previously correct), or not updating its calculation.

I don't know whether some setting is influencing the behavior; for instance menu Tools > Options > Calc > Formula > (Detailed Calculation Settings).

Whichever the case, the behavior is confusing.

CC'ing Eike.
Comment 4 Regina Henschel 2024-06-16 14:07:44 UTC
If I write the same formula to G4, the formula in G4 recalculates as expected.

If I move the formula from G4 to G2, then G2 does not recalculate, but G3 does. It seems, the top-most formula in the column does not recalculate.

If I use the forced recalculate mark "==" instead of the simple "=" for both formulas, then that in G3 recalculates, but that in G4 does not recalculate.

If I use =XLOOKUP(0;B10:B329;A10:A329;;1;-2) or =XMATCH(0;B10:B329;1;-2) it results #N/A and gives the correct result only after "Recalculate Hard".

For me it looks like a more general error in AutoCalculate.
Comment 5 m_a_riosv 2024-06-16 16:01:48 UTC
Created attachment 194761 [details]
Sample file to test

Attached the sample, reducing the rows up to 257, and adding a second MATCH, with a new column C.

One MATCH is recalculated and the other not.
But clearing row 257, both, works fine.

Also adding a +B10*0 on C11, and copy down, seems the MATCH works fine.

Doesn't matter if OpenCL is enabled or not.
Comment 6 ady 2024-06-16 16:35:09 UTC
(In reply to Regina Henschel from comment #4)
> If I write the same formula to G4, the formula in G4 recalculates as
> expected.
> 
> If I move the formula from G4 to G2, then G2 does not recalculate, but G3
> does. It seems, the top-most formula in the column does not recalculate.

That happens only on the same column G or before it. When using the same formula on column L (or any other after G), then recalculation is correct, as long as there is the same formula "before" the new one.

On cell G3 (or the "first" located), [F9] does not work either; Hard Recalc does work on it.

Observation:
It would seem as if the recalculation works on columns (and then on rows) "after" the precedent, failing on the _first_ encounter, wherever that first might be located.



> 
> If I use the forced recalculate mark "==" instead of the simple "=" for both
> formulas, then that in G3 recalculates, but that in G4 does not recalculate.

Even with double equal sign, the first encounter fails, wherever that first MATCH() formula is located.

> 
> If I use =XLOOKUP(0;B10:B329;A10:A329;;1;-2) or =XMATCH(0;B10:B329;1;-2) it
> results #N/A and gives the correct result only after "Recalculate Hard".

Wherever the formula is located, a Hard Recalculate seems to work, also on MATCH(). Simple Recalculate fails on the "first" formula.

> 
> For me it looks like a more general error in AutoCalculate.

Not just AutoCalculate; there is some kind of "old cache" behavior, and a "first encountered location" too, at least using this MATCH() function.
Comment 7 ady 2024-06-16 16:49:48 UTC
(In reply to m_a_riosv from comment #5)
> Created attachment 194761 [details]
> Sample file to test

I apologize for my ignorance. I don't see what this new attachment proves. The only difference is that instead of using values of 3000 to 5000 in cell C5, we have to use much higher values (15000 or 20000) in order to arrive to the lookup value of zero within a lower amount of rows in the table (257 or less).

I see the same behaviors as described in comment 6. "First" encountered location of MATCH() fails to recalc, holding on some kind of prior/latest cache. Hard recalc works.
Comment 8 ady 2024-06-16 17:07:18 UTC
(In reply to m_a_riosv from comment #5)
> Created attachment 194761 [details]
> Sample file to test

Well, after all, using this new attachment has one advantage. It is easier to see the "old cache" behavior on the "first" located MATCH().

STR:
1. Open attachment 194761 [details].
2. Cell C5: 15000.
2.1. Note cell H3 recalcs to 30.

3. Cell C5: 20000.
3.1. Note cell G3 recalcs to 30. > old
3.1. Note cell H3 recalcs to 22. > OK

4. Cell C5: 17000.
4.1. Note cell G3 recalcs to 22. > old
4.1. Note cell H3 recalcs to 26. > OK

5. Cell C5: 18000.
5.1. Note cell G3 recalcs to 26. > old
5.1. Note cell H3 recalcs to 25. > OK

6. Cell C5: 19000.
6.1. Note cell G3 recalcs to 25. > old
6.1. Note cell H3 recalcs to 23. > OK
Comment 9 Telesto 2024-06-16 18:48:53 UTC
(In reply to ady from comment #3)
> Apparently, either there is something wrong with the MATCH() function, or
> there is something wrong with AutoCalculate.
> 
> STR:
> 1. Open attachment 194753 [details].
> 2. Manually change cell C5 from 3000 to 5000
> 2.1. Note cell G3 does not change to 117
> 3. Manually change cell C5 from 5000 to 3000 (not using Undo).
> 3.1. Note cell G3 changes to 117, the value that corresponds to cell
> C5=5000, not 3000.

Not OK with
Versie: 6.0.4.1
Build ID: a63363f6506b8bdc5222481ce79ef33b2d13c741
CPU-threads: 4; Besturingssysteem: Windows 6.3; UI-render: GL; 
Locale: nl-NL (nl_NL); Calc: CL

Working as fine with
Versie: 5.3.5.2 
Build ID: 50d9bf2b0a79cdb85a3814b592608037a682059d
CPU-threads: 4; Besturingssysteem:Windows 6.2; UI-render: GL; Layout Engine: old; 
Locale: nl-NL (nl_NL); Calc: CL

and with
Version: 5.2.5.0.0+
Build ID: a4d4fbeb623013f6377b30711ceedb38ea4b49f8
CPU Threads: 4; OS Version: Windows 6.2; UI Render: GL; 
TinderBox: Win-x86@62-merge-TDF, Branch:libreoffice-5-2, Time: 2016-12-24_14:43:55
Locale: nl-NL (nl_NL); Calc: CL
Comment 10 ady 2024-06-16 19:09:23 UTC
(In reply to ady from comment #8)

> Well, after all, using this new attachment has one advantage. It is easier
> to see the "old cache" behavior on the "first" located MATCH().

I am amending all my "first" located MATCH() to "some" locations.

I have created a similar layout from scratch on a new empty spreadsheet, and using the same MATCH() function in different cells/locations. Some cells get updated correctly, but some others show the "old cache" behavior. I was not able to identify the exact pattern/logic, as they change depending on the relative location of other cells with the same function/formula.

(In reply to Telesto from comment #9)

> Working as fine with

FWIW, not fine with LO 3.3, but maybe some of the changes in recalculation (and hard recalc) in later versions (including in 7.3) might have changed this behavior more than once.
Comment 11 Regina Henschel 2024-06-16 21:07:01 UTC
Created attachment 194767 [details]
Same calculation but different amount of formulas

I have made an example from scratch. I see the error depending on the amount of formulas on the sheet. The two sheets only differ, that in one sheet the range goes to row 813 (fail) whereas in the other sheet it goes to row 812 (good). To test the sheets change the value in cell B2.

I have tested the document in AOO4, in a portable OOo3.2.0 and in LO3.5.4. They fail too for 813 and they are good for 812.

There had been some bug reports about curious calculations in these group of functions, that could not be reproduced at that time. Perhaps you need a document with a large enough amount of formulas to see the defect.
Comment 12 ady 2024-06-16 22:40:41 UTC
(In reply to Regina Henschel from comment #11)
> Created attachment 194767 [details]
> Same calculation but different amount of formulas
> The two sheets only differ, that in one sheet the
> range goes to row 813 (fail) whereas in the other sheet it goes to row 812
> (good). To test the sheets change the value in cell B2.

The size of the range for the search array in MATCH() impacts the behavior (too).

STR:
1. Open attachment 194767 [details], worksheet "fail813".
2. Cells B3 and B4:
2.1. change them from 
 =MATCH(0,B11:B813,-1)
to:
 =MATCH(0,$B$11:$B$812,-1)

We now have the same amount of formulas, but the range used within MATCH() is 1 cell smaller.

3. Change cell B2 to 15000, then to 7000; both cells B3 and B4 recalc correctly.

So:
* It is not just the amount of formulas.

* The relative location of the formulas impacts the behavior.

* The size of the range(s) used within MATCH() impacts the behavior.


6.0.3.2 > all OK
<https://git.libreoffice.org/core/+log/8f48d515416608e3a835360314dac7e47fd0b821>
6.0.4.2 > fails
<https://git.libreoffice.org/core/+log/9b0d9b32d5dcda91d2f1a96dc04c645c450872bf>
Comment 13 Regina Henschel 2024-06-16 23:48:45 UTC
Created attachment 194768 [details]
XMATCH and XLOOKUP

The situation for XLOOKUP and XMATCH is worse. Not even "Recalculate Hard" gives the correct results. You need to save and reload the file.

Load document. Change value in cell B2. Notice that results become #N/A.
Save and reload. Now the values are correct.

I have not made a new bug report because XLOOKUP and XMATCH use the same code in interpr1.cxx as MATCH and thus a common reason is likely. If a fix for MATCH does not fix XLOOKUP and XMATCH a follow up bug report can be created then.
Comment 14 ady 2024-06-17 01:20:28 UTC
(In reply to Regina Henschel from comment #13)
> Created attachment 194768 [details]
> XMATCH and XLOOKUP

For consistency with prior tests and results, I modified _both_, the range in the functions and the data table, beyond 824 rows (so, at least 813 rows of data).

For simplicity, I also modified the references from relative to absolute:

 =XLOOKUP(0;$B$11:$B$813;$A$11:$A$813;;1;-2)

Using absolute references, the results are correct and instantly displayed – only _one_ save and reload is needed; no need to save and reload then.

So I cannot repro the same problem with XLOOKUP() in the same exact way as the original MATCH().

Not being a developer myself, I am searching for problems "blindly" (but still with order and with some logical steps). There might be something in common with the MATCH() case, but I do not know what exactly that would be.

BTW, the double equal "==" disappears automatically when I use absolute references.

Testers should be mindful of the values for testing. For instance in attachment 194768 [details], the values used in cell B2 must always generate a descending-sorting data – no less than 1500 for cell B2 in attachment 194768 [details]; for simplicity I use 15000 to 20000. Additionally, if eventually needed for unit tests, we can create simpler spreadsheets/formulas.

IMHO, this report deserves a higher priority, and more so _if_ the new XLOOKUP/XMATCH() functions are affected (even if partially/somehow).
Comment 15 PHolder 2024-06-18 00:07:55 UTC Comment hidden (obsolete)
Comment 16 Telesto 2024-06-18 00:52:17 UTC
(In reply to PHolder from comment #15)
>  In my opinion, since the entire point
> of the program is to do CORRECT calculations, anything that breaks that is a
> major problem, and thus why I think a step up is appropriate. 

I agree
Comment 17 Xisco Faulí 2024-06-20 14:57:31 UTC
(In reply to Regina Henschel from comment #11)
> Created attachment 194767 [details]
> Same calculation but different amount of formulas
> 
> I have made an example from scratch. I see the error depending on the amount
> of formulas on the sheet. The two sheets only differ, that in one sheet the
> range goes to row 813 (fail) whereas in the other sheet it goes to row 812
> (good). To test the sheets change the value in cell B2.
> 
> I have tested the document in AOO4, in a portable OOo3.2.0 and in LO3.5.4.
> They fail too for 813 and they are good for 812.
> 
> There had been some bug reports about curious calculations in these group of
> functions, that could not be reproduced at that time. Perhaps you need a
> document with a large enough amount of formulas to see the defect.

bibisect-linux64-6.1 points me to

author	Eike Rathke <erack@redhat.com>	2018-03-23 13:13:03 +0100
committer	Eike Rathke <erack@redhat.com>	2018-03-23 13:14:51 +0100
commit c53235ba9aab844c4b16ecf7a28429e256bace6d (patch)
tree 311352c29fbe8e94d648926a50c6bc5c407a3771
parent 604df2f608c03a59be65822d27babf0b00714bd6 (diff)
Enable binary search for numeric values again

but if I revert the patch locally it doesn't fix the issue...
Comment 18 ady 2024-06-20 17:42:21 UTC
(In reply to Xisco Faulí from comment #17)

> bibisect-linux64-6.1 points me to
> commit c53235ba9aab844c4b16ecf7a28429e256bace6d (patch)
> Enable binary search for numeric values again

Also in:
 c3ca9d933b289df49e45f223493f3a22a1f74d14 

matching the range in comment 12, which includes factors other than the steps in comment 11.
Comment 19 PHolder 2024-06-20 20:47:11 UTC
I saw somewhere where there was an option for "parallel calculations".  I wonder if that doesn't somehow cause an interaction here?  Certainly I could see where you can't allow changes to the search range at the same time as you're trying to search within the range itself.
Comment 20 ady 2024-06-20 21:44:32 UTC
(In reply to Xisco Faulí from comment #17)
> but if I revert the patch locally it doesn't fix the issue...

@Xisco, is that the reason not to change this to BIBISECTED?

FWIW, there have been changes to recalculation in other versions, later/newer than 6.1 too.


(In reply to PHolder from comment #19)
> I saw somewhere where there was an option for "parallel calculations"

Multi-threading makes no diff, neither OpenCL.
Comment 21 Stéphane Guillou (stragu) 2024-06-21 04:11:44 UTC
Balázs, wondering if you are interested because you looked at that section of sc/source/core/data/queryiter.cxx recently.

(In reply to Xisco Faulí from comment #17)
> but if I revert the patch locally it doesn't fix the issue...
What was the revert patch exactly?
Comment 22 Stéphane Guillou (stragu) 2024-06-21 05:12:55 UTC
Created attachment 194871 [details]
revert patch

The attached revert patch solves it for me.
Comment 23 Xisco Faulí 2024-06-21 07:41:43 UTC
(In reply to Stéphane Guillou (stragu) from comment #22)
> Created attachment 194871 [details]
> revert patch
> 
> The attached revert patch solves it for me.

Indeed. it does the same for me

Adding Cc: to Eike Rathke