Bug 153924 - Calc functions SMALL() and LARGE() changed behavior for array formulas
Summary: Calc functions SMALL() and LARGE() changed behavior for array formulas
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
6.2 all versions
Hardware: All All
: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:7.6.0 target:7.4.7 target:7.5.2
Keywords: bibisectNotNeeded, regression
Depends on:
Blocks:
 
Reported: 2023-03-02 13:50 UTC by ady
Modified: 2023-06-02 12:50 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
Data area, array formulas, and copied value results depending on LO version (23.61 KB, application/vnd.oasis.opendocument.spreadsheet)
2023-03-02 14:02 UTC, ady
Details

Note You need to log in before you can comment on or make changes to this bug.
Description ady 2023-03-02 13:50:14 UTC
Sample data (starting from cell A1):

195	200	2
151	180	1
148	178	3
189	165	4
183	192	5
154	144	

Note that cell C6 is blank empty; this is key to trigger the changed behavior.

Then use the following function, as array formula, CSE:
=SMALL(A1:B6,C1:C6)


Until LO 6.0 (at least):
148
144
151
154
165
Err:502

Same result is seen in AOO4113m1 (2022-07-01).
Excel returns a similar result.

But, since LO 7.0 (or maybe before that):

Err:504
Err:504
Err:504
Err:504
Err:504
Err:504


Similar changed behavior can be seen when using LARGE().

There are 3 things to note here:

A.
In old versions up until 6.0 (at least), the error is displayed on the relevant cell only. Valid calculations are still shown. Since LO 7.0 (or before that), _all_ cells display the error.

B. According to the Release Notes for LO 6.2, the Rank argument for Calc's small() and large() functions did not support array values before, but I have no problems using these functions and getting correct results when the Rank argument is a range – tested in LO 3.3. So, I guess that the change in LO 6.2 not only affected actual inline array values being accepted, but also how these functions react for corner cases / outside_of_scope data(?). Please note that AOO4113m1 (current version ATM) behaves as older LO versions.

Unfortunately, currently I don't have LO 6.1 and/or 6.2 to test.

C.
* Err:502 means "Invalid argument". Reasonable.
* Err:504 means "Parameter list error". > Inadequate.


I'll attach a sample file soon.
Comment 1 ady 2023-03-02 14:02:33 UTC
Created attachment 185700 [details]
Data area, array formulas, and copied value results depending on LO version

As for the change in LO 6.2:
https://wiki.documentfoundation.org/ReleaseNotes/6.2#Calc

https://bz.apache.org/ooo/show_bug.cgi?id=32345
Comment 2 Eike Rathke 2023-03-02 14:38:33 UTC
Apparently a fallout from

commit e4c2d0bb57ab8ea8f5c400d103d01376b8140f22
CommitDate: Fri Nov 30 22:14:17 2018 +0100

    i#32345 Support a matrix of rank argument for LARGE()/SMALL()

that in ScInterpreter::GetTopNumberArray() requires the entire rank array would have to be numeric, which with an empty cell it isn't.
Comment 3 Eike Rathke 2023-03-02 16:03:39 UTC
What does Excel return if not the last rank row is empty but any in between? An error at the corresponding position? Or still at the last position?
Comment 4 ady 2023-03-02 17:53:05 UTC
(In reply to Eike Rathke from comment #3)
> What does Excel return if not the last rank row is empty but any in between?
> An error at the corresponding position? Or still at the last position?

Excel (online), opening the same ods file, returns the same as the older versions of LO and the same as AOO4113m1 as I reported initially, with the exception of the specific error. The error for Excel is #NUM! and the numeric values are the same, all in the same positions.

In attachment 185700 [details] there is:

* 3 columns with cell C6 blank empty > the result in Excel is the same as old LO versions; just replace the last error result Err:502 with "#NUM!".

* 3 additional columns where the blank empty cell was replaced by a numeric value of 6; Excel again returns the same as old versions of LO and AOO4113m1, with all the numbers in the same place, no error.

Please note that AOO4113m1 (the current version) behaves correctly, as old LO versions do. Apparently whatever change was made by that "32345" ticket (see link in comment 1) in AOO... either it was improved later on, or the result was different than what happened for LO 6.2.
Comment 5 ady 2023-03-02 18:18:26 UTC
FWIW, I repeated the tests, because now I went again to open the same attachment 185700 [details] in Excel to recheck what I did before and there was some message about some error in the ods file. To be clear, some error in some file format, not in the functions.

This time I did saved it xls, using the same LO versions as before, 6.0 and 7.0. Excel did not complain.

At any rate, all the results were, again, as they were reported before.

I'll do it all over again with an ods file, to see whether Excel claims again there is some error in the file. It might have been some glitch when testing with several LO and AOO versions.

The important thing is, as it is now, the change in behavior I reported is still current.
Comment 6 Eike Rathke 2023-03-02 18:33:12 UTC
Please don't test what Excel produces for a file loaded that was saved by LO, but what it does when things are entered in Excel as array formula. Results might differ.

I'm specifically interested in what Excel returns for this rank array with a #DIV/0! error and a blank cell and a string in between:

2
=1/0
3

x
6


I'd assume for SMALL() it would be

148
#DIV/0!
151
#VALUE!
#VALUE!
178

or other error values like #NUM! I don't care which, but the positions.
Comment 7 ady 2023-03-02 18:43:50 UTC
(In reply to Eike Rathke from comment #6)
> Please don't test what Excel produces for a file loaded that was saved by
> LO, but what it does when things are entered in Excel as array formula.
> Results might differ.

OK. Apologies for misleading / being not clear enough. Unintentionally, of course.

Unfortunately, ATM I can only test in the way I already described; with several versions of LO, AOO4113m1, and opening these files in Excel online.

I'll post if I can have a real test in Excel as requested.

I hope the other results are still useful. They are all performed in LO 6.0.0.3 and 7.0.0.3.

Maybe someone else with access to Excel can help in testing. The data, formula and simple steps are all in comment 0 as plain text.
Comment 8 Eike Rathke 2023-03-02 18:52:41 UTC
Maybe Mike (Cc) could help out? i.e. data from comment 0 with rank data C1:C6 from comment 6 and =SMALL(A1:B6,C1:C6) as array formula.
Comment 9 Mike Kaganski 2023-03-02 20:06:17 UTC
(In reply to Eike Rathke from comment #8)
> Maybe Mike (Cc) could help out? i.e. data from comment 0 with rank data
> C1:C6 from comment 6 and =SMALL(A1:B6,C1:C6) as array formula.

148
#DIV/0!
151
#NUM!
#VALUE!
178
Comment 10 Eike Rathke 2023-03-03 11:51:04 UTC
Great, thanks Mike!
As I expected..
Comment 11 Commit Notification 2023-03-04 00:37:05 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/019e751c71dcb2d34c6fd8bb9dda267c6ba2b48e

Resolves: tdf#153924 handle non-numeric and error values in rank array

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Eike Rathke 2023-03-04 00:38:16 UTC
Pending review
https://gerrit.libreoffice.org/c/core/+/148217 for 7-5
https://gerrit.libreoffice.org/c/core/+/148218 for 7-4
Comment 13 Commit Notification 2023-03-04 20:53:59 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/6e0c8a3dbdb83b634a30dad018e1ff5f9959d055

Resolves: tdf#153924 handle non-numeric and error values in rank array

It will be available in 7.4.7.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Commit Notification 2023-03-04 20:54:02 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/c8a033f0bbcc61feef3b42ea6a8833bcbc6ac552

Resolves: tdf#153924 handle non-numeric and error values in rank array

It will be available in 7.5.2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 ady 2023-03-05 08:01:28 UTC Comment hidden (no-value)
Comment 16 Mike Kaganski 2023-03-05 08:19:22 UTC
(In reply to ady from comment #15)
> With dev. version (Built: 2023.03.04):
> 
> Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
> Build ID: d7c609dbb1bd08865b43719d2fb7c316d30bcde5

This build id points to:

https://git.libreoffice.org/core/commit/d7c609dbb1bd08865b43719d2fb7c316d30bcde5

"Fri Mar 03 23:07:01 2023 +0000"

while the comment 11 is from 2023-03-04 00:37:05 UTC.
There's no value to state the obvious, that before that patch had been merged, it didn't work. Additionally:

(In reply to Commit Notification from comment #11)
> The patch should be included in the daily builds available at
> https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours.

48 hours is 2 days; it happens.
Comment 17 ady 2023-03-05 09:00:02 UTC
(In reply to Mike Kaganski from comment #16)
> This build id points to:
> 
> https://git.libreoffice.org/core/commit/
> d7c609dbb1bd08865b43719d2fb7c316d30bcde5
> 
> "Fri Mar 03 23:07:01 2023 +0000"

That's great news (in the sense that I did not check the exact times of the patch, so if this build does not include it, it is great news-for-me because of the result). The build was the first one to be available _after_ I saw comment 11 (i.e. time of comment vs. time of build).

I had already checked this yesterday, and I wanted to test it again today, exactly because I thought that maybe the patch was not yet included, for whichever reason. When I found that there was no new build today, and that the patches for 7.5 and 7.4 were also being pushed, I decided to report what I tested yesterday anyway, just in case. I cannot regret I did, considering what (I thought) I knew _at_the_time_, and that I learned something from this – thank you –, but I do apologize for the resulting noise.

I'll retest when I'll grab a newer build, and I'll report back.
Comment 18 Commit Notification 2023-03-06 15:11:02 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d951f9cafbc4674e4bf6fc174f5fd28e451c59ae

tdf#153924: sc_subsequent_filters_test4: Add unittest

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 19 ady 2023-03-07 15:23:42 UTC
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 0484a9a3f5e2ecb678f6fb41bbb251529e89c00d
CPU threads: 4; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: en-US (es_AR); UI: en-US
Calc: CL threaded
Built 2023.03.07

New set of values (starting from A1):
195	200	2
151	180	1
148	178	3
189	165	a
183	192	13
154	144	


With non-array formula:
=SMALL($A$1:$B$6;$C6)

Results:
7.6.alpha+: #VALUE!
6.0.0.3: Err:502
3.3: Err:502


With array formula (CSE):
=SMALL(A1:B6;C1:C6)

Results:

7.6.alpha+:
148
144
151
#VALUE!
Err:502
#VALUE!


6.0.0.3:
148
144
151
#VALUE!
#VALUE!
Err:502


3.3:
148
144
151
Err:502
#VALUE!
Err:502


Reminder (mainly to myself, just in case):
Err:502 > Function argument is not valid.
#VALUE! > No (valid?) result.


So, are the new (as of this 7.6alpha+ test) "error" results actually in accordance to what should be expected? I would tend to think that version 3.3 is the adequate from the above 3 versions I tested this time, but I could be wrong. I don't know whether there is a standard / expected definition for these errors for these 2 functions (small() and large()).

Could this be fixed too, please? Or are these already correct?
Comment 20 Eike Rathke 2023-03-08 13:09:58 UTC
#VALUE! means No Value. In this case "a" and empty cell are not numeric values. The function gathering numeric values for the rank parameter encountered non-numeric cells.

Err:502 invalid argument is caused by the invalid rank argument 13 for 12 data points.

See https://help.libreoffice.org/7.5/en-GB/text/scalc/05/02140000.html?DbPAR=CALC#bm_id3146797

For the non-array case already the rank values gathering function sets the #VALUE! error that not all cells contain numeric values because individual positional results wouldn't be calculated. That error is propagated.

6.0.0.3 and earlier returned Err:502 for everything because they did not accept any other argument than a single scalar value, hence invalid argument, a range or array evaluation didn't even happen.

There's nothing to be fixed. Also, detailed errors may vary over time. Calc gives more detailed errors than Excel, which for all these cases may produce just one #VALUE! or #NUM!
Comment 21 ady 2023-03-08 13:59:59 UTC
(In reply to Eike Rathke from comment #20)
> #VALUE! means No Value. In this case "a" and empty cell are not numeric
> values. The function gathering numeric values for the rank parameter
> encountered non-numeric cells.
> 
> Err:502 invalid argument is caused by the invalid rank argument 13 for 12
> data points.

Yes, I intentionally used "a;13;blank" in order to trigger the errors.

"a" is clearly not the correct type for rank, so I would had thought that it would return Err:502 (invalid argument).

"13" is the correct type for rank, but out of the range for this case. This is why I thought it would return #VALUE! (No result).

With blank for rank, it is kind of "who knows", because there is no default value for rank. There is surely no (valid) result to be expected, but the source of that is the inadequate argument.

These are the reasons why I quoted the different results from versions 3.3, 6.0, and 7.6alpha+, which don't agree between them. The situation is also seen in the non-array test I posted (different versions > different errors).

If this is all about some implementation that might or might not change at some point, I guess that's the way things work. I just reported what I saw.

By now I also tested LARGE(), with equivalent results.

Thank you for fixing SMALL() and LARGE().