Bug 144770 - Help and wiki for Calc MATCH() function is wrong about ascending range lookup in non-sorted arrays.
Summary: Help and wiki for Calc MATCH() function is wrong about ascending range lookup...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:7.3.0
Keywords:
Depends on:
Blocks:
 
Reported: 2021-09-28 14:44 UTC by David Lynch
Modified: 2021-09-28 17:09 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments
Small spreadsheet for bug 14470 (23.79 KB, application/vnd.oasis.opendocument.spreadsheet)
2021-09-28 14:45 UTC, David Lynch
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Lynch 2021-09-28 14:44:01 UTC
Description:
See attached spreadsheet. 

The formulas in column K are to find the position of the first empty cell in the first few cells of the corresponding row. The results in rows 2 and 3 are wrong, and those in rows 3,4 and 5 are right. Row 6 is the the same as rows 2 and 3, yet the result in column 5 is different.

An explanation of the formula in K2:
1/(A2:G2="") is {~DIV/0!;~DIV/0!;~DIV/0!;~DIV/0!;1;~DIV/0!;~DIV/0!}
Match is looking for 2, and 
"If Type = 1 or the third parameter is missing, the index of the last value that is smaller or equal to the search criterion is returned. This applies even when the search array is not sorted. For Type = -1, the first value that is larger or equal is returned. "
So match should find index 5, which it does not do. (But it does in rows 3 ,4 and 5).

Here is a text copy of my spreadsheet:
[row1 empty]
23	87	q	a		es	1	9	q		#N/A	=MATCH(2,1/(A2:G2=""))
23	87	q	a		es	1	9	q		#N/A	=MATCH(2,1/(A3:G3=""))
23	87	q			es	1	9	q		5	=MATCH(2,1/(A4:I4=""))
23	87	q	a		es	1	9	q		5	=MATCH(2,1/(A5:I5=""))
23	87	q			es	1	9	q		5	=MATCH(2,1/(A6:G6=""))
Same results after resetting user profile.

Steps to Reproduce:
See attached spreadsheet

Actual Results:
See attached spreadsheet

Expected Results:
in rows 3, 4 and 5


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Version: 7.2.1.2 (x64) / LibreOffice Community
Build ID: 87b77fad49947c1441b67c559c339af8f3517e22
CPU threads: 4; OS: Windows 10.0 Build 19043; UI render: default; VCL: win
Locale: en-GB (en_GB); UI: en-GB
Calc: threaded
Comment 1 David Lynch 2021-09-28 14:45:06 UTC
Created attachment 175326 [details]
Small spreadsheet for bug 14470
Comment 2 Xisco Faulí 2021-09-28 15:32:14 UTC
@Eike, I thought you might be interested in this issue
Comment 3 Eike Rathke 2021-09-28 15:37:06 UTC
Why do you think this is supposed to work? Unlike the similar construct with LOOKUP(2;1/...) (which is a special case), MATCH() for an array passed does not ignore error values nor does it magically sort the array, so with
{~DIV/0!;~DIV/0!;~DIV/0!;~DIV/0!;1;~DIV/0!;~DIV/0!}
a range search, as the third argument is not given the data MUST be strictly sorted ascending, in such unsorted array may yield arbitrary results.

The cited help text "This applies even when the search array is not sorted" is simply wrong. I don't know where the author got that idea from.
Comment 4 Eike Rathke 2021-09-28 15:45:00 UTC
Let's fix the help documentation.. it's in there since 2004 *cough*.
Comment 5 David Lynch 2021-09-28 16:08:12 UTC
The reason I thought it would work is that the sentence I included in the bug report: "This applies even when the search array is not sorted." appears on a page that appears to be from the Document Foundation:  <https://wiki.documentfoundation.org/Documentation/Calc_Functions/MATCH>

You say LOOKUP is a exception. In <https://wiki.documentfoundation.org/Documentation/Calc_Functions/LOOKUP>
"If LOOKUP cannot find the SearchCriterion, it matches the largest value in the search vector that is less than or equal to the search criterion."
Does this mean that LOOKUP can never return #NA#?
Comment 6 Eike Rathke 2021-09-28 16:50:02 UTC
Of course it can, if it doesn't encounter a value that would match the range search evaluation. For example =LOOKUP(2;1/A1:A2) returns #N/A if A1:A2 are both empty or 0 or one of them is < 0.5 (whether first or second is sufficient for the failure depends on the search algorithm, which is binary search); or if all values are greater than the query. Or even in =LOOKUP(1;A1:A2) if A1:A2 contain =NA() (again, one may suffice, depending on algorithm whether it picks first or second value first).

Ignoring errors is only for generated arrays from an inline expression to make that =LOOKUP(2;1/logical_expression) behaviour from Excel work where the array contains unsorted 1 and #error results.

That wiki page you mention also says "the search vector for the LOOKUP must be sorted ascending, otherwise the search will not return any usable results". That's key.
Comment 7 Commit Notification 2021-09-28 17:08:25 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/5a4f9fe6afa18531f7403df99e555ea04d96e9b6

Resolves: tdf#144770 MATCH() remove plain wrong statement about unsorted array