I have a case for the INDEX() function in Calc that behaves differently than in Excel. In Excel, the INDEX() function can process arrays natively, meaning that it can be introduced with [ENTER] and it doesn’t need CSE. Apparently, this is not completely true for the INDEX() function in Calc. The specific result (either an array, or one simple value, or an error) may also be dependent on the second and third arguments too. For an example (file), see the following tutorial about the INDEX() function in Excel, which includes a publicly available file to download, linked within the text "sample workbook". * The file can be downloaded from the tutorial page ("sample workbook"). * The full tutorial starts at: https://www.ablebits.com/office-addins-blog/excel-index-match-multiple-criteria-formula-examples/ * The second section of that same page of the tutorial, which explains how Excel handles the relevant "non-CSE" formula, starts at: https://www.ablebits.com/office-addins-blog/excel-index-match-multiple-criteria-formula-examples/#non-array-multiple-criteria * The incompatibility can be seen in the file named "index-match-multiple-criteria.xlsx" > worksheet "Non-array formula" > cell G4. * In Excel, that formula is introduced with a simple [ENTER] and returns a valid result. In Calc, it seems as if CSE would be required (extra curly braces shown in Calc), and it returns Err:504, which is carried from the inner INDEX() function. * The link to the file is: https://cdn.ablebits.com/excel-tutorials-examples/index-match-multiple-criteria.xlsx (Just for the record, I have no connection to the site nor the products whatsoever.) CC’ing Eike Rathke
This is not a matter of load/save (adjusting summary). Also, it's not about array formula vs non-array formula. The formulas in the example document are explicitly stored as array formulas by Excel, hence Calc correctly displays them in {} curly braces. It's also not that in Calc one would have to enter an array formula for this MATCH() case to get a result if INDEX() worked correctly. It's simply that INDEX() does not return a column or row vector of an array that is already a single column or row vector. E.g. with , comma as array-column separator and ; semicolon as array-row separator: =INDEX({1;2};0;1) (all rows, column 1) should return {1;2} but returns 1 =INDEX({1,2};1;0) (row 1, all columns) should return {1,2} but returns 1 =INDEX({1;2};0;0) (all rows, all columns) correctly returns {1;2} =INDEX({1,2};0;0) (all rows, all columns) correctly returns {1,2} =INDEX({1;2};1;0) (row 1, all columns) correctly returns 1 =INDEX({1,2};0;1) (all rows, column 1) correctly returns 1 =INDEX({1,2;3,4};0;0) correctly returns {0,1;2,3} =INDEX({1,2;3,4};0;1) correctly returns {1;3} =INDEX({1,2;3,4};1;0) correctly returns {1,2}
Note that in Excel the cases =INDEX({1;2};0;0) =INDEX({1,2};0;0) =INDEX({1,2;3,4};0;0) with both row_num=0 and column_num=0 that they document to not work at all and return error. There is no compelling reason though why it shouldn't and it works in Calc. Additionally there is a working case that they do not document, if the array is a one-dimensional row vector then if column_num is not given then the row_num argument acts as a column index. E.g. =INDEX({1,2};2) returns 2, as does =INDEX({1,2};0;2) That is specified by ODFF v1.3 in 6.14.6 INDEX https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part4-formula/OpenDocument-v1.3-os-part4-formula.html#INDEX
Using LO 7.4.5, here is what I have tested ("|" is column separator, ";" as both row and argument separators). In a pre-dynamic formula context, with INDEX() in array form *by itself*: A_ { =INDEX({11|12;21|22};0;1) } (with CSE) =11 (1st row, 1st col) =21 (2nd row, 1st col) OK. B_ =INDEX({11|12;21|22};0;1) (_no_ CSE) =11 (1st row only) OK. C_ { =INDEX({11;21};0;1) } (with CSE) =11 (1st row) NOT OK. We should also get: =21 (2nd row) D_ =INDEX({11;21};0;1) (_no_ CSE) =11 (1st row) OK, but expected considering result "C" above. E_ { =INDEX({11|12};1;0) } (with CSE) =11 (1st col) NOT OK. We should also get: =12 (2nd col) F_ =INDEX({11|12};1;0) (_no_ CSE) =11 (1st col) OK, but expected considering result "E" above. Considering that those are wrong results already, I have not tested additional possibilities such as default values when arguments are omitted, or out-of-scope arguments, or wrong-type arguments. Please note that these are examples of INDEX() by itself. When INDEX() is used as lookup_array for MATCH() (as in the example in comment 0), then CSE should not be required in order to obtain the array to be fed for MATCH(). Once the above examples get to work, I should be able to test them in combination with MATCH() as in the example in comment 0. PS: FWIW, in Excel I believe that =INDEX({1,2};2) should had been #REF!, IIUC (whether with CSE or not).
You are complicating things and just repeating what I gave.. =INDEX({11;21};0;1) CSE is exactly the column vector case I said is not working as it should. =INDEX({11|12};1;0) CSE the same but for row vector. Fwiw, you can see in the Function Wizard what actually is returned for the examples; whether the formula is entered as CSE-array or normal only affects how the final result is displayed, either as full array result (CSE) or just the single top left element's value (normal). > in Excel I believe that =INDEX({1,2};2) should had been #REF! But it isn't, is it?
@Mike: Does full (non-online) Excel do vector replication in INDEX()? i.e. with array-row separator ; semicolon does =INDEX({1;2};0;2) return error, or does it return column vector {1;2}? (transform separators to whatever is correct for your Excel..) Similar, does =INDEX({1;2};2;2) return error, or 2?
(In reply to Eike Rathke from comment #4) > > in Excel I believe that =INDEX({1,2};2) should had been #REF! > > But it isn't, is it? Quote: row_num and column_num must point to a cell within array; otherwise, INDEX returns a #REF! error. Source: https://support.microsoft.com/en-us/office/index-function-a5dcf0dd-996d-40a4-a822-b56b061328bd I cannot test in Excel ATM. Hopefully Mike can. IDK whether a pre-dynamic formula era version of Excel would provide a different result than the current versions in the case of out-of-scope arguments; it shouldn't. I am more concern about Calc providing the expected results in the other (normal) cases, but I do understand the desire to improve the function all around.
(In reply to ady from comment #6) > (In reply to Eike Rathke from comment #4) > > > in Excel I believe that =INDEX({1,2};2) should had been #REF! > > > > But it isn't, is it? > > Quote: > row_num and column_num must point to a cell within array; otherwise, INDEX > returns a #REF! error. In comment 2 I wrote that they did not document it, geez.. There was a .xls file with exactly that behaviour, and it's also implemented by Gnumeric. https://bz.apache.org/ooo/attachment.cgi?id=34659
(In reply to Eike Rathke from comment #5) > @Mike: > Does full (non-online) Excel do vector replication in INDEX()? i.e. with > array-row separator ; semicolon Since I don't really know how to find the respective syntax in Excel, I did the following: I copy-pasted the formula below to a Calc; and saved as XLSX. Then I opened the XLSX in Excel 2016; on my system, it turned out to show as "=INDEX({1\2};0;2)" > does > =INDEX({1;2};0;2) > return error, or does it return column vector {1;2}? (transform separators > to whatever is correct for your Excel..) So, =INDEX({1\2};0;2) gives me 2 in Excel 2016 (I made sure it's not a cached result, entering this formula anew). Since it's not an array formula, I am not sure how to disambiguate possible vector results (and values shown from them by intersection) from scalars? > Similar, does > > =INDEX({1;2};2;2) > > return error, or 2? =INDEX({1\2};2;2) gives me #REF!
(In reply to Eike Rathke from comment #5) > with array-row separator ; semicolon Oh, sorry. I didn't check my Calc settings in the first place; in my case, the ; is the column separator, and | is the row separator. Now let me re-check. So formula "=INDEX({1|2};0;2)" in my Calc arrived as "INDEX({1;2};0;2)" in Excel. It gave me #REF!. Formula "=INDEX({1;2};2;2)" in Excel also gave #REF!.
Thanks Mike, so vector replication is not in effect for both, a vector's element and the entire vector.
Eike Rathke committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/df706f47a2b62248d222911db12c674e6507e5c6 Resolves: tdf#154125 Fix INDEX() one-dimensional vector access It will be available in 7.6.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Pending review https://gerrit.libreoffice.org/c/core/+/148755 for 7-5
Eike Rathke committed a patch related to this issue. It has been pushed to "libreoffice-7-5": https://git.libreoffice.org/core/commit/9820f754f1a8a26568d4d848072fe8bb5f6c04b1 Resolves: tdf#154125 Fix INDEX() one-dimensional vector access It will be available in 7.5.3. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Thank you Eike Rathke. Some nitpicks, if I may. (In reply to Eike Rathke from comment #7) > > Quote: > > row_num and column_num must point to a cell within array; otherwise, INDEX > > returns a #REF! error. > In comment 2 I wrote that they did not document it, geez.. Sincerely, IDK what you meant with that. I simply quoted part of a text that I thought was relevant, and posted the source of what I thought was supposed to be how MS uses/defines/handles the INDEX() function. That was part of their page. If there is another document that is the real source of such info (and incidentally the quoted text was not part of that), I am not aware of it. Now, with LO 7.6alpha from today, when using out-of-scope arguments such as =INDEX({1;2};2;2) ...I would had expected to see the same #REF! error as Excel, considering that ODF uses it for the same purpose (IIUC). Instead, I get Err:502 (Invalid argument). Having said that, older LO versions provide the same Err:502 result too. If error codes are supposed to provide users with helpful clues as to where the problem might reside, I would be more confused with Err:502 than with #REF! for this case. Perhaps I am mixing or misunderstanding the respective meanings of Err:502 and #REF! in LO Calc. Regarding the file linked in comment 0 that triggered this report... I can confirm it is now generating the same numeric result as Excel in the (originally) problematic cell, with the difference of still showing the extra curly braces. I am not saying this is wrong from LO Calc's perspective; I am just describing what I see as user. I can still introduce the same formula in a different cell and the numeric result will still be the same (correct now). I can introduce it in another cell with [ENTER] alone (in which case there are no extra curly braces) or as an array with CSE. IDK which version of Excel generated the original file; perhaps pre-dynamic formula era versions of Excel also show the curly braces in cell G4 whereas newer versions don't(?). Finally, IIUC, there seems to be a mistake (typo) in the definition of INDEX() in ODF. Could some feedback be provided (from here, from TDF, from...) so it can be corrected, at least for future revisions of the standard? Again, thank you Eike and Mike.
Created attachment 185926 [details] sample file
Xisco Fauli committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/25ae332bbafd0cb416a6a5c0db467e501661c679 tdf#154125: sc_uicalc: Add unittest It will be available in 7.6.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
(In reply to ady from comment #14) > Finally, IIUC, there seems to be a mistake (typo) in the definition of > INDEX() in ODF. Could some feedback be provided (from here, from TDF, > from...) so it can be corrected, at least for future revisions of the > standard? Done. https://lists.oasis-open.org/archives/office-comment/202303/msg00003.html
(In reply to Eike Rathke from comment #17) > (In reply to ady from comment #14) > > Finally, IIUC, there seems to be a mistake (typo) in the definition of > > INDEX() in ODF. Could some feedback be provided (from here, from TDF, > > from...) so it can be corrected, at least for future revisions of the > > standard? > Done. > https://lists.oasis-open.org/archives/office-comment/202303/msg00003.html FTR... That email suggested a correction (typo-like) to the ODF standard. ATM, the suggestion to modify the ODF standard according to that email, ended with the complete _removal_ of the problematic paragraph, instead of its modification. That is according to: https://github.com/oasis-tcs/odf-tc/tree/master/docs/odf1.4/part4-formula as of 2023-04-27. Perhaps there is a related issue tracker number with more info (and/or reasoning), but I don't know it. IDK whether the suggested modification (in that email) is better than removing the paragraph entirely. IDK whether removing the paragraph in the ODF standard is adequate. I just wanted to leave this comment for future reference, just in case, and to warn about this matter, in case someone has anything to say, on time, before the next ODF version gets published (whenever that might happen).
From: * https://lists.oasis-open.org/archives/office-comment/202303/msg00003.html From the original ODF 1.3: " If DataSource is a one-dimensional column vector, Column is optional or can be omitted as an empty parameter (two consecutive ;; semicolons). If DataSource is a one-dimensional row vector, Row is optional, which effectively makes Row act as the column offset into the vector, or can be omitted as an empty parameter (two consecutive ;; semicolons). " It was suggested (by Eike) to change it to: " If DataSource is a one-dimensional column vector, Column is optional or can be omitted as an empty parameter (two consecutive ;; semicolons). If DataSource is a one-dimensional row vector, Column is optional, which effectively makes Row act as the column offset into the vector, or can be omitted as an empty parameter (two consecutive ;; semicolons). " The difference is in one word, changing only one of the "Row" to one "Column". I cannot be sure, but I think the suggestion was misinterpreted, considering the similarities of the different paragraphs / sentences. From the Issue Tracker 4143 (and 4144): * https://issues.oasis-open.org/browse/OFFICE-4143 * https://lists.oasis-open.org/archives/office/202303/msg00015.html Quote: " *Patrick:* Andreas - not clear what 3rd paragraph is trying to say or it is saying the same thing as the second paragraph. *Patrick:* Andreas - third paragraph is wrong but the solution is to get rid of it. *Patrick:* Regina - agrees we can delete third paragraph *Patrick:* Create issue for formula - 6.14.6 Index, Eike reporting, delete third paragraph entirely *Patrick:* Regina - Eike's suggestion isn't correct, so just delete third paragraph *Patrick:* Svante agrees *Patrick:* Patrick to: Create issue for formula - 6.14.6 Index, Eike reporting, delete third paragraph entirely *Patrick:* by consent " This ended with the complete deletion of the quoted paragraph.
That third paragraph is of explanatory nature; the function's syntax is tricky, it indeed tried to (wrongly) explain the second paragraph, if they think the paragraph is not needed, shrug..
(In reply to Eike Rathke from comment #20) > That third paragraph is of explanatory nature; the function's syntax is > tricky, it indeed tried to (wrongly) explain the second paragraph The 3rd paragraph _was_ wrong; but it is not clear to me why Regina thinks that the suggested correction is/was also wrong (so, instead, the paragraph was deleted). @Regina, If the suggested correction was wrong, then what would had made this "explanatory" paragraph correct? If the suggested correction was wrong, we (would) have to wonder whether the INDEX() function is now correct (in LO). Will there be interoperability problems (at some point)?