Bug 149832 - VLOOKUP Returns Inaccurate Value when it can't find a lookup string
Summary: VLOOKUP Returns Inaccurate Value when it can't find a lookup string
Status: RESOLVED DUPLICATE of bug 141989
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
7.3.2.2 release
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-07-03 21:44 UTC by zeffur
Modified: 2022-12-28 20:55 UTC (History)
0 users

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description zeffur 2022-07-03 21:44:00 UTC
Description:
The VLOOKUP function returns the wrong value when it cannot find a lookup string in an alphabetized table range. For example:  Suppose you have a 4 column table range with names in the first column & various numbers in the 3 columns to the right of the names column.  When I use the VLOOKUP function to find the name "Tom" & lookup the value in the 4th column and there is no record with "Tom" in it, VLOOKUP searches as far as it can in the alphabetized table range (let's say to the name Sue) & then it returns the value in the forth column for the "Sue" record when it couldn't find a "Tom" record.  That should NOT happen, obviously.  When a search string cannot be found in a table range, the VLOOKUP should return a "N/A" or no value in the VLOOKUP results cell.

Steps to Reproduce:
1.Build a 4 column spreadsheet
2.Make the 1st column text names (e.g.: Ann, Ben, Sue, Zac)
3.Input numbers into columns 2 through 4 for each name record
3.Use the VLOOKUP function to search for "Tom" & lookup the number in column 4 for the "Tom" record.

VLOOKUP will return the number in column 4 for the "Sue" record when it cannot find a "Tom" record.  That's not supposed to happen, obviously.  It should return no value a "N/A" or similar response.

Actual Results:
VLOOKUP will return the number in column 4 for the "Sue" record when it cannot find a "Tom" record.  That's not supposed to happen, obviously.  It should return no value a "N/A" or similar response.

Expected Results:
It should return no value a "N/A" or similar response.


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Version: 7.3.2.2 (x64) / LibreOffice Community
Build ID: 49f2b1bff42cfccbd8f788c8dc32c1c309559be0
CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
Comment 1 Mike Kaganski 2022-07-03 22:14:32 UTC

*** This bug has been marked as a duplicate of bug 144229 ***
Comment 2 Mike Kaganski 2022-07-04 06:53:20 UTC
Let me explain the user error.

As explained in VLOOKUP documentation [1], it has the optional fourth "SortedRangeLookup" argument, that sets its search mode, and is TRUE by default (when omitted). The TRUE value means that the lookup column contains ranges, and the return value is taken from the row with greatest lookup column value *not greater than* lookup value. The values must be sorted in this mode (indeed, your test data fits this latter condition).

Again: note that it is *expected* that when there's no exact match, the last value no greater than lookup value is returned. This is not a bug, it is expected for range lookups like "what tax value was active at 2022-07-04", when the dates column has single dates of tax changes, or "which president was in charge at date X", when dates column contains their inauguration dates.

[1] https://help.libreoffice.org/7.4/en-US/text/scalc/01/04060109.html?DbPAR=CALC#bm_id3153152
Comment 3 zeffur 2022-07-04 15:22:31 UTC
Hello Mike, 

Thank you for your response to my bug report.  I have to tell you that I think the current implementation of the VLOOKUP function is not good.
Here's how I think it should work--just for your information:

When vlookup does not find a matching record in a sorted array, it should NOT return the wrong value by default.  It should return "Not found"  or "N/A" or some other user definable optional description that allows the results to shows that the lookup could not be found.  That should be the default result of the function in that case.  The user wouldn't know/think s/he should have to include an additional argument in the function syntax, such as the current function description shows.  The default result of the function should match what makes common sense.  When no match is found, the function should return that status "Not found" as the default result--not return info from the previous record when the function failed to search further in an ascending-sorted data array.

I also noticed that the function treats "" & 0 as both TRUE & returns "0".  That also doesn't make sense to me.  It should return exactly what is within the cell--either a null or a 0, NOT a 0 for both cases.  I want to know from my results whether or not the found data is really a zero or missing data within the data range.

What can you do to change the function so that it returns the common sense results that one should normally expect to get?

Thank you for your time & consideration.

zeffur
Comment 4 Mike Kaganski 2022-07-04 15:33:04 UTC
Any ideas how it would be better are irrelevant. The function is defined in the ODF standard [1]; and the definition there was not arbitrary, but it followed the long-established rules (coming from Lotus 1-2-3, or maybe even VisiCalc). Every spreadsheet software implements VLOOKUP that way; and doing otherwise would break infinite number of existing documents.

So this is NOTABUG, and this must not change.

[1] https://docs.oasis-open.org/office/OpenDocument/v1.3/OpenDocument-v1.3-part4-formula.html#VLOOKUP
Comment 5 Mike Kaganski 2022-07-04 15:56:39 UTC
FYI: MS Excel, obviously, also saw VLOOKUP syntax not optimal; that's why, when designing their XLOOKUP [1], they made exact match the default.

XLOOKUP is not in Calc yet (tdf#127293). Hopefully you would enjoy it when it arrives.

Note that the "treats "" & 0" is not specific to the discussed function, and the implicit conversion of data is also part of the standard. It will also not change. Spreadsheets have very long history, and their use has lots of inherited stuff, created to try to fit the widest audience possible, with completely different background and expectations.

[1] https://support.office.com/en-us/article/xlookup-function-b7fd680e-6d10-43e6-84f9-88eae8bf5929
Comment 6 Mike Kaganski 2022-12-28 20:55:56 UTC

*** This bug has been marked as a duplicate of bug 141989 ***