Bug 92054 - VLOOKUP may return incorrect values if searched column has similar items
Summary: VLOOKUP may return incorrect values if searched column has similar items
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.3.7.2 release
Hardware: All Windows (All)
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-06-13 22:50 UTC by wrigh
Modified: 2015-06-14 01:32 UTC (History)
2 users (show)

See Also:
Crash report or crash signature:


Attachments
Examples of VLOOKUP returning incorrect values (52.54 KB, application/vnd.oasis.opendocument.spreadsheet)
2015-06-13 22:50 UTC, wrigh
Details

Note You need to log in before you can comment on or make changes to this bug.
Description wrigh 2015-06-13 22:50:10 UTC
Created attachment 116507 [details]
Examples of VLOOKUP returning incorrect values

If the values in the array to be searched are sufficiently similar, then the value found by VLOOKUP may be wrong.
Different errors occur according to whether the table is treated as sorted or unsorted.
===================================
===================================

The attached table consists of a sorted table of 83 codes in columns G & H  (named CTab), starting at row 4. Column H merely holds the row number for identification.

Using row 4 as an example, G4 holds "B11WZ"  H4 holds "4"

Column A contains  an exact copy of Col G   from row 4.  (A4 is  "B11WZ")
============================
Column B  searches for the value from Col A in CTab, and returns (if correct) the value from Col 2 of CTab, i.e. Col H.  The sort order is defined as "not sorted")        

    Example   B4 cell  reads    "=VLOOKUP(A4,CTab,2,0)"
     In this case, cell B4 holds "4"   - which is correct. 
    (Not all values in this column are correct! - see below)

=============================
Column C  searches for the value from Col A in CTab, and returns (if correct) the value from Col 2 of CTab, i.e. Col H. The sort order is defined as "sorted" 

    Example   B4 cell  reads    "=VLOOKUP(A4,CTab,2,1)"
     In this case, cell C4 holds "6"   - which is NOT correct. 
    (Not all values in this column are correct! - see below)
=================================
Columns above G try different cases to show the same faults, and use VLOOKUP to return the search value, either using CTab or a column
Example 1: K4 formula is "VLOOKUP(G4,G$4:G$86,1,0)" 
           G4 value is "B11Wz"  so K4 should also be "B11Wz"   It is.
Example 2: N4 formula is "VLOOKUP(G4,G$4:G$86,1,1)" 
           G4 value is "B11Wz"  so K4 should also be "B11Wz"   It is not.
                    In this case the value of G6 ("B11Wz135") is returned in K4 
  These results match the results in Columns B,C
=================================
ERRORS
======
Column B.  (Assumes table Not sorted).
A set of 11 codes from  "C21Wz" to "C27YYA3"  give the line values corresponding to 
    "BC21Wz" to "BC27YYA3"   
       Each of the second set is "B" + (Value in first set)

NOTE: There is also a similar set
      "RC21Wz" to "RC27YYA3"    BUT The values found in this case are correct 

----------------
Column C (Assumes table sorted).
   There are 18 errors
I THINK that they are as follows:
  IF there is a sequence of codes where
       The first 4 (or is it 5?) characters are the same
     THEN
       The value returned for the first of the series will be the value for the last. (An error)
   ENDIF

Example: Rows 4,5,6,7 of CTab are
B11Wz     4
B11Wz100  5
B11Wz135  6
B12Cz     7

but Column C reads 
          6
          5
          6
          7
 as in the rule above

=================================

PLEASE NOTE:
A ROUGHLY SIMILAR VERSION OF THIS HAS GONE TO APACHE.
Comment 1 Joel Madero 2015-06-13 22:59:33 UTC
This example is incredibly complex and not entirely useful for diagnostic purposes. Is there any way that you can provide a very simple example? Without one it may take a lot longer to confirm the issue and even longer to fix it (devs don't want to look at xml files that are lots of pages long with lots of extra stuff going on).

Please provide a simple example with no extra formatting (such as colors), and provide the least amount of data possible to demonstrate the issue.

Marking as NEEDINFO - if you are unable to do so set it back to UNCONFIRMED and just know it may take a long time to look at it because the file is so complex.
Comment 2 m_a_riosv 2015-06-14 01:32:36 UTC
The function returns the last that is equal or less than searched value when sort order is defined as sorted ascending.

As there is not enable the option:

Menu/Tools/Options/LibreOffice calc/Calculate - Search criteria = and <> must apply to the whole cells.

the function return the last equal or less containing the searched value, enable the option and results become corrects.

In my experience it's not a very good idea use a sort order in the function when keys for search are strings or unique keys. On the other hand useful.

Please review the help:
https://help.libreoffice.org/index.php?title=4.4/Calc/Spreadsheet_Functions&Language=en-US&System=WIN&Version=4.4#bm_id3152809.

Resolved as not a bug. Please if you are not agree, please reopen it.