Bug 139612 - Option "Search criteria = and <> must apply to whole cells" doesn't affect to calculation as must.
Summary: Option "Search criteria = and <> must apply to whole cells" doesn't affect to...
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.3 all versions
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.3.0 target:7.2.5 inReleaseNotes
Keywords: bibisectRequest, regression
: 138532 146657 (view as bug list)
Depends on:
Blocks: Calculate Options-Dialog-Calc
  Show dependency treegraph
 
Reported: 2021-01-14 17:23 UTC by Roman Kuznetsov
Modified: 2022-06-25 08:03 UTC (History)
6 users (show)

See Also:
Crash report or crash signature:


Attachments
File example (156.86 KB, application/vnd.oasis.opendocument.spreadsheet)
2021-01-14 17:24 UTC, Roman Kuznetsov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Roman Kuznetsov 2021-01-14 17:23:31 UTC
Description:
Option "Search criteria = and <> must apply to whole cells" from https://help.libreoffice.org/7.1/en-US/text/shared/optionen/01060500.html doesn't affect to calculation as must.

Steps to Reproduce:
1. Open file from attach
2. Look at red B16 cell on Sheet3
3. It contains 0 instead 1

Actual Results:
Option "Search criteria = and <> must apply to whole cells" doesn't affect to calculation as must.

Expected Results:
Option "Search criteria = and <> must apply to whole cells" affects to calculation as must.


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 7.2.0.0.alpha0+ (x64)
Build ID: 94f6765d6ecc3145fa2d266231124003cf953118
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: CL

repro also in 4.3 but not in 4.2 => regression
Comment 1 Roman Kuznetsov 2021-01-14 17:24:09 UTC
Created attachment 168885 [details]
File example
Comment 2 Mike Kaganski 2021-01-14 17:34:18 UTC
Repro with Version: 7.1.0.1 (x64)
Build ID: b585d7d90ab863bf29b2d110c174c0c2a98f3ee4
CPU threads: 12; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL

Affected are all functions that are described in ODF [1] to depend on HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL host-defined property, or to take parameters of type Criterion or Criteria:

Database Functions
AVERAGEIF
AVERAGEIFS
COUNTIF
COUNTIFS
HLOOKUP
LOOKUP
MATCH
VLOOKUP
SUMIF
SUMIFS

Seems that Advanced Filter is affected, too.

[1] http://docs.oasis-open.org/office/OpenDocument/v1.3/OpenDocument-v1.3-part4-formula.html
Comment 3 Igor 2021-01-14 19:39:22 UTC
Advanced or Standard filters work correctly. The database functions and a number of others mentioned above ones ignore the disabled option "Search criteria = and <> must apply to whole cells" until the search criterion is identified using any regex identifier, for example, a pair of parentheses.
In other words, the criterion should not be simple, otherwise, even if the option "Search criteria = and <> must apply to whole cells" is disabled, it is perceived as requiring an exact match.
Comment 4 Igor 2021-01-14 19:48:49 UTC
Will find "red" in the line "Fred" if you specify such a criterion:: "r.d", "red.*", "(red)" or even "()red". But it does not work: "red", because it is perceived as an exact match, contrary to the meaning of the option.
Comment 5 Mike Kaganski 2021-01-15 06:55:45 UTC
(In reply to Igor from comment #3)
> Advanced or Standard filters work correctly.

Please clarify when you assert something; at least provide the version you test, and also listing the steps you tried would be good.

Specifically, with the version mentioned in comment 1, doing this:

1. Select A1:E10
2. Data->More Filters->Advanced Filter
3. Read Filter Criteria From -> Лист3!$A$13:$E$14 -> OK

results in rows 2 to 10 all hidden, i.e. it filters out all rows. However, this works (keeps row 3 shown) in Version: 5.0.0.5 (x64)
Build ID: 1b1a90865e348b492231e1c451437d7a15bb262b
Locale: ru-RU (ru_RU), which means there's a way to bibisect this specific part of the problem (and possibly find the relevant code).
Comment 6 Mike Kaganski 2021-01-15 07:10:14 UTC
Steps from comment 5 work OK (i.e., keep row 3 shown) in Version: 6.0.0.3 (x64)
Build ID: 64a0f66915f38c6217de274f0aa8e15618924765
CPU threads: 12; OS: Windows 10.0; UI render: GL; 
Locale: ru-RU (ru_RU); Calc: CL

but doesn't work in Version: 6.1.0.3 (x64)
Build ID: efb621ed25068d70781dc026f7e9c5187a4decd1
CPU threads: 12; OS: Windows 10.0; UI render: GL; 
Locale: ru-RU (ru_RU); Calc: CL

Bibisection in the range would be useful. It doesn't mean that the bug with spreadsheet functions starts in v.6.0; strictly speaking, comment 5 describes a separate (related, but different) bug. Bibisection of the filter breakage in the 6.1 range would still benefit this.
Comment 7 Mike Kaganski 2021-01-15 07:30:22 UTC
(In reply to Mike Kaganski from comment #6)

Bibisected to https://git.libreoffice.org/core/+/a953fa1c0f6a40a08859570516c511f3a8410a35
> author Michael Meeks <michael.meeks@collabora.com> Sun Apr 08 00:24:15 2018 +0100
> committer Michael Meeks <michael.meeks@collabora.com> Tue Apr 10 22:28:37 2018 +0200
> 
> vlookup - optimize SC_EQUAL and NOT_EQUAL.
> 
> Also don't accept partial matches ie. CONTAINS != EQUAL,
> for VLOOKUP even if document option "search criteria =, <>
> for whole cells" is turned off.
> 
> This also adds a new spreadsheet test file vlookup2.fods
> with the option "search criteria =,<> for whole cells" turned off,
> with VLOOKUP test cases that ensures that partial matches are not
> accepted.
Comment 8 Mike Kaganski 2021-01-15 07:42:53 UTC
Michael,
the commit identified in comment 7 looks strange, directly contradicting ODF standard [1]. Could you please describe what was the rationale? (The commit message does not explain whys, only what was done.) Also the commit doesn't tell it intends to affect Advanced Filter - is it an unwanted regression?

OTOH, the standard only says "The values returned *may* vary depending upon the ... HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL" (emphasis on "may"). Also VLOOKUP is not documented to take "Criterion", only "Any" (which means that Criterion and Criteria are among possible values?). And I don't see a description of the filter in the standard (it might be there though, because filtering allows to store the criteria in the file; I just didn't spend time on investigating it). Eike: do you have an opinion on this?

[1] http://docs.oasis-open.org/office/OpenDocument/v1.3/OpenDocument-v1.3-part4-formula.html#__RefHeading__1018436_715980110
Comment 9 Michael Meeks 2021-01-15 11:00:05 UTC
Very little time to look at this. Looks like I created a unit test sheet for this behavior, as such I imagine that when I created it we spent a chunk of time analyzing what the right behavior is - particularly as regards interoperability.

This code is at/near the center of many inner-loops and is intensely performance critical:

commit afa91e423c2073cec281477f2154291c6d4f739d
Author: Dennis Francis <dennis.francis@collabora.co.uk>
Date:   Sat Apr 7 17:13:42 2018 +0530

    avoid SharedString copy assignment

commit 76bc1a81d089d9f66639c118c4063d87e4422684
Author: Michael Meeks <michael.meeks@collabora.com>
Date:   Thu Apr 5 16:15:32 2018 +0100

    tdf#115490 - avoid transliteration by using SharedString.
    
    Do this only for case insensitive matching for now; optimizing vlookup
    nicely - for me saves 35% of the compute time for 10k rows.

commit 009a326d78fb62a80f9631844af324d0294710b6
Author: Michael Meeks <michael.meeks@collabora.com>
Date:   Thu Apr 5 14:01:05 2018 +0100

    query entry - preparatory pure re-factor.
    
    No need to check lengths for ENDS_WITH - we get the right offsets
    to start with before indexOf, and separate nStrPos for later.


All of these are related. Carefully using the SharedString to do the comparisons was a huge win where we can do that for case-insensitive comparison.

What is the correct behavior: I'd look at interoperability first, common sense next, and then the ODF spec. Someone needs to do that analysis though I'm afraid I don't have time.
Comment 10 Luboš Luňák 2021-11-15 13:51:23 UTC
The commit in comment #7 indeed seems to make the option irrelevant (at least as long as the search is not a regexp). No time to have a look at this more either though.
Comment 11 Commit Notification 2021-11-15 16:57:33 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/b90a5d0f23ffdf1b43c752db424c65bec6f0e8f1

Revert "improve performance of cell equality comparisons)" (tdf#139612)

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Luboš Luňák 2021-11-16 00:16:54 UTC
So I actually ended up having a look at this, and this is indeed a bug. I don't find a good rationale for that commit. It matters only when matching whole cells is off, but the default is on, and has been for ages. MS Office defaults to on as well. Whoever wants good VLOOKUP speed just needs not to shoot himself in the foot by disabling the option.

So as far as I can judge interoperability, common sense and ODF spec all say the commit doesn't make sense.
Comment 13 Commit Notification 2021-11-17 09:11:36 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/8dec2a98ce29251936cd45ebf864a89ff767ee50

revert "vlookup - optimize SC_EQUAL and NOT_EQUAL." (tdf#139612)

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Xisco Faulí 2021-11-18 08:48:29 UTC
i do confirm the issue is fixed in

Version: 7.3.0.0.alpha1+ / LibreOffice Community
Build ID: 4180b2f7855479f9187cd259d57c06e9fedfb802
CPU threads: 4; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

@Luboš Luňák, thanks for fixing this issue!!
Comment 15 Commit Notification 2021-11-18 08:50:10 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/4967ac966be3bc84e5db210f6d5d96d45c877db7

revert "vlookup - optimize SC_EQUAL and NOT_EQUAL." (tdf#139612)

It will be available in 7.2.4.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 16 Commit Notification 2021-11-18 10:29:36 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/258dc1f332155cd565d34b1ce3b51edc3c2fbf64

tdf#139612: sc_subsequent_filters_test2: Add unittest

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 17 Christian Lohmaier 2021-12-06 13:28:48 UTC
7.2.4 was a hotfix release, updating target in status-whiteboard
Comment 18 Stéphane Guillou (stragu) 2022-01-01 13:52:39 UTC
Reviewing 7.3 release notes.

Verified in:

Version: 7.3.0.1 / LibreOffice Community
Build ID: 840fe2f57ae5ad80d62bfa6e25550cb10ddabd1d
CPU threads: 4; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Having the option on or off does make a difference in cell B16 now, and the advanced filter also behaves differently.
Comment 19 Mike Kaganski 2022-01-09 05:37:08 UTC
*** Bug 146657 has been marked as a duplicate of this bug. ***
Comment 20 Luboš Luňák 2022-06-25 08:03:44 UTC
*** Bug 138532 has been marked as a duplicate of this bug. ***