Bug 126294 - Similarity search does not find results when searching multiple words
Summary: Similarity search does not find results when searching multiple words
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 6.2.4.2 release (earliest affected)
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Find&Replace-Dialog
Reported: 2019-07-09 01:32 UTC by Joel M
Modified: 2020-05-30 15:12 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments

Description Joel M 2019-07-09 01:32:16 UTC
Description:
In Writer 6.2.4.2 (x64), the Find & Replace dialog's "similarity search" option fails to find results if you have multiple words or whitespace in the "find" input. It appears to be treating whitespace or extra words like added or exchanged characters.

Steps to Reproduce:
1. Create a new Writer document.
2. Type something. Example: "For example, this document."
3. Open Find & Replace. Search for "document". This works.
4. Check "Similarity search" and set "Exchange characters" to 2. Search again. This works.
5. Search for "docummmt" instead. This works.
6. Search for "this docummmt" or even just " docummmt". This does not work.
7. Increase the "Add characters" value, the "Exchange characters" value, or both. With large enough values the search does find a match, but it is unclear which combination is required.

Actual Results:
"Search key not found"

Expected Results:
Highlight 1 result


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Dieter 2019-07-09 06:58:25 UTC
(In reply to Joel M from comment #0)
> 6. Search for "this docummmt" or even just " docummmt". This does not work.

" docummmt" works for me, but not "this docummmt" using

Version: 6.4.0.0.alpha0+ (x64)
Build ID: ae823e4633a76d13cebc6432b9e44b9b2862326b
CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2019-06-26_23:06:07
Locale: de-DE (de_DE); UI-Language: en-US
Calc: threaded

and also in

Version: 6.2.5.2 (x64)
Build ID: 1ec314fa52f458adc18c4f025c545a4e8b22c159
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: de-DE
Calc: threaded

Thank you for reporting the bug. To be certain the reported issue is not related to corruption in the user profile, could you please reset your LibreOffice profile (https://wiki.documentfoundation.org/UserProfile) and re-test?

I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the issue is still present.
Comment 2 Joel M 2019-07-18 22:26:08 UTC
I'll try the profile reset thing. Meanwhile, I've reproduced the bug on:

Windows 8
LibreOffice 5.0.2.2

So this is apparently not new. It looks like increasing the "Remove characters" count has a strong effect on whether there is a hit, but you also need to increase either "Add characters" or "Exchange characters" when searching for "this docummmt".

Presumably it's doing something like looking at each word in the document individually but comparing it to the whole search string.
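
A minimal Python sketch of that guess, for illustration only. It is not LibreOffice's code and does not reproduce its actual weighting; the function names (similar, matches_per_word, matches_anywhere) and the simple "at most N exchanges / adds / removes" model are assumptions made up for this example. It contrasts comparing the whole search string against each word with comparing it against any stretch of the text.

```python
import re
from functools import lru_cache


def similar(term, candidate, exchange, add, remove):
    """True if candidate can be reached from term using at most
    `exchange` substituted, `add` extra and `remove` missing characters.
    Hypothetical toy model, not LibreOffice's algorithm."""
    @lru_cache(maxsize=None)
    def go(i, j, x, a, r):
        if i == len(term) and j == len(candidate):
            return True
        both = i < len(term) and j < len(candidate)
        if both and term[i] == candidate[j] and go(i + 1, j + 1, x, a, r):
            return True  # characters agree, no budget used
        if both and x > 0 and go(i + 1, j + 1, x - 1, a, r):
            return True  # exchange one character
        if j < len(candidate) and a > 0 and go(i, j + 1, x, a - 1, r):
            return True  # candidate has an extra character
        if i < len(term) and r > 0 and go(i + 1, j, x, a, r - 1):
            return True  # candidate lacks a character of the term
        return False

    return go(0, 0, exchange, add, remove)


def matches_per_word(term, text, exchange, add, remove):
    """Suspected behavior: each word is compared to the whole search string."""
    words = re.findall(r"\w+", text.lower())
    return any(similar(term.lower(), w, exchange, add, remove) for w in words)


def matches_anywhere(term, text, exchange, add, remove):
    """Expected behavior: the search string may match any stretch of text."""
    term, text = term.lower(), text.lower()
    n = len(text)
    return any(similar(term, text[s:e], exchange, add, remove)
               for s in range(n) for e in range(s + 1, n + 1))


doc = "For example, this document."
print(matches_per_word("docummmt", doc, 2, 0, 0))
print(matches_per_word("this docummmt", doc, 2, 0, 0))
print(matches_per_word("this docummmt", doc, 2, 0, 5))
print(matches_anywhere("this docummmt", doc, 2, 0, 0))
```

In this toy model the per-word comparison only matches "this docummmt" once the remove budget is large enough to swallow "this " (five characters), which would be consistent with "Remove characters" having such a strong effect. A comparison against arbitrary stretches of text matches with just two exchanges.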
Comment 3 QA Administrators 2019-07-19 02:54:00 UTC Comment hidden (obsolete)
Comment 4 Joel M 2019-07-27 18:19:31 UTC
(In reply to Dieter Praas from comment #1) 
> Thank you for reporting the bug. To be certain the reported issue is not
> related to corruption in the user profile, could you please reset your
> Libreoffice profile (https://wiki.documentfoundation.org/UserProfile) and
> re-test?
> 
> I have set the bug's status to 'NEEDINFO'. Please change it back to
> 'UNCONFIRMED' if the issue is still present

Per the directions you linked, I restarted LO in safe mode (using the default setting) on the Windows 10 / LO 6.2.4.2 (x64) machine. I noticed the toolbars and icons were different, so I think safe mode was properly ignoring my profile.

I followed my steps above and got the same result: similarity search seems to choke when searches involve whitespace. Increasing the "Remove characters" count did result in a match, even though it shouldn't have been necessary. (So my original step #7 may be wrong -- it seems like remove characters is more important than add or exchange for this behavior.)
Comment 5 Dieter 2019-07-28 04:24:04 UTC
Set to NEW. It still works for me with an additional whitespace character, but fails when searching for two words (I tried with "this documnt").