Bug 100480 - Improve the description of the "Match case" check box in the Find and Replace dialog
Summary: Improve the description of the "Match case" check box in the Find and Replace...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation
Version: 5.1.3.2 release (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium enhancement
Assignee: Rafael Lima
URL:
Whiteboard: target:7.3.0
Keywords:
Depends on:
Blocks: Find&Replace-Dialog
 
Reported: 2016-06-19 13:39 UTC by Matthias Basler
Modified: 2021-08-25 10:19 UTC

See Also:
Crash report or crash signature:


Description Matthias Basler 2016-06-19 13:39:52 UTC
User story
As someone working with gothic fonts I want to be able to search for *exactly* "long s" (letter ſ) without finding "s" and "ß" in order to be able to quickly find wrong occurrences of this letter, e.g. at the end of a word.

How to reproduce:
1. Enter the following text in an empty document:
   "Ich ſelbſt kann daſ Lied, das ich mit Maß ſinge."

Note that the text contains several "long s" characters, as used in old German texts. Make sure you use an OpenType font that supports this letter, e.g. "Unifraktur Maguntia".

2. Try to automatically replace wrong occurrences of "ſ" at the end of a word. Enter the following into the Find & Replace dialog:
Find: "ſ "
Replace: "s "
You may then set the search to be case sensitive.

Expected Result:
- The search finds (and replaces) "ſ " in the fourth word, no more.

Actual Result:
- The search finds "ſ ", "s " and even "ß " and would, if automatically executed, replace all "ß " with "s ", which is not intended.

I have searched the help but found no indication that this "automagical" matching of similar letters can be turned off. Using regular expressions does not help.
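
The difference between the expected and the actual result can be approximated outside LibreOffice. The sketch below is in Python and assumes that a case-insensitive search behaves roughly like Unicode case folding; that is an assumption for illustration only, not a description of LibreOffice's search code. The variable names are hypothetical.

```python
# Sample sentence from the report: contains long s (U+017F) and sharp s (U+00DF).
text = "Ich ſelbſt kann daſ Lied, das ich mit Maß ſinge."
needle = "ſ "

# Exact code point comparison: only "daſ " contains the needle.
exact_hits = text.count(needle)

# Case-folded comparison (assumed stand-in for a case-insensitive search):
# casefold() maps ſ to "s" and ß to "ss", so "daſ ", "das " and "Maß " all match.
folded_hits = text.casefold().count(needle.casefold())

print(exact_hits, folded_hits)  # prints: 1 3
```

Under that assumption the exact comparison gives the single hit the reporter expected, while the folded comparison also catches "s " and "ß ".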
Comment 1 V Stuart Foote 2016-06-19 17:56:55 UTC
On Windows 10 Pro 64-bit en-US with
Version: 5.1.4.2 (x64)
Build ID: f99d75f39f1c57ebdd7ffc5f42867c12031db97a
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; 
Locale: en-US (en_US)

Works for me when searching for U+017f<space> with "Match Case" checked, and also with both "Regular Expression" and "Match Case" checked when searching for the regex "\u017f\b".

https://help.libreoffice.org/Common/List_of_Regular_Expressions
Comment 2 Matthias Basler 2016-06-19 21:03:30 UTC
Switching on "Match case" seems indeed to have the desired effect.
(Using regex or not does not make any difference.)

It would be good if this fact gets documented on the corresponding help page since imho the current behaviour is not obvious to users. Should we leave this ticket open, possibly rename it?
Comment 3 V Stuart Foote 2016-06-19 23:08:14 UTC
The regex "\b" lets you match at the end of a word, including the last word of a paragraph, which is what you asked for.
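
For illustration, the same pattern can be tried in Python's `re` module. This is only a sketch: LibreOffice uses a different regex engine, so behaviour there may differ in detail, and Python's search is case sensitive by default, which corresponds to having "Match case" checked.

```python
import re

# Sample sentence from the report.
text = "Ich ſelbſt kann daſ Lied, das ich mit Maß ſinge."

# Long s (U+017F) immediately followed by a word boundary.
pattern = re.compile(r"\u017f\b")

# Only the long s at the end of "daſ" sits before a word boundary.
matches = [m.group() for m in pattern.finditer(text)]

# Replace each such long s with a round s.
fixed = pattern.sub("s", text)
print(len(matches), fixed)
```

The long s inside "ſelbſt" and at the start of "ſinge" is followed by another letter, so the boundary does not match there and those occurrences are left alone.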

The "Match case":
"Distinguishes between uppercase and lowercase characters."

So, yes, there is some strange logic regarding the lower case Long S ( ſ U+017f), the Small Letter S ( s U+0073), the Capital Letter S ( S U+0053), the Sharp S ( ß U+00df) and its upper case variant ( ẞ U+1e9e). But checking "Match case" allows you to identify the specific glyph you need without that interfering.

How would you reword the help to make it more informative?


=-ref-=
http://opengrok.libreoffice.org/xref/help/source/text/shared/01/02100000.xhp#98
Comment 4 Matthias Basler 2016-06-20 16:50:08 UTC
On the German help page "Suchen & Ersetzen" (English: "Find & Replace") I suggest the following, translated here from my German wording:

____________________
Groß-/Kleinschreibung ("Match case")
---------------------
Distinguishes between uppercase and lowercase characters. If this option is checked, the search also no longer looks for certain letter variants (such as s, ß and long s when "s" was entered), but only for the exact letters.
____________________

Suggested English text:
____________________
.... If enabled, this also disables the search for certain letter variants (such as s, ß, and long s when "s" is entered).
____________________

Note that I have the German version of the help, so if the English differs and already contains this hint, just ignore my remark.

It is interesting to see that other letter variants like ŝ or š are not found, so I wonder whether the letters considered "equivalent" are documented somewhere?

P.S. Yes, your idea with regex "\b" is good. I had overlooked this one.
Comment 5 Urmas 2016-06-22 12:01:36 UTC
S is the uppercase of ſ, and SS that of ß, so there is no way to distinguish them except by case.
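
These mappings can be checked with any Unicode-aware string library. A minimal sketch in Python, whose string methods follow the Unicode case mappings (the variable names are hypothetical):

```python
long_s = "\u017f"           # ſ LATIN SMALL LETTER LONG S
sharp_s = "\u00df"          # ß LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1e9e"  # ẞ LATIN CAPITAL LETTER SHARP S

print(long_s.upper())           # S
print(sharp_s.upper())          # SS
print(capital_sharp_s.lower())  # ß

# Case folding, as used for caseless comparison, merges all of them with plain s.
print(long_s.casefold(), sharp_s.casefold(), capital_sharp_s.casefold())  # s ss ss
```

Other variants such as ŝ or š are separate letters with their own uppercase forms, which is consistent with them not being found when searching for "s".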
Comment 6 Rafael Lima 2021-08-24 13:53:03 UTC
I believe this is related to how Unicode deals with case mapping:
https://unicode.org/faq/casemap_charprop.html#3

The case of sharp S is defined in:
https://www.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt

The most likely explanation is that, with "Match case" enabled, the search does not apply case mappings when looking for "s". If the option is disabled, all case mappings of the search term are considered matches.

The current help page for LO 7.3 has not been updated yet.
https://help.libreoffice.org/7.3/en-US/text/shared/01/02100000.html
Comment 7 Commit Notification 2021-08-25 05:36:46 UTC
Rafael Lima committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/f801beeaf39c7e7f018b655f28ba8c215ae14763

tdf#100480 Clarify the use of "Match case"