Description:
The Similarity Search dialog has some confusing language.

"Exchange characters" does not really mean that characters will be exchanged. The actual meaning is "(the maximum number of) exchangeable characters".
https://translations.documentfoundation.org/translate/libo_ui-master/cuimessages/en/?checksum=cc0ffa5f06739be3

"Add characters" does not really mean that characters will be added. The actual meaning is "(the maximum number of) additional characters".
https://translations.documentfoundation.org/translate/libo_ui-master/cuimessages/en/?checksum=64413dc15303310b

"Remove characters" does not really mean that characters will be removed. The actual meaning is "(the maximum number of) missing characters".
https://translations.documentfoundation.org/translate/libo_ui-master/cuimessages/en/?checksum=d0c4cae6b614b3eb

See Help for more information:
https://help.libreoffice.org/latest/en-US/text/shared/01/02100100.html

Steps to Reproduce:
1. In Writer, go to Edit -> Find & Replace
2. Check the Similarity search check box
3. Click the Similarities... button.

Actual Results:
The strings "Exchange characters", "Add characters" and "Remove characters" are used in the UI.

Expected Results:
More descriptive strings such as "Exchangeable characters:", "Additional characters:" and "Missing characters:" are used instead.

Reproducible: Always

User Profile Reset: No

Additional Info:
n/a
No strong opinions from my side.
Here the use of tooltips is very helpful. But I would change the first one (as well as the description in the help page).

For "Exchange characters" the tooltip is "Enter the number of characters in the search term that can be exchanged". Maybe a better tooltip would be "Enter the number of characters that can differ from the search term".

Here are some suggestions for the labels:
"Number of different characters"
"Number of additional characters"
"Number of missing characters"

Or maybe a reduced version:

"Different by [ ] characters"
"Larger by [ ] characters"
"Shorter by [ ] characters"

Where [ ] is where the entry box is positioned.
(In reply to Rafael Lima from comment #2)
> Here the use of tooltips is very helpful. But I would change the first one
> (as well as the description in the help page)
>
> For "Exchange characters" the tooltip is "Enter the number of characters in
> the search term that can be exchanged". Maybe a better tooltip would be
> "Enter the number of characters that can differ from the search term"
>
> Here are some suggestions for the labels:
> "Number of different characters"
> "Number of additional characters"
> "Number of missing characters"

I agree that "different" is a better word here.

> Or maybe a reduced version
>
> "Different by [ ] characters"
> "Larger by [ ] characters"
> "Shorter by [ ] characters"
>
> Where [ ] is where the entry box is positioned.

This would work well for English (and many other languages), but not necessarily for all languages. I think hardcoding a particular sentence structure in the UI should be avoided.
(In reply to Tuomas Hietala from comment #3)
> > "Different by [ ] characters"
>
> This would work well for English (and many other languages), but not
> necessarily for all languages. I think hardcoding a particular sentence
> structure in the UI should be avoided.

Isn't this kind of a <label><value><unit> sequence? It would be easier to refuse this good idea if we had an example where it's not working.
(In reply to Rafael Lima from comment #2)
> For "Exchange characters" the tooltip is "Enter the number of characters in
> the search term that can be exchanged". Maybe a better tooltip would be
> "Enter the number of characters that can differ from the search term"

That is not what it does. "ab" differs from "a" by one character but there is no character replaced/exchanged/substituted. There is one deletion if going from "ab" to "a".

> Here are some suggestions for the labels:
> "Number of different characters"
> "Number of additional characters"
> "Number of missing characters"
>
> Or maybe a reduced version
>
> "Different by [ ] characters"
> "Larger by [ ] characters"
> "Shorter by [ ] characters"

I think that's not any better. It may be hard to describe in three words for each option what it actually does, but "different by" is too vague and does not describe the replacement/substitution parameter; "larger by" sounds as if the matched string may contain x more characters but that is only one possible effect of the parameter; similar for "shorter by". Also, characters are not "missing".

For choosing the wording it may be important to know roughly about the Weighted Levenshtein Distance (WLD) algorithm. It looks for a possible transformation of the search term to a text string by measuring an "edit distance". That transformation can be accomplished by different operations, for example "ab" can be transformed to "ac" by either replacing/substituting 'b' with 'c' (distance of 1), or by removing/deleting 'b' and then adding/inserting 'c' (distance of 2).
See also https://en.wikipedia.org/wiki/Levenshtein_distance
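
To make the "ab" -> "ac" example concrete, here is a minimal sketch of a weighted edit distance in Python. It is the textbook dynamic-programming form, not LibreOffice's actual WLD code; the function name and the three cost parameters are made up for illustration.

```python
def weighted_levenshtein(source, target,
                         replace_cost=1, insert_cost=1, delete_cost=1):
    """Cheapest cost of turning `source` into `target`.

    Illustrative only: the names and default costs are assumptions,
    not taken from the LibreOffice sources.
    """
    # Row for the empty source prefix: only insertions are possible.
    prev = [j * insert_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        # First column: delete the first i characters of the source.
        cur = [i * delete_cost]
        for j, t in enumerate(target, 1):
            cur.append(min(
                prev[j - 1] + (0 if s == t else replace_cost),  # keep or replace
                cur[j - 1] + insert_cost,                       # insert t
                prev[j] + delete_cost,                          # delete s
            ))
        prev = cur
    return prev[-1]


# Replacing 'b' with 'c' is one operation.
print(weighted_levenshtein("ab", "ac"))
# If replacing is made expensive, delete 'b' + insert 'c' is chosen instead.
print(weighted_levenshtein("ab", "ac", replace_cost=3))
```

With equal costs the replacement path wins (distance 1); once replacing costs more than a deletion plus an insertion, the distance becomes 2, which is the second path described above.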
Levenshtein talks about "Insert/Delete/Replace [n]" characters. We use "Add/Remove/Exchange characters [n]", indeed not much but at least a little improvement.
(In reply to Heiko Tietze from comment #4)
> (In reply to Tuomas Hietala from comment #3)
> > > "Different by [ ] characters"
> >
> > This would work well for English (and many other languages), but not
> > necessarily for all languages. I think hardcoding a particular sentence
> > structure in the UI should be avoided.
>
> Isn't this kind of a <label><value><unit> sequence? It would be easier to
> refuse this good idea if we had an example where it's not working.

On second thought, this actually wouldn't be a problem here: the structure <string 1>[input box]<string 2> accommodates any kind of word order, because either of the strings can be left empty if necessary.
The labels were the same in:

Version: 6.3.6.2
Build ID: 2196df99b074d8a661f4036fca8fa0cbfa33a497
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3;
Locale: en-AU (en_AU.UTF-8); UI-Language: en-US
Calc: threaded

I would be happy with:
- Exchange at most [] characters
- Add at most [] characters
- Remove at most [] characters

(In reply to Eike Rathke from comment #5)
> For choosing the wording it may be important to know roughly about the
> Weighted Levenshtein Distance (WLD) algorithm. It looks for a possible
> transformation of the search term to a text string by measuring an "edit
> distance". That transformation can be accomplished by different operations,
> for example "ab" can be transformed to "ac" by either replacing/substituting
> 'b' with 'c' (distance of 1), or by removing/deleting 'b' and then
> adding/inserting 'c' (distance of 2).
> See also https://en.wikipedia.org/wiki/Levenshtein_distance

In that sense, the documentation could be improved:
https://help.libreoffice.org/7.5/en-US/text/shared/01/02100100.html?System=UNIX&DbPAR=WRITER&HID=cui/ui/similaritysearchdialog/grid1#bm_id3154815
Using wording like "how many times a character can be added when computing the edit distance between the search string and the matched string".

Related to this, there's bug 129492
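
As a rough illustration of the "at most [] characters" reading, the sketch below checks whether a search term can be turned into a candidate string without exceeding separate limits for exchanged, added and removed characters. This is a simplified model for discussing the wording only; it is not how LibreOffice combines the three values internally, and the function and parameter names are hypothetical.

```python
from functools import lru_cache


def matches_within_limits(term, text, max_exchange, max_add, max_remove):
    """True if `term` can be edited into `text` within the three limits.

    Hypothetical helper for illustration; not LibreOffice's algorithm.
    """
    @lru_cache(maxsize=None)
    def walk(i, j, exchanged, added, removed):
        if i == len(term) and j == len(text):
            return True
        both_left = i < len(term) and j < len(text)
        # Characters agree: advance in both strings at no cost.
        if both_left and term[i] == text[j]:
            if walk(i + 1, j + 1, exchanged, added, removed):
                return True
        # Exchange one character of the term for a different one.
        if both_left and exchanged < max_exchange:
            if walk(i + 1, j + 1, exchanged + 1, added, removed):
                return True
        # Add a character that is not in the term.
        if j < len(text) and added < max_add:
            if walk(i, j + 1, exchanged, added + 1, removed):
                return True
        # Remove a character of the term.
        if i < len(term) and removed < max_remove:
            if walk(i + 1, j, exchanged, added, removed + 1):
                return True
        return False

    return walk(0, 0, 0, 0, 0)


# "ab" -> "ac" via one exchange, or via one removal plus one addition.
print(matches_within_limits("ab", "ac", 1, 0, 0))
print(matches_within_limits("ab", "ac", 0, 1, 1))
# "ab" -> "a" needs a removal; an exchange alone is not enough.
print(matches_within_limits("ab", "a", 1, 0, 0))
```

In this model, "ab" and "a" do not match when only exchanges are allowed, which mirrors the point in comment #5 that the first value is about substitutions rather than about any difference.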
We discussed this topic in the design meeting. Basically, shorter labels are better than verbose ones. The idea of treating "characters" as a kind of unit goes in this direction. Whether the ultimate string is "Exchange at most" or "Different by" or just "Add" should be decided by native speakers (or the one who implements it). Personally I prefer the second option.

Code pointer: cui/uiconfig/ui/similaritysearchdialog.ui