Bug 152959 - Find and Replace can't use the lowercase Unicode literal for soft hyphen
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.0.6.2 release
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Stéphane Guillou (stragu)
URL:
Whiteboard: target:7.6.0 target:7.5.3
Keywords:
Depends on:
Blocks: Find&Replace-Regex
Reported: 2023-01-10 10:38 UTC by Jerzy Moruś
Modified: 2023-04-02 16:17 UTC

See Also:
Crash report or crash signature:


Attachments
example ODT to test (11.77 KB, application/vnd.oasis.opendocument.text)
2023-01-10 13:09 UTC, Stéphane Guillou (stragu)

Description Jerzy Moruś 2023-01-10 10:38:39 UTC
Please investigate Find and Replace, as it does not work correctly in some cases.

Case: soft-hyphen search.
1. For unknown reasons, the Soft hyphen (U+00ad) search fails when \u00ad is entered as the search string, but is found when \u00AD is entered.
2. Typing the required string as a character using ALT+0173 or inserting it directly from the special character table (SHIFT+CTRL+S) will find it, provided "Regular expressions" is disabled.
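For comparison, regex engines usually treat the hex digits of a \uXXXX escape case-insensitively. The following is a minimal sketch using Python's re module, which is only a stand-in here and not the engine LibreOffice uses; the sample word is an assumption made for illustration.

```python
import re

SOFT_HYPHEN = "\u00ad"
# Hypothetical sample text: a word with a soft hyphen after "co".
text = "co" + SOFT_HYPHEN + "operate"

# The pattern strings contain a literal backslash escape, which the regex
# engine itself resolves. Both spellings denote code point U+00AD.
lower = re.search(r"\u00ad", text)
upper = re.search(r"\u00AD", text)

found_lower = lower is not None
found_upper = upper is not None

# Searching for the character itself, without regex, also succeeds.
literal_found = SOFT_HYPHEN in text
```

In this stand-in engine both spellings match at the same position, which is the behavior the report expects from Find and Replace.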

Case: national character search.
National characters such as Ż (U+017b), Û (U+00db) and Ú (U+00da) will not be found with \u017b, \u00db or \u00da unless "Diacritic-sensitive" is selected. This option should be irrelevant when searching for a specific hexadecimal code.
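One possible explanation, which is an assumption and not confirmed anywhere in this report, is that a diacritic-insensitive search folds the document text but still resolves the escape in the pattern to the accented character. The sketch below reproduces that mismatch in Python; strip_diacritics is a hypothetical helper and says nothing about how LibreOffice implements the option.

```python
import re
import unicodedata


def strip_diacritics(s: str) -> str:
    """Decompose to NFD and drop combining marks (hypothetical folding step)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


text = "Ż Û Ú"                    # U+017B, U+00DB, U+00DA
folded = strip_diacritics(text)   # base letters only

pattern = r"\u017b"               # escape for Ż, resolved by the regex engine

# Diacritic-sensitive: pattern and text are compared as written.
match_sensitive = re.search(pattern, text) is not None

# Assumed insensitive mode: only the text is folded, so the accented
# character named by the escape no longer occurs in it.
match_text_folded_only = re.search(pattern, folded) is not None
```

Under this assumption the escape matches only in the sensitive mode, which is consistent with the behavior described above.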
Comment 1 Stéphane Guillou (stragu) 2023-01-10 13:05:05 UTC
Thanks Jerzy!

The special handling of soft hyphen search was done for bug 73660.
Michael, as you fixed that one back in 2016, do you think the lowercase version should be added, for consistency? For example, a full stop can be found with both \u002e and \u002E.

See this comment to see what is accepted:
https://bugs.documentfoundation.org/show_bug.cgi?id=64495#c13

Regarding the issue with the diacritic-sensitive option: probably better to open a dedicated bug report so it is tracked separately, as I am sure there will be more discussion needed.

Tested on:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 894efac210a3871214d95a52c322b0bee40f00ba
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded
Comment 2 Stéphane Guillou (stragu) 2023-01-10 13:09:57 UTC
Created attachment 184558
example ODT to test

(this attachment can also be used for a separate bug report about the accented characters, by mentioning "attachment xyz", with the number above)
Comment 3 Stéphane Guillou (stragu) 2023-01-10 13:11:31 UTC
(In reply to Stéphane Guillou (stragu) from comment #1)
> The special handling of soft hyphen search was done for bug 73660.

Apologies, the bug in question is actually bug 64495.
Comment 4 Eike Rathke 2023-01-11 13:48:59 UTC
Problem is likely that the soft-hyphen special treatment matches the notation case-sensitive at
https://opengrok.libreoffice.org/xref/core/sw/source/core/crsr/findtxt.cxx?r=dd90710a#745
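A minimal Python sketch of how a case-sensitive check on the notation could produce this symptom. It assumes, without confirmation from this report, that soft hyphens are removed from the searched text unless the pattern contains a recognized literal; all names are hypothetical and the real code is C++ in findtxt.cxx.

```python
import re

SOFT_HYPHEN = "\u00ad"

# Hypothetical list of recognized notations; only the uppercase spelling
# is present, and it is compared case-sensitively.
RECOGNIZED_LITERALS = ("\\u00AD",)


def keeps_soft_hyphens(pattern: str) -> bool:
    """True if the pattern text contains a recognized soft hyphen literal."""
    return any(literal in pattern for literal in RECOGNIZED_LITERALS)


def find(pattern: str, text: str) -> bool:
    # Assumed special treatment: drop soft hyphens from the text unless
    # the pattern explicitly asks for them.
    if not keeps_soft_hyphens(pattern):
        text = text.replace(SOFT_HYPHEN, "")
    return re.search(pattern, text) is not None


document = "co" + SOFT_HYPHEN + "operate"
upper_found = find("\\u00AD", document)
lower_found = find("\\u00ad", document)
```

With these assumptions the uppercase spelling is found and the lowercase one is not, even though the regex engine treats both escapes as the same character.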
Comment 5 Stéphane Guillou (stragu) 2023-01-17 08:49:29 UTC
Submitted a patch: https://gerrit.libreoffice.org/c/core/+/145662

I'm sure there are more elegant ways to accept both uppercase and lowercase, but because the uppercase prefix "\U00" is not accepted for other characters anyway, I only added one line for the all-lowercase soft hyphen Unicode literal.
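The trade-off can be sketched in Python as follows. This is an illustration and not the submitted patch: the function names and the list are hypothetical, and the blanket case-folding variant is shown only as one alternative that might have been considered.

```python
# Hypothetical explicit list: the existing uppercase spelling plus the
# all-lowercase one added by the fix.
ACCEPTED_LITERALS = ("\\u00AD", "\\u00ad")


def explicit_check(pattern: str) -> bool:
    """Accept only the listed spellings, compared case-sensitively."""
    return any(literal in pattern for literal in ACCEPTED_LITERALS)


def folded_check(pattern: str) -> bool:
    """Alternative: lowercase the whole pattern before comparing."""
    return "\\u00ad" in pattern.lower()


candidates = ["\\u00AD", "\\u00ad", "\\U00AD", "\\u00Ad"]
explicit_results = {c: explicit_check(c) for c in candidates}
folded_results = {c: folded_check(c) for c in candidates}
```

The folded variant would also accept the "\U00AD" spelling with an uppercase prefix, which the comment above says is not accepted for other characters. The explicit list avoids that, at the cost of still rejecting mixed-case digits such as "\u00Ad".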
Comment 6 Commit Notification 2023-03-15 10:20:44 UTC
Stéphane Guillou committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/02c352d7fdb01e7b4899cbd3c5d62b81019ddb15

tdf#152959 sw: allow lowercase unicode literal for soft hyphen

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Commit Notification 2023-03-16 09:16:30 UTC
Stéphane Guillou committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/6b003da43a265a431b2a176e4df637523d10fefb

tdf#152959 sw: allow lowercase unicode literal for soft hyphen

It will be available in 7.5.3.
Comment 8 Stéphane Guillou (stragu) 2023-03-16 16:05:21 UTC
Should be fixed in 7.6.0 and 7.5.3.

Please open another ticket for the diacritic-sensitive option, as I imagine it would be a more controversial change.

Thank you!