Bug 167314 - Search Function Fails to Locate Text Containing Word Joiner (U+2060)
Summary: Search Function Fails to Locate Text Containing Word Joiner (U+2060)
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Find-Search
Reported: 2025-07-01 03:59 UTC by jteera5
Modified: 2025-07-01 19:16 UTC

See Also:
Crash report or crash signature:


Description jteera5 2025-07-01 03:59:40 UTC
The LibreOffice Writer search function (Find toolbar, Ctrl+F; Find & Replace dialog, Ctrl+H) fails to locate text strings that contain an embedded Word Joiner (U+2060) character, even when the "Show Formatting Marks" option is enabled and the character is visibly present.

Steps to Reproduce:
1. Open LibreOffice Writer.
2. Type a word, for example, "test".
3. Place the cursor in the middle of the word (e.g., between 'e' and 's').
4. Insert a Word Joiner (U+2060) character. This can typically be done via Insert > Formatting Mark > Word Joiner, or by typing U+2060 and then pressing Alt+X if your system supports it. If "Show Formatting Marks" (View > Formatting Marks) is enabled, a small bar or similar indicator should appear.

   Example resulting text (with formatting marks visible): te•st (where '•' represents the Word Joiner)

5. Press Ctrl+F to open the Find toolbar (or Ctrl+H to open the Find & Replace dialog).
6. In the "Find" field, type the full word, "test".
7. Click "Find Next".

Actual Results:
The search function does not find the word "test" that contains the embedded Word Joiner.

Expected Results:
The search function should find the word "test" even when it contains an embedded Word Joiner, as the Word Joiner is a non-displaying formatting mark intended to control line breaking, not to alter the textual content for search purposes. Users should be able to locate text strings regardless of the presence of such non-semantic formatting characters.

Workaround (if any):
Currently, no effective workaround exists other than manually inspecting text for formatting marks or removing them before searching. Searching for parts of the word (e.g., "te" or "st") might yield results, but this is not a viable solution for finding the complete string.
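
A regular-expression search that allows an optional U+2060 between letters might serve as a stopgap. The following is only a sketch in Python, not LibreOffice code: the helper name `wj_tolerant_pattern` is hypothetical, and whether the Find & Replace regex engine accepts an equivalent pattern is an assumption that was not verified for this report.

```python
import re

WORD_JOINER = "\u2060"  # U+2060 WORD JOINER

def wj_tolerant_pattern(word: str) -> str:
    """Build a regex source matching `word` with an optional
    Word Joiner between any two adjacent characters.
    (Hypothetical helper, for illustration only.)"""
    optional_wj = WORD_JOINER + "?"
    return optional_wj.join(re.escape(ch) for ch in word)

# Sample text: one "test" with an embedded Word Joiner, one without.
text = "a te\u2060st and a plain test"

pattern = re.compile(wj_tolerant_pattern("test"))
matches = [m.group(0) for m in pattern.finditer(text)]

# Both occurrences are found; the first still contains U+2060.
print([m.encode("unicode_escape").decode() for m in matches])
```

This keeps the document text untouched, at the cost of a pattern that has to be built by hand for every search term.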

Severity:
Medium - This issue significantly impacts user experience by making it difficult or impossible to locate specific text within documents, especially in cases where Word Joiners are used for specific formatting requirements (e.g., preventing line breaks in specific compound words or codes).

Additional Notes:
This issue seems to apply specifically to the Word Joiner (U+2060) but may extend to other zero-width formatting characters. It suggests that the search algorithm might be treating these characters as separators or entirely ignoring them in a way that breaks standard string matching.
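
To illustrate the expected behaviour, here is a minimal Python sketch of a search that skips such characters while still reporting positions in the original text. It is not how LibreOffice implements search; the function name `find_ignoring` and the set `IGNORABLE` are assumptions made for this example, and which characters belong in that set is an open question.

```python
# Characters skipped during matching. This set is an assumption for the
# example: Word Joiner, Zero Width Space, Zero Width No-Break Space.
IGNORABLE = {"\u2060", "\u200b", "\ufeff"}

def find_ignoring(haystack, needle, ignorable=IGNORABLE):
    """Return (start, end) of the first match of `needle` in `haystack`,
    ignoring `ignorable` characters, as indices into the ORIGINAL
    haystack. Return None when there is no match."""
    # Keep the original index of every character that survives stripping.
    kept = [(i, ch) for i, ch in enumerate(haystack) if ch not in ignorable]
    stripped = "".join(ch for _, ch in kept)
    needle = "".join(ch for ch in needle if ch not in ignorable)
    if not needle:
        return None
    pos = stripped.find(needle)
    if pos < 0:
        return None
    start = kept[pos][0]
    end = kept[pos + len(needle) - 1][0] + 1
    return start, end

doc = "a te\u2060st here"

literal = doc.find("test")         # plain substring search misses the word
span = find_ignoring(doc, "test")  # match span in the original string
print(literal, span)
```

The returned span covers the embedded Word Joiner, so a selection or replacement based on it would include the character rather than silently dropping it.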
Comment 1 V Stuart Foote 2025-07-01 18:46:48 UTC
But you *can* search for a string that contains the WJ U+2060 regardless of its position in a word or sentence. 

Why would you expect WJ to be ignored in a search string attempting to find its surrounding text?

Just search for and remove the WJ, and then after editing apply it back if needed?

I didn't trace how we use the WJ (icu lib or as editshell handling), but the Find Bar tb is not the UI component to be adjusting. Rather it should be in Find & Release dialog, maybe worked in with the ignore diacritic handling.
Comment 2 V Stuart Foote 2025-07-01 19:16:58 UTC
make that Find & Replace dialog (<Ctrl>+H)
                 ^^^^^^^