Bug 140031 - Find & Replace can't tell difference between straight and curly apostrophes (recent regression) WORKAROUND: enable "Regular expressions"
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.0.4.2 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks: Find-Search
Reported: 2021-01-31 10:48 UTC by R. Green
Modified: 2022-01-31 10:15 UTC
5 users

See Also:
Crash report or crash signature:
Regression By:


Attachments
Writer file showing find apostrophe issue (8.94 KB, application/vnd.oasis.opendocument.text)
2021-01-31 10:48 UTC, R. Green
Description R. Green 2021-01-31 10:48:05 UTC
Created attachment 169319
Writer file showing find apostrophe issue

Version: 7.0.4.2 (x64)
Build ID: dcf040e67528d9187c66b2379df5ea4407429775
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: en-GB (en_GB); UI: en-GB
Calc: threaded

A bad regression which seems to have arisen very recently:

1. Open the attached Writer file.
2. Open the Find & Replace dialog. In the Find box, enter a straight apostrophe and press "Find Next" repeatedly.

EXPECTED RESULT: The search only highlights straight apostrophes.
ACTUAL RESULT: The search highlights all apostrophes, whether straight or curly.

The search still works, as expected, for straight and curly (double) quotation marks. Same problem with 7.0.4.1.
Comment 1 R. Green 2021-01-31 11:20:09 UTC
Version: 7.0.0.3 (x64)
Build ID: 8061b3e9204bef6b321a21033174034a5e2ea88e
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: en-GB (en_GB); UI: en-GB
Calc: threaded

Worked OK in this version. 

For another recent regression in F&R, see Bug 140007.
Comment 2 BogdanB 2021-01-31 13:55:17 UTC Comment hidden (obsolete)
Comment 3 [REDACTED] 2021-01-31 15:37:30 UTC
Repro

Version: 7.1.0.3 / LibreOffice Community
Build ID: f6099ecf3d29644b5008cc8f48f42f4a40986e4c
CPU threads: 8; OS: Linux 5.3; UI render: default; VCL: kf5
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Note: 
A search for ' (U+0027) alone finds both U+0027 (') and U+2019 (’).
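
The two characters can be told apart by code point. The following is a minimal Python sketch, independent of LibreOffice, of the exact matching the reporter expects. The sample text and the helper name `exact_positions` are illustrative assumptions, not taken from the attached file or from LibreOffice code.

```python
import unicodedata

STRAIGHT = "\u0027"  # APOSTROPHE
CURLY = "\u2019"     # RIGHT SINGLE QUOTATION MARK


def exact_positions(text, needle):
    """Return every index where needle occurs, comparing code points exactly."""
    positions = []
    start = text.find(needle)
    while start != -1:
        positions.append(start)
        start = text.find(needle, start + 1)
    return positions


# Illustrative sample: one straight and one curly apostrophe.
sample = "don" + STRAIGHT + "t and can" + CURLY + "t"

for ch in (STRAIGHT, CURLY):
    # Print the code point, its Unicode name, and where it occurs.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {exact_positions(sample, ch)}")
```

With exact comparison, a search for U+0027 reports only the straight apostrophe, which is the behaviour described under EXPECTED RESULT.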
Comment 4 V Stuart Foote 2021-01-31 16:04:04 UTC
Confirming the regression. A search for the ASCII apostrophe (U+0027) from the Find bar or the F&R dialog now also returns the right single quotation mark (U+2019).

A simple workaround is to tick the 'Regular expressions' checkbox in the F&R dialog.

Possibly an ICU libs change?

It was OK in the 6.4.6.2 build, but is present in

Version: 7.0.4.2 (x64)
Build ID: dcf040e67528d9187c66b2379df5ea4407429775
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded

and recent master

Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 66013201749df7d5ac5ddaf377a7b3732518a93b
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
Comment 5 BogdanB 2021-01-31 17:42:12 UTC
 994b9ddc2fb0fd582f982351b16094748b1fb562 is the first bad commit
commit 994b9ddc2fb0fd582f982351b16094748b1fb562
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Tue Nov 17 10:17:04 2020 +0100

    source 853c9199444cf893583992bede981c494da21ceb
    
    source 853c9199444cf893583992bede981c494da21ceb

 instdir/program/libi18nsearchlo.so | Bin 104912 -> 105032 bytes
 instdir/program/setuprc            |   2 +-
 instdir/program/versionrc          |   2 +-
 3 files changed, 2 insertions(+), 2 deletions(-)
Comment 6 BogdanB 2021-01-31 17:43:48 UTC
Németh, I have added you to this bug.
Comment 7 V Stuart Foote 2021-01-31 18:25:48 UTC
László had commented on https://gerrit.libreoffice.org/c/core/+/105717

"Note: as a more sophisticated solution, it's possible to
add a new default transliteration option for this later."

Perhaps that is necessary here?
Comment 8 László Németh 2022-01-31 10:15:01 UTC
As a workaround, enable "Regular expressions" in the Find & Replace dialog window.
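
To illustrate why the regex path helps, here is a small sketch in which Python's `re` module stands in for LibreOffice's regex engine. This is an assumption made for illustration only, and the sample text is invented. A pattern containing a literal U+0027 matches only the straight apostrophe, and both forms can still be matched on purpose with a character class.

```python
import re

# Invented sample: straight apostrophe first, curly apostrophe second.
text = "it\u0027s and it\u2019s"

# A pattern containing only U+0027 matches only the straight apostrophe.
straight_only = [m.start() for m in re.finditer("\u0027", text)]

# To match both forms deliberately, list them in a character class.
either = [m.start() for m in re.finditer("[\u0027\u2019]", text)]

print(straight_only, either)
```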

It is likely worth extending the help, adding an option for this, or reusing the "diacritic-sensitive" option. A compatibility analysis is welcome, too.

More information about the change:

tdf#117643 Writer: fix apostrophe search regression

During text search, ASCII apostrophe ' (U+0027)
of the search term matches the typographic
apostrophe ’ (U+2019) of the text, too.

There was a UX regression in document editing from
commit e6fade1ce133039d28369751b77ac8faff6e40cb
(tdf#38395 enable smart apostrophe replacement by default),
because the Find and Replace window and the Find toolbar
don't replace the ASCII apostrophe, so the search term
no longer matched the text (which now contains automatically
replaced typographic apostrophes) as it had before the commit.

Regex search hasn't been modified, i.e. searching for U+2019
still requires a search term with U+2019.

The typographic apostrophes of a search term match
ASCII apostrophes of the text only if the search term
also contains an ASCII apostrophe.

Note: as a more sophisticated solution, it's possible to
add a new default transliteration option for this later.
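
The matching rule described in this commit message can be modelled roughly as follows. This is a hedged Python sketch of the described behaviour only, not the LibreOffice implementation. The function name `plain_find` and the folding approach are assumptions chosen for illustration.

```python
STRAIGHT = "\u0027"  # ASCII apostrophe
CURLY = "\u2019"     # typographic apostrophe


def plain_find(text, term):
    """Model of the non-regex search described above (hypothetical helper).

    If the search term contains an ASCII apostrophe, typographic apostrophes
    are folded to ASCII in both the text and the term before comparing.
    Otherwise the comparison is exact.
    """
    if STRAIGHT in term:
        # One-for-one replacement, so match indices stay valid for the original text.
        text = text.replace(CURLY, STRAIGHT)
        term = term.replace(CURLY, STRAIGHT)
    positions = []
    start = text.find(term)
    while start != -1:
        positions.append(start)
        start = text.find(term, start + 1)
    return positions


sample = "it\u0027s it\u2019s"
print(plain_find(sample, STRAIGHT))  # term has U+0027: both apostrophes match
print(plain_find(sample, CURLY))     # term has only U+2019: exact match
```

Under this model, a term consisting only of U+2019 never matches an ASCII apostrophe in the text, which corresponds to the last rule in the commit message.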