Bug 104889 - Typing quotation marks into find toolbar fails to find smart quotes
Status: RESOLVED WONTFIX
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.2.3.3 release
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL: http://www.unicode.org/reports/tr44/#...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-12-23 17:34 UTC by hexafraction
Modified: 2016-12-23 21:00 UTC

Description hexafraction 2016-12-23 17:34:12 UTC
Description:
When Writer autocorrects straight quotation marks (and apostrophes/single quotes) to "smart quotes", they cannot be found using Ctrl+F unless they are copy-pasted into the find toolbar.

Steps to Reproduce:
1. Type a document that contains smart quotes.
2. Press Ctrl+F to open the find toolbar.
3. Type a phrase that exists within the document and contains a smart quote, using the ordinary quote key.

Actual Results:  
No results are found.

Expected Results:
The result is found (possibly conditional on a configuration setting).


Reproducible: Always

User Profile Reset: No

Additional Info:


User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.100 Safari/537.36 Vivaldi/1.5.658.56
Comment 1 hexafraction 2016-12-23 18:48:54 UTC
Note: The version isn't the earliest version where this can be reproduced; I believe it is the case for all versions that support automatic smart quotes, but I don't have a fast enough unmetered connection to bisect to the first version affected.
Comment 2 V Stuart Foote 2016-12-23 19:39:27 UTC
This is the expected behavior of the Find Bar.

Also, our regular expression implementation (ICU 58 [1]) makes this trivial, but requires use of the <Ctrl>+H Find & Replace dialog rather than the <Ctrl>+F Find Bar.

Yes, it would be possible to provide a checkbox to enable regular expression parsing in the Find Bar. But we have discussed this in different contexts (e.g. bug 102615) and have elected to keep the Find Bar a lightweight tool. I don't see a use case for doing this, especially as the Find & Replace widget is kept on the Find Bar.

=-Usage notes-=
How to use the ICU regular expressions in Find & Replace for smart quotes:

Enter one of these queries in the Find field with regular expressions enabled.

for Initial quotes:
\p{Pi}

for Final quotes:
\p{Pf}

for Other Punctuation:
\p{Po}

or
for Any Punctuation
\p{P}

Alternatively, again using ICU regular expressions: the "smart quotes" are in the Unicode General Punctuation block and can be searched with the "\N{UNICODE CHARACTER NAME}" syntax, or with the "\uhhhh" or "\x{hhhh}" syntax.
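
As an illustration of the "\uhhhh" approach, here is a minimal sketch in Python. Note the assumptions: Python's `re` module is not ICU and is not what LibreOffice uses; it is chosen only because it accepts the same "\uhhhh" escape inside a pattern. The helper name `find_smart_quotes` and the sample string are made up for this example.

```python
import re

# Character class spanning U+2018..U+201F, i.e. the quotation marks that
# live in the General Punctuation block, written with \uhhhh escapes.
SMART_QUOTES = re.compile(r"[\u2018-\u201f]")


def find_smart_quotes(text):
    """Return (offset, character) pairs for each smart quote in text."""
    return [(m.start(), m.group()) for m in SMART_QUOTES.finditer(text)]


# Hypothetical sample: a phrase as AutoCorrect might have rewritten it.
sample = "\u201cSusan\u2019s book\u201d"
for offset, char in find_smart_quotes(sample):
    print(offset, f"U+{ord(char):04X}")
```

In the Find & Replace dialog the equivalent would be to type the bracketed range into the Find field with regular expressions enabled.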

However, the specific AutoCorrect replacement can be set to any Unicode character you prefer with the Tools -> AutoCorrect -> AutoCorrect Options: Localized Options tab, including the option not to replace at all.

Common AutoCorrect values used for Single Quote/Double Quotes
U+2018 -- ‘ (LEFT SINGLE QUOTATION MARK)
U+2019 -- ’ (RIGHT SINGLE QUOTATION MARK)
U+201a -- ‚ (SINGLE LOW-9 QUOTATION MARK)
U+201b -- ‛ (SINGLE HIGH-REVERSED-9 QUOTATION MARK)
U+201c -- “ (LEFT DOUBLE QUOTATION MARK)
U+201d -- ” (RIGHT DOUBLE QUOTATION MARK)
U+201e -- „ (DOUBLE LOW-9 QUOTATION MARK)
U+201f -- ‟ (DOUBLE HIGH-REVERSED-9 QUOTATION MARK)

Note that the \p{P*} syntax matches on Unicode general category rather than on block, so a replacement glyph that is not categorized as punctuation would not be found with it.
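
To check which \p{P*} class would match a given replacement character, its general category can be looked up directly. The following is a small sketch using Python's standard `unicodedata` module (an assumption of this example; it is not part of LibreOffice or ICU, though both draw on the same Unicode Character Database). The helper name `categories` is hypothetical.

```python
import unicodedata

# The eight quotation marks listed above, U+2018 through U+201F.
QUOTES = "\u2018\u2019\u201a\u201b\u201c\u201d\u201e\u201f"


def categories(chars):
    """Map each character's code point label to its general category."""
    return {f"U+{ord(c):04X}": unicodedata.category(c) for c in chars}


for code_point, category in categories(QUOTES).items():
    print(code_point, category)
```

In the Unicode Character Database the two low-9 marks (U+201A and U+201E) are classed as Ps (open punctuation) rather than Pi or Pf, so \p{Pi} and \p{Pf} alone would miss them, while \p{P} covers all eight.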

=-ref-=

[1] http://userguide.icu-project.org/strings/regexp
Comment 3 hexafraction 2016-12-23 19:50:24 UTC
Thank you for clarifying. However, isn't this expected behavior at the expense of end-user usability? If a document contains, for example, multiple instances of "Susan" and a few of "Susan's" that were typed by striking the apostrophe key and allowing automatic smart quotes to be created, then performing a search for the latter via the find bar, although not equivalent in the Unicode representation of the document, would be expected to yield those results--In my personal opinion smart-quotes should be transparent to the person ultimately composing the document and using the software (or the quotes could get auto-corrected when typing into the find bar).


The fact that other replacements for smart quotes are possible is a good point though; I don't have any good idea as to how to handle those replacements, and I of course do not have the full understanding of the design decisions made behind both the find bar and find/replace tools and the smart quotes behavior.
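
A minimal sketch of the quote-insensitive matching suggested above, purely as an illustration of the idea: it is not how LibreOffice's Find Bar works, and the mapping covers only the eight common replacement characters, so it does not address custom replacements. The names `FOLD`, `fold_quotes`, and `find_all` are hypothetical.

```python
# Hypothetical folding table: each smart quote maps to one straight quote,
# so character offsets in the folded text match the original text.
FOLD = str.maketrans({
    "\u2018": "'", "\u2019": "'", "\u201a": "'", "\u201b": "'",
    "\u201c": '"', "\u201d": '"', "\u201e": '"', "\u201f": '"',
})


def fold_quotes(text):
    """Replace smart quotes with their straight equivalents."""
    return text.translate(FOLD)


def find_all(document, query):
    """Return start offsets of query in document, ignoring quote style."""
    folded_doc = fold_quotes(document)
    folded_query = fold_quotes(query)
    if not folded_query:
        return []
    hits = []
    start = folded_doc.find(folded_query)
    while start != -1:
        hits.append(start)
        start = folded_doc.find(folded_query, start + 1)
    return hits


document = "Susan met Susan\u2019s friend."
print(document.find("Susan's"))       # literal search
print(find_all(document, "Susan's"))  # quote-insensitive search
```

The folding is applied to both the document and the query, so a search typed with either straight or smart quotes behaves the same way.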
Comment 4 V Stuart Foote 2016-12-23 21:00:09 UTC
(In reply to hexafraction from comment #3)
> Thank you for clarifying. However, isn't this expected behavior at the
> expense of end-user usability? 

No.

> ... then
> performing a search for the latter via the find bar, although not equivalent
> in the Unicode representation of the document, would be expected to yield
> those results--

But that is the point: the Find Bar "as implemented" does not apply the auto correction to the entered search string--no search capability does (nor should we expect it to). Rather, what is typed is what is searched for, and the original opening/closing single or double quotes have already been replaced in the text and so will not be found. Finding them requires a regular-expression-based search, which is only available in the _advanced_ "Find & Replace" dialog.

> In my personal opinion smart-quotes should be transparent to
> the person ultimately composing the document and using the software (or the
> quotes could get auto-corrected when typing into the find bar).
> 

By that logic, why even have the _simple_ "Find Bar"--why not eliminate it in favor of the _advanced_ Find & Replace dialog and force folks to learn the more capable tool?

No, folks want a simple search capability--what's typed is what's found.

And those who need to do more complex things demand something else. Finding auto-corrected text is not a _simple_ search, so it requires the latter. Different requirements, so different tools.