Bug 111816 - Cannot find special character if does not know character name but number
Summary: Cannot find special character if does not know character name but number
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: UI
Version (earliest affected): 6.0.0.0.alpha0+
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Special-Character
Reported: 2017-08-15 11:58 UTC by Regina Henschel
Modified: 2023-06-06 10:27 UTC

See Also:
Crash report or crash signature:


Description Regina Henschel 2017-08-15 11:58:36 UTC
I use the special character dialog in its state in Version: 6.0.0.0.alpha0+
Build ID: f1a896c71c495bdef5861eb664581507b6b9b5bb
CPU threads: 4; OS: Windows 6.1; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-08-13_07:38:19
Locale: de-DE (de_DE); Calc: group

Situation: I want a double-struck A, I know the code point U+1D538, but I do not know the character name and do not know which font has the character.

My Idea: I enter the code point into the field and scroll the font list.

Problem: If the current font does not have the character and I scroll the font list using the arrow keys, the number is destroyed and a wrong subset is shown, even when I reach a font that has the character.

The problem does not exist when you start with a font that has this character. Then the number sticks, and you can find other fonts that have this character too by scrolling the font list.

Help for testing: The fonts "Cambria Math", "DejaVu Sans", "DejaVu Serif", "FreeSerif", "Linux Libertine G" or "STIX" contain the character. Of course you can use another character from Plane 1 for testing too, e.g. U+1F609, an emoji.
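
For anyone reproducing this, the code points mentioned above can be checked outside LibreOffice. The following is a small Python sketch (not LibreOffice code) that uses the standard `unicodedata` module to report the plane and the character name for a code point written in `U+` notation. The helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(code_point_text):
    """Return (plane, name) for a code point written like 'U+1D538'."""
    text = code_point_text.strip().upper()
    # Drop the optional 'U+' prefix and parse the rest as hexadecimal.
    if text.startswith("U+"):
        text = text[2:]
    value = int(text, 16)
    plane = value >> 16  # plane 0 is the BMP, plane 1 is the SMP
    name = unicodedata.name(chr(value), "<unnamed>")
    return plane, name


for sample in ("U+1D538", "U+1F609"):
    print(sample, describe(sample))
```

Both test characters lie in Plane 1, which is what makes them useful for checking whether the dialog keeps the full code point.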
Comment 1 V Stuart Foote 2017-08-15 16:39:32 UTC
This is valid. But I don't think we should dump it onto Akshay's GSOC 2017 plate.

We would need to toggle the search to be within a single font, or within some composite of the fonts on the system.

For an example of how this could be handled, have a look at Andrew West's work on BabelMap [1], where he provides mapping for a Single font or for a Composite font.

It also provides a "Font Coverage..." tool that shows which fonts provide a glyph for a codepoint.

=-ref-=

[1] http://www.babelstone.co.uk/Software/BabelMap.html
Comment 2 Heiko Tietze 2017-08-15 17:28:27 UTC
Very unlikely that you know the hex/decimal value but not within what font you find the character. I'd say WF because search is not the same as quick access, i.e. hex/decimal values.

However, the same happens when search is active and I suggested to disable the value fields in this case.
Comment 3 Regina Henschel 2017-08-15 18:32:01 UTC
(In reply to Heiko Tietze from comment #2)
> Very unlikely that you know the hex/decimal value but not within what font
> you find the character. I'd say WF because search is not the same as quick
> access, i.e. hex/decimal values.

You can get a code point very easily, not only by searching in the Unicode charts but also with tools like http://shapecatcher.com/. Of course you can enter the character name (see below), but that is cumbersome compared with entering the number.

It is likely that you do _not_ know the font. For example, write U+232c and toggle it to Unicode. I bet you get a benzene ring by font fallback. Now tell me which font it is and whether you own other fonts which have this character.

(In reply to Heiko Tietze from comment #2) 
> However, the same happens when search is active and I suggested to disable
> the value fields in this case.

If you have entered the character name correctly and so the search is active, then scrolling the font list works, although the field does not show the correct number. It is good that this works, so please make no changes that would disable this feature.
Comment 4 Heiko Tietze 2017-08-15 18:44:58 UTC
(In reply to Regina Henschel from comment #3)
> (In reply to Heiko Tietze from comment #2) 
> > However, the same happens when search is active and I suggested to disable
> > the value fields in this case.
> 
> If you have entered the character name correctly and so the search is
> active, then scrolling the font list works, although the field does not show
> the correct number. It is good, that this works, so please make no changes,
> which would disable this feature.

Comment to patch 12 at https://gerrit.libreoffice.org/#/c/40563/ ('happens' -> 'happened')

Shapecatcher returns a huge label with the name (where we do offer an outstanding feature) and a tiny unicode value. Only super nerds care about the unicode id primarily.
Comment 5 V Stuart Foote 2017-08-15 19:57:26 UTC
(In reply to Heiko Tietze from comment #2)
> Very unlikely that you know the hex/decimal value but not within what font
> you find the character. I'd say WF because search is not the same as quick
> access, i.e. hex/decimal values.
> 
> However, the same happens when search is active and I suggested to disable
> the value fields in this case.

@Heiko--I just have to say how ill-considered that statement is--especially as you've already unilaterally had Akshay remove the "Characters:" edit view destroying a functional IME we have had for inputting strings in alternate language scripts. 

And now you are saying that Unicode values are not important for search--and for font selection.  The Special Character dialog _is_ the same as OS provided quick access--it is _our_ visual interface for providing quick access.

Please give some thought to how we support polyglot users who are very dependent on Unicode and need efficient tools for locating glyphs across font families by Unicode name or Codepoint or Unicode subset.
Comment 6 Heiko Tietze 2017-08-15 20:22:20 UTC
(In reply to V Stuart Foote from comment #5)
> Please give some thought...

No need to argue, I have no data. You say people remember the unicode id, like it was back in DOS times with ASCII 132 for ä, rather than searching by the name 'umlaut' or the correct term 'DIAERESIS'. 
And I'm not saying it never happens. If I'd use unicode U+00E4 (decimal 228) for the ä, and wheel through the fonts, it should work like with a search term, which is the case until I reach OpenSymbol. Scrolling through this font then changes the unicode value, as expected.
Turning the use case around, Regina's workflow, the font for a char that is not commonly integrated could be found. That means, when the font doesn't contain this item, nothing is selected in the chars table. But the unicode fields have a value. Today the table corresponds to the id field.
To me the benefits of this workflow don't outweigh the drawback of inconsistency.
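
The trade-off described above can be modelled in a few lines. This is only a sketch in Python with made-up names (`DialogState`, `select_font`, `selected_cell`) and made-up coverage data; it is not how the dialog is implemented. It keeps the entered code point when the font changes and reports no selection when the font lacks the glyph, so the value field can hold a number while nothing is selected in the table.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Hypothetical coverage data, for illustration only.
COVERAGE: Dict[str, Set[int]] = {
    "Font A": {0x41, 0xE4},
    "Font B": {0x41, 0xE4, 0x1D538},
}


@dataclass
class DialogState:
    font: str
    code_point: int  # value shown in the hex/decimal fields

    def select_font(self, font: str) -> None:
        # "Sticky" behaviour: changing the font keeps the entered code point.
        self.font = font

    def selected_cell(self) -> Optional[int]:
        # Nothing is selected in the table if the font lacks the glyph.
        if self.code_point in COVERAGE[self.font]:
            return self.code_point
        return None


state = DialogState(font="Font A", code_point=0x1D538)
print(state.selected_cell())       # None: field has a value, table has no selection
state.select_font("Font B")
print(hex(state.selected_cell()))  # 0x1d538
```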
Comment 7 V Stuart Foote 2017-08-15 20:35:29 UTC
(In reply to Heiko Tietze from comment #6)
> (In reply to V Stuart Foote from comment #5)
> > Please give some thought...
> 
> No need to argue, I have no data. You say people remember the unicode id,
> like it was back in DOS times with ASCII 132 for ä, rather than searching by
> the name 'umlaut' or the correct term 'DIAERESIS'. 
> And I'm not saying it never happens. If I'd use unicode 0x228 for the ä, and
> wheel through the fonts it should work like with a search term, which is the
> fact until I reach OpenSymbol. Scrolling now through this font, changes the
> unicode as expected. 

We support folks with our <Alt>+X toggle (bug 73691) and do not integrate OS provided "deadkeys" IME (bug 71176 or bug 42437) so it is a bit disingenuous to suggest now that we don't steer our polyglots toward Unicode. 

> Turning the use case around, Regina's workflow, the font for a char that is
> not commonly integrated could be found. That means, when the font doesn't
> contain of this item nothing is selected in the chars table. But the unicode
> fields have a value. Today the table corresponds to the id field.
> To me the benefits of this workflow don't outweigh the drawback of
> inconsistency.

And the other side of the design case is that we have collapsed empty cells in the Unicode table representation. By doing that we obscure the ability to determine whether the font is missing glyphs/graphemes for Codepoints within a Unicode block.

It makes for a concise chart, but destroys one of the strengths of Unicode: being able to select a font from the drop list and determine whether the font has coverage of the needed glyphs by looking at its table in 15 col hex.

After the GSOC '17 is finished -- I'd move to see the additional support of Unicode based 15 column charts organized/showing HEX value and composite font searches.
Comment 8 Adolfo Jayme Barrientos 2017-08-15 21:33:22 UTC
> You say people remember the unicode id

For the record, I do remember Unicode code point numbers, because I’m a typesetter and frequently need to insert special characters, which is very easy to do with your keyboard in Linux: Ctrl+Shift+U plus the Unicode number (e.g., 2026 for …). It shows on my LibreOffice UI translations, which at the microtypography level are very polished. I realize I’m a “power user”, but we shouldn’t just design for the lowest-common-denominator use case. To me, it’s sensible that if the Special Characters dialog has a search field, it should allow me to enter a Unicode number and be able to find the character.
Comment 9 Yousuf Philips (jay) (retired) 2017-08-16 00:15:27 UTC
I would assume there are advanced users who do remember unicode numbers, though I believe it's less likely they don't remember which font it's available in, especially if they use them regularly.

Presently we have three input fields to find a character - the search field for basic users and the hex and dec fields for advanced users - and the simplest solution might be to disable input in the hex and dec fields and only have search in the search field. With filtering by hex and dec values in the search field, a user can easily move through the font name list and find which font has that character, similar to how they can do so presently by unicode name.
Comment 10 QA Administrators 2018-08-17 02:38:01 UTC Comment hidden (obsolete)
Comment 11 V Stuart Foote 2018-08-17 16:08:39 UTC
This remains a valid issue with the Special Character dialog. It needs design and dev work.
Comment 12 QA Administrators 2019-08-19 06:58:51 UTC Comment hidden (obsolete)
Comment 13 V Stuart Foote 2019-08-19 11:42:11 UTC
This remains a valid issue with the Special Character dialog. It needs design and dev work.
Comment 14 Samuel Mehrbrodt (allotropia) 2019-08-19 11:53:21 UTC
Pro tip: set Importance to "enhancement" and voilà, "QA Administrators" will stop nagging )
Comment 15 V Stuart Foote 2019-08-19 12:12:21 UTC
Well sure, but I'd argue there were multiple implementation issues when the dialog was reworked under GSOC mentorship.

The OP's issue remains: the SMP codepoint value is truncated, allowing search only for a BMP glyph.
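
To illustrate what such a truncation would look like: if a Plane 1 code point is squeezed into 16 bits, the plane number is lost and a different BMP code point remains. The sketch below is Python and purely illustrative. That the dialog truncates to exactly 16 bits is an assumption, and `truncate_to_bmp` is a made-up helper.

```python
def truncate_to_bmp(code_point):
    """Keep only the low 16 bits, as a 16-bit field would (assumed)."""
    return code_point & 0xFFFF


original = 0x1D538  # code point from the description
truncated = truncate_to_bmp(original)
print(f"U+{original:X} -> U+{truncated:04X}")  # U+1D538 -> U+D538
```

Under that assumption the search would land on U+D538, which is in the Hangul Syllables block, instead of the double-struck A.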
Comment 16 Heiko Tietze 2023-06-06 10:27:07 UTC
(In reply to V Stuart Foote from comment #11)
> This remains a valid issue with the Special Character dialog. It need design
> and dev work.

The only reasonable way to search for a code point is to enter it in the search field. Using the hex/dec information of the selected item to search for something would be error-prone and hard to understand. 

We have to detect "U+<number>" and switch the search accordingly. No further modification to the UI is needed.
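
A minimal sketch of that detection, written in Python rather than in the dialog's own code. The function name `parse_code_point` and the exact accepted syntax (optional surrounding whitespace, case-insensitive, 1 to 6 hex digits) are assumptions for illustration.

```python
import re
from typing import Optional

# Accept 'U+' followed by 1 to 6 hex digits, e.g. 'U+1D538' or 'u+232c'.
_CODE_POINT_RE = re.compile(r"^\s*U\+([0-9A-F]{1,6})\s*$", re.IGNORECASE)


def parse_code_point(search_text: str) -> Optional[int]:
    """Return the code point if the text is 'U+<hex>', else None."""
    match = _CODE_POINT_RE.match(search_text)
    if match is None:
        return None  # not a code point: treat it as a name search
    value = int(match.group(1), 16)
    # Unicode code points end at U+10FFFF.
    return value if value <= 0x10FFFF else None


for text in ("U+1D538", "u+232c", "DIAERESIS"):
    print(repr(text), parse_code_point(text))
```

Anything that does not match falls through to the existing name search, so the search field keeps its current behaviour for plain terms.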

I could also imagine a filter for the font list that removes fonts that do not contain the code point or the name entered at search.
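
Such a font list filter could look roughly like the following Python sketch. The coverage table `FONT_COVERAGE` and both function names are hypothetical. A real implementation would ask each installed font for its character map instead of using hard-coded sets.

```python
import unicodedata
from typing import Dict, List, Set

# Hypothetical coverage data; a real implementation would query the fonts.
FONT_COVERAGE: Dict[str, Set[int]] = {
    "Font A": {0x41, 0xE4},
    "Font B": {0x41, 0xE4, 0x1D538},
    "Font C": {0x1D538, 0x1F609},
}


def fonts_with_code_point(code_point: int) -> List[str]:
    """Fonts whose coverage includes the given code point."""
    return [font for font, cps in FONT_COVERAGE.items() if code_point in cps]


def fonts_with_name(term: str) -> List[str]:
    """Fonts with at least one character whose Unicode name contains term."""
    term = term.upper()
    return [
        font
        for font, cps in FONT_COVERAGE.items()
        if any(term in unicodedata.name(chr(cp), "") for cp in cps)
    ]


print(fonts_with_code_point(0x1D538))  # ['Font B', 'Font C']
print(fonts_with_name("diaeresis"))    # ['Font A', 'Font B']
```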