111816 – Cannot find special character if does not know character name but number

Summary: Cannot find special character if does not know character name but number

Status:	VERIFIED FIXED

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	UI
Version: (earliest affected)	6.0.0.0.alpha0+
Hardware:	All All

Importance:	medium enhancement
Assignee:	Mike Kaganski

URL:
Whiteboard:	target:25.2.0 target:24.8.0.3 inRelea...
Keywords:

Depends on:
Blocks:	Special-Character

Reported:	2017-08-15 11:58 UTC by Regina Henschel
Modified:	2024-12-12 18:03 UTC
CC List:	6 users

See Also:
Crash report or crash signature:

Description Regina Henschel 2017-08-15 11:58:36 UTC

I use the special character dialog in its state in Version: 6.0.0.0.alpha0+
Build ID: f1a896c71c495bdef5861eb664581507b6b9b5bb
CPU threads: 4; OS: Windows 6.1; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-08-13_07:38:19
Locale: de-DE (de_DE); Calc: group

Situation: I want a double-struck A, I know the code point U+1D538, but I do not know the character name and do not know which font has the character.

My Idea: I enter the code point into the field and scroll the font list.

Problem: If the current font does not have the character and I scroll the font list using the arrow key, the number is destroyed and therefore a wrong subset is shown, even when I reach a font that has the character.

The problem does not exist when you start with a font that has this character. Then the number sticks, and by scrolling the font list you can find other fonts that have this character too.

Help for testing: The fonts "Cambria Math", "DejaVu Sans", "DejaVu Serif", "FreeSerif", "Linux Libertine G" or "STIX" contain the character. Of course you can use another Plane 1 character for testing too, e.g. U+1F609, an emoji.
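
For reference, the two test code points mentioned above can be inspected with a short Python sketch. It uses only the standard `unicodedata` module; the helper name `describe` is made up for this illustration, and the names returned depend on the Unicode version bundled with the Python build.

```python
import unicodedata


def describe(code_point: int) -> tuple[str, int, str]:
    """Return (U+ notation, plane number, character name) for a code point."""
    # Each Unicode plane holds 0x10000 code points, so the plane
    # number is the code point shifted right by 16 bits.
    plane = code_point >> 16
    # Fall back to a placeholder if this Python build has no name for it.
    name = unicodedata.name(chr(code_point), "<unnamed>")
    return f"U+{code_point:04X}", plane, name


if __name__ == "__main__":
    for cp in (0x1D538, 0x1F609):
        print(describe(cp))
```

Both values lie above U+FFFF, i.e. outside the Basic Multilingual Plane, which is what makes them useful test cases here.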

Comment 1 V Stuart Foote 2017-08-15 16:39:32 UTC

This is valid. But I don't think we should dump it onto Akshay's GSOC 2017 plate.

We would need to toggle the search to be within a single font, or some composite of the fonts on the system.

For an example of how this could be handled, have a look at Andrew West's work on BabelMap [1], which provides mapping for a single font or for a composite font.

It also provides a "Font Coverage..." feature that shows which fonts provide a glyph for a given code point.

=-ref-=

[1] http://www.babelstone.co.uk/Software/BabelMap.html

Comment 2 Heiko Tietze 2017-08-15 17:28:27 UTC

Very unlikely that you know the hex/decimal value but not within what font you find the character. I'd say WF because search is not the same as quick access, i.e. hex/decimal values.

However, the same happens when search is active and I suggested to disable the value fields in this case.

Comment 3 Regina Henschel 2017-08-15 18:32:01 UTC

(In reply to Heiko Tietze from comment #2)
> Very unlikely that you know the hex/decimal value but not within what font
> you find the character. I'd say WF because search is not the same as quick
> access, i.e. hex/decimal values.

You can get a code point very easily, not only by searching in the Unicode charts but also with tools like http://shapecatcher.com/. Of course you can enter the character name (see below), but that is cumbersome compared with entering the number.

It is likely, that you do _not_ know the font. For example write U+232c and toggle it to unicode. I bet, you get a benzene ring by font fall back. Now tell me which font it is and whether you own other fonts which have this character.

(In reply to Heiko Tietze from comment #2) 
> However, the same happens when search is active and I suggested to disable
> the value fields in this case.

If you have entered the character name correctly and so the search is active, then scrolling the font list works, although the field does not show the correct number. It is good, that this works, so please make no changes, which would disable this feature.

Comment 4 Heiko Tietze 2017-08-15 18:44:58 UTC

(In reply to Regina Henschel from comment #3)
> (In reply to Heiko Tietze from comment #2) 
> > However, the same happens when search is active and I suggested to disable
> > the value fields in this case.
> 
> If you have entered the character name correctly and so the search is
> active, then scrolling the font list works, although the field does not show
> the correct number. It is good, that this works, so please make no changes,
> which would disable this feature.

Comment to patch 12 at https://gerrit.libreoffice.org/#/c/40563/ ('happens' -> 'happened')

Shapecatcher returns a huge label with the name (where we do offer an outstanding feature) and a tiny Unicode value. Only super nerds care about the Unicode ID primarily.

Comment 5 V Stuart Foote 2017-08-15 19:57:26 UTC

(In reply to Heiko Tietze from comment #2)
> Very unlikely that you know the hex/decimal value but not within what font
> you find the character. I'd say WF because search is not the same as quick
> access, i.e. hex/decimal values.
> 
> However, the same happens when search is active and I suggested to disable
> the value fields in this case.

@Heiko--I just have to say how ill-considered that statement is--especially as you've already unilaterally had Akshay remove the "Characters:" edit view destroying a functional IME we have had for inputting strings in alternate language scripts. 

And now you are saying that Unicode values are not important for search--and for font selection.  The Special Character dialog _is_ the same as OS-provided quick access--it is _our_ visual interface for providing quick access.

Please give some thought to how we support polyglot users who are very dependent on Unicode and need efficient tools for locating glyphs across font families by Unicode name or Codepoint or Unicode subset.

Comment 6 Heiko Tietze 2017-08-15 20:22:20 UTC

(In reply to V Stuart Foote from comment #5)
> Please give some thought...

No need to argue, I have no data. You say people remember the Unicode ID, like back in DOS times with ASCII 132 for ä, rather than searching by the name 'umlaut' or the correct term 'DIAERESIS'.
And I'm not saying it never happens. If I use Unicode decimal 228 (U+00E4) for the ä and wheel through the fonts, it should work like with a search term, which is the case until I reach OpenSymbol. Scrolling through this font then changes the Unicode value, as expected.
Turning the use case around to Regina's workflow, the font for a character that is not commonly included could be found. That means that when the font doesn't contain this item, nothing is selected in the character table, but the Unicode fields still have a value. Today the table corresponds to the ID field.
To me the benefits of this workflow don't outweigh the drawback of inconsistency.

Comment 7 V Stuart Foote 2017-08-15 20:35:29 UTC

(In reply to Heiko Tietze from comment #6)
> (In reply to V Stuart Foote from comment #5)
> > Please give some thought...
> 
> No need to argue, I have no data. You say people remember the unicode id,
> like it was back in DOS times with ASCII 132 for ä, rather than searching by
> the name 'umlaut' or the correct term 'DIAERESIS'. 
> And I'm not saying it never happens. If I'd use unicode 0x228 for the ä, and
> wheel through the fonts it should work like with a search term, which is the
> fact until I reach OpenSymbol. Scrolling now through this font, changes the
> unicode as expected. 

We support folks with our <Alt>+X toggle (bug 73691) and do not integrate OS provided "deadkeys" IME (bug 71176 or bug 42437) so it is a bit disingenuous to suggest now that we don't steer our polyglots toward Unicode. 

> Turning the use case around, Regina's workflow, the font for a char that is
> not commonly integrated could be found. That means, when the font doesn't
> contain of this item nothing is selected in the chars table. But the unicode
> fields have a value. Today the table corresponds to the id field.
> To me the benefits of this workflow don't outweigh the drawback of
> inconsistency.

And the other side of the design case is that we have collapsed empty cells in the Unicode table representation. By doing that we obscure the ability to determine whether the font is missing glyphs/graphemes for code points within a Unicode block.

It makes for a concise chart, but destroys one of the strengths of Unicode: being able to select a font from the drop-down list and determine whether it has coverage of the needed glyphs by looking at its table in 15 col hex.

After the GSOC '17 is finished -- I'd move to see the additional support of Unicode based 15 column charts organized/showing HEX value and composite font searches.

Comment 8 Adolfo Jayme Barrientos 2017-08-15 21:33:22 UTC

> You say people remember the unicode id

For the record, I do remember Unicode code point numbers, because I’m a typesetter and frequently need to insert special characters, which is very easy to do with the keyboard in Linux: Ctrl+Shift+U plus the Unicode number (e.g., 2026 for …). It shows in my LibreOffice UI translations, which at the microtypography level are very polished. I realize I’m a “power user”, but we shouldn’t just design for the lowest-common-denominator use case. To me, it’s sensible that if the Special Characters dialog has a search field, it should allow me to enter a Unicode number and be able to find the character.

Comment 9 Yousuf Philips (jay) (retired) 2017-08-16 00:15:27 UTC

I would assume there are advanced users who do remember Unicode numbers, though I believe it's less likely that they don't remember which font a character is available in, especially if they use it regularly.

Presently we have three input fields to find a character - the search field for basic users and the hex and dec fields for advanced users - and the simplest solution might be to disable input in the hex and dec fields and only search in the search field. With filtering by hex and dec values in the search field, a user can easily move through the font name list and find which font has that character, similar to how they can do so presently by Unicode name.

Comment 10 QA Administrators 2018-08-17 02:38:01 UTC

** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not
appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug

Comment 11 V Stuart Foote 2018-08-17 16:08:39 UTC

This remains a valid issue with the Special Character dialog. It needs design and dev work.

Comment 12 QA Administrators 2019-08-19 06:58:51 UTC

Dear Regina Henschel,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug

Comment 13 V Stuart Foote 2019-08-19 11:42:11 UTC

This remains a valid issue with the Special Character dialog. It needs design and dev work.

Comment 14 Samuel Mehrbrodt 2019-08-19 11:53:21 UTC

Pro tip: set Importance to "enhancement" and voilà, "QA Administrators" will stop nagging :)

Comment 15 V Stuart Foote 2019-08-19 12:12:21 UTC

Well sure, but I'd argue there were multiple implementation issues when the dialog was reworked under GSOC mentorship.

The OP's issue remains: the SMP code point value is truncated, allowing search only for a BMP glyph.
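
To illustrate what such a truncation does to the value from the description, here is a minimal Python sketch. It assumes, purely for illustration, that the value is cut down to 16 bits somewhere; the thread does not state the actual cause, and the function names are hypothetical. For comparison it also shows the UTF-16 surrogate pair that a 16-bit-unit string needs for the same character.

```python
def truncate_to_16_bits(code_point: int) -> int:
    """Keep only the low 16 bits, as a 16-bit storage type would (assumed cause)."""
    return code_point & 0xFFFF


def to_utf16_units(code_point: int) -> list[int]:
    """Encode a code point as one or two UTF-16 code units."""
    if code_point < 0x10000:
        # BMP characters fit in a single unit.
        return [code_point]
    # Supplementary characters are split into a high and a low surrogate.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


if __name__ == "__main__":
    cp = 0x1D538  # the double-struck A from the description
    print(f"truncated: U+{truncate_to_16_bits(cp):04X}")
    print("UTF-16:", " ".join(f"{u:04X}" for u in to_utf16_units(cp)))
```

Either way the result is a value inside the BMP, so a lookup based on it cannot land on the intended Plane 1 character.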

Comment 16 Heiko Tietze 2023-06-06 10:27:07 UTC

(In reply to V Stuart Foote from comment #11)
> This remains a valid issue with the Special Character dialog. It need design
> and dev work.

The only reasonable way to search for a code point is to enter it in the search field. Using the hex/dec information of the selected item to search for something would be error-prone and hard to understand. 

We have to detect "U+<number>" and switch the search accordingly. No further modification to the UI is needed.
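
A minimal Python sketch of such a detection step, not the code from any LibreOffice patch: the function name `parse_code_point_query`, the accepted syntax (case-insensitive "U+" followed by one to six hex digits), and the fallback to a name search are all assumptions made for this illustration.

```python
import re
from typing import Optional

# "U+" or "u+" followed by 1 to 6 hexadecimal digits, nothing else.
_CODE_POINT_QUERY = re.compile(r"^\s*[Uu]\+([0-9A-Fa-f]{1,6})\s*$")

MAX_CODE_POINT = 0x10FFFF  # upper bound of the Unicode code space


def parse_code_point_query(text: str) -> Optional[int]:
    """Return the code point if text looks like 'U+<hex>', else None."""
    match = _CODE_POINT_QUERY.match(text)
    if match is None:
        return None  # caller would treat the text as a character-name search
    value = int(match.group(1), 16)
    return value if value <= MAX_CODE_POINT else None


if __name__ == "__main__":
    for query in ("U+1D538", "u+232c", "DIAERESIS", "U+110000"):
        print(repr(query), "->", parse_code_point_query(query))
```

Returning `None` for anything that is not a well-formed code point keeps the existing name search as the default path.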

I could also imagine a filter for the font list that removes fonts that do not contain the code point or the name entered at search.
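
The font-list filter could be sketched as follows in Python. The coverage table is illustrative only: the first two font names come from the description above, "Hypothetical BMP-only Font" is invented, and a real implementation would query each font's character map rather than a hand-written dictionary.

```python
def fonts_covering(code_point: int, coverage: dict[str, set[int]]) -> list[str]:
    """Return the names of fonts whose coverage set contains code_point, sorted."""
    return sorted(name for name, chars in coverage.items() if code_point in chars)


# Illustrative data, not real font tables.
EXAMPLE_COVERAGE: dict[str, set[int]] = {
    "Cambria Math": {0x00E4, 0x1D538},
    "DejaVu Sans": {0x00E4, 0x232C, 0x1D538},
    "Hypothetical BMP-only Font": {0x00E4},
}

if __name__ == "__main__":
    print(fonts_covering(0x1D538, EXAMPLE_COVERAGE))
    print(fonts_covering(0x232C, EXAMPLE_COVERAGE))
```

With a filter like this, scrolling the font list would only ever visit fonts that can display the character being searched for.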

Comment 17 Mike Kaganski 2024-08-04 09:56:14 UTC

https://gerrit.libreoffice.org/c/core/+/171458

Comment 18 Commit Notification 2024-08-04 12:53:01 UTC

Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2bb84e874bd17afcf3e417d2e4fc32aaafe841c3

tdf#111816: allow special characters filtering by Unicode value

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.

Comment 19 Commit Notification 2024-08-06 11:23:00 UTC

Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/core/commit/2e1dd8634991550865072e15e1ec4da289548642

tdf#111816: allow special characters filtering by Unicode value

It will be available in 24.8.1.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.

Comment 20 V Stuart Foote 2024-08-06 15:56:07 UTC

Working well! E.g. "U+1f30" returns the Cyclone sequence from fonts with coverage of the 'Miscellaneous Symbols and Pictographs' UCS block, as well as U+1F30 "GREEK SMALL LETTER IOTA WITH PSILI" for fonts that cover the 'Greek Extended' UCS block.

=-testing-=

Version: 25.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: ec5235ddfd62ca490d13fbc2c91e740a44f9950e
CPU threads: 8; OS: Windows 11 X86_64 (10.0 build 22631); UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
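
The result reported in this comment is consistent with matching the typed digits as a prefix of the hexadecimal code point. The Python sketch below reproduces that idea independently of any font; whether the committed patch uses prefix matching is an assumption here, and `code_points_with_hex_prefix` is a hypothetical helper.

```python
import unicodedata


def code_points_with_hex_prefix(prefix: str) -> list[int]:
    """Return named code points whose 'U+' hex form starts with prefix."""
    prefix = prefix.upper()
    matches = []
    for cp in range(0x110000):
        # U+ notation uses at least four uppercase hex digits.
        if f"{cp:04X}".startswith(prefix) and unicodedata.name(chr(cp), ""):
            matches.append(cp)
    return matches


if __name__ == "__main__":
    for cp in code_points_with_hex_prefix("1f30"):
        print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```

In the dialog the same set would additionally be limited to the glyphs the selected font actually provides.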

Comment 21 Commit Notification 2024-08-08 19:26:28 UTC

Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-24-8-0":

https://git.libreoffice.org/core/commit/d70b25458157ae4122caf1e41f6a01680f7647ac

tdf#111816: allow special characters filtering by Unicode value

It will be available in 24.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.