Bug 97839 - Special characters dialog Characters: edit buffer has problems composing glyphs, affects both BMP and SMP
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: UI
Version: 5.1.1.1 rc (earliest affected)
Hardware: All All
Importance: high normal
Assignee: Caolán McNamara
URL:
Whiteboard: target:5.3.0 target:5.2.0.1 target:5.1.4
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-13 17:01 UTC by V Stuart Foote
Modified: 2016-10-25 18:54 UTC

See Also:
Crash report or crash signature:


Description V Stuart Foote 2016-02-13 17:01:33 UTC
Special Character dialog...

With OpenGL disabled (to work around bug 97319), when SMP or higher planes are exposed by OS or CLI launch (to work around bug 71603), the Special Characters dialog's Characters: edit buffer does not reliably compose the multi-byte codepoints.

Seems to be falling back to the system font rather than using the glyph from the selected font family. In any case the field is visually garbled, so it is not clear what would be selected, copied, or pasted.
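
For readers unfamiliar with the plane terminology used in this report, the following is a minimal sketch of how a code point maps to a Unicode plane. The helper name `plane_of` is made up for illustration and is not part of LibreOffice.

```python
# Plane 0 is the Basic Multilingual Plane (BMP),
# plane 1 is the Supplementary Multilingual Plane (SMP).
PLANE_NAMES = {0: "BMP", 1: "SMP"}


def plane_of(code_point: int) -> int:
    """Return the Unicode plane number (0-16) of a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return code_point >> 16  # each plane holds 0x10000 code points


for cp in (0x2192, 0x1F550):
    plane = plane_of(cp)
    print(f"U+{cp:04X}: plane {plane} ({PLANE_NAMES.get(plane, 'other')})")
```

Anything in plane 1 or above lies beyond U+FFFF, which is the range the report describes as problematic in the edit buffer.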
Comment 1 V Stuart Foote 2016-02-13 17:26:59 UTC
On Windows 10 Pro 64-bit en-US with

Version: 5.1.1.1
Build ID: c43cb650e9c145b181321ea547d38296db70f36e
CPU Threads: 8; OS Version: Windows 6.2; UI Render: default; 
Locale: en-US (en_US)

Version: 5.2.0.0.alpha0+
Build ID: 2b60321b21ff9ada64576f5711950b616b8a25ba
CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-02-12_23:49:18
Locale: en-US (en_US)
Comment 2 Buovjaga 2016-02-16 12:43:07 UTC
I don't understand how to reproduce this.
Comment 3 V Stuart Foote 2016-02-16 19:36:54 UTC
(In reply to Beluga from comment #2)
> I don't understand how to reproduce this.

Sorry. Let me lay out clear STR.

Prep 
A. install George Douros' "Symbola" font if not already installed, available here http://users.teilar.gr/~g1951d/ ; this provides a font with known Unicode coverage beyond the BMP.

B. also helpful to have an install of BabelStone BabelMap (Windows)
http://www.babelstone.co.uk/Software/BabelMap.html
or Gucharmap (Linux) for viewing graphemes in the upper Unicode planes

C. download attachment 121921, an ODT with some SMP Unicode used

STR
1. launch recent build of LO master

2. disable OpenGL rendering (Tools -> Options -> View)

3. exit LibreOffice

4. open command window and change directory to libreoffice\program for the build

5. open attachment 121921 from the command line, i.e.

swriter.exe %USERPROFILE%\downloads\tdf71603_Unicode_1F300_tdf92505_arrows_mergedSample.odt

6. note that font fallback should be performed, and no blank unknown characters should be shown in the Liberation Serif section at the top (the fallback on Windows 8, 8.1 or 10 will be Segoe UI Symbol)

7. exit the document, and open a new blank writer document

8. open the Special Characters dialog

9. change the font to Symbola

10. scroll the code table views into the Supplemental plane 1 beyond U+FFFF

11. note that the code tables are well formed, and glyphs are readable

12. enter 1F550 in the Hexadecimal U+ field to move the code point table to the clock face emoji

13. mouse click on the 
U+1F550 CLOCK FACE ONE OCLOCK

14. notice the glyph is placed into the Characters:  buffer

15. use paste button --> glyph is inserted into document

16. mouse click on the
U+1F552 CLOCK FACE THREE OCLOCK

17. notice the glyph is misplaced into the Characters: buffer and the buffer becomes distorted.

18. add additional glyphs by clicking -- the Characters: edit/paste buffer becomes completely unusable. Any content pasted with the Insert button is garbled.
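
As background for the steps above, here is a minimal sketch of how the two clock-face code points are represented in UTF-16. Both need two 16-bit code units (a surrogate pair). The function name `to_utf16_units` is hypothetical and only illustrates the standard encoding rule; it is not LibreOffice code.

```python
from typing import List


def to_utf16_units(code_point: int) -> List[int]:
    """Return the UTF-16 code units for one Unicode code point.

    Sketch only: does not reject code points in the surrogate range.
    """
    if code_point < 0x10000:
        return [code_point]             # BMP: a single 16-bit unit
    offset = code_point - 0x10000       # 20 bits remain
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # low 10 bits -> low surrogate
    return [high, low]


for cp in (0x1F550, 0x1F552):
    units = " ".join(f"{u:04X}" for u in to_utf16_units(cp))
    print(f"U+{cp:04X} -> {units}")
# U+1F550 -> D83D DD50
# U+1F552 -> D83D DD52
```

Code that treats each 16-bit unit as a separate character would see four "characters" after two clicks, which is consistent with the distortion described, although the report itself does not confirm the exact cause at this point.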
Comment 4 Buovjaga 2016-02-17 08:53:24 UTC
Reproduced.

Win 7 Pro 64-bit Version: 5.2.0.0.alpha0+
Build ID: a6f876d45bd4e41a7143594a6cb11b6893a0f620
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86@39, Branch:master, Time: 2016-02-11_00:07:38
Locale: fi-FI (fi_FI)
Comment 5 V Stuart Foote 2016-05-28 16:53:42 UTC
http://opengrok.libreoffice.org/xref/core/cui/source/dialogs/cuicharmap.cxx

Have an ongoing issue in the Special Character dialog with rendering of glyphs for various fonts onto the "Characters:" text bar.

The selected glyph is being rendered with the current system UI font--not the font whose character map is open in the dialog and from which the selection is being made. The result is that for codepoints from the SMP, or areas of the BMP not covered by the current system UI font, the text bar shows place holders--or in the case of Windows some strange partial glyphs, and the cursor repositions.

The text picked onto the text bar does correctly insert into the document. But with the text bar rendered in the system UI font--showing place holders and replacement glyphs, rather than the char map selection--the user has to carefully check what was actually inserted into the document.

It affects both Linux and Windows, and both BMP and SMP codepoints.

Seems like this part of the Special Character dialog needs some attention.
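
The difference between rendering with the selected font and rendering with the system UI font can be sketched as a lookup over font coverage. Everything here is hypothetical: `pick_font`, the font name "System UI" and the tiny coverage sets are toy stand-ins for illustration, not real font data or LibreOffice API.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy coverage tables (NOT real font data).
TOY_COVERAGE: Dict[str, Set[int]] = {
    "Symbola": {0x1F550, 0x1F552},
    "System UI": {0x41, 0x2192},
}


def pick_font(code_point: int, preferred: str, fallbacks: List[str],
              coverage: Dict[str, Set[int]]) -> Optional[str]:
    """Return the first font that covers code_point, or None."""
    for name in [preferred, *fallbacks]:
        if code_point in coverage.get(name, set()):
            return name
    return None  # caller would draw a placeholder glyph


# Preferring the font selected in the dialog finds the glyph ...
print(pick_font(0x1F550, "Symbola", ["System UI"], TOY_COVERAGE))
# ... while using only the system UI font yields a placeholder.
print(pick_font(0x1F550, "System UI", [], TOY_COVERAGE))
```

The behaviour described in this comment resembles the second call: the text bar acts as if only the system UI font were consulted.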
Comment 6 Commit Notification 2016-05-29 19:27:52 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=847cdd8efd0662d61d288a4d944edc30e864d145

Resolves: tdf#97839 a single character may be more than 1 utf-16 code points

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
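
The commit title points at the general problem of treating one UTF-16 code unit as one character. The sketch below is not the actual patch and does not use LibreOffice's string classes; it only illustrates, with a made-up helper `iter_code_points`, how a buffer of UTF-16 units can be walked per code point so that surrogate pairs stay together.

```python
from typing import Iterator, List


def iter_code_points(units: List[int]) -> Iterator[int]:
    """Yield code points from UTF-16 code units, joining surrogate pairs."""
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            yield 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            yield unit  # BMP unit, or an unpaired surrogate passed through
            i += 1


# Buffer after picking U+1F550 and U+1F552, stored as UTF-16 units.
buffer_units = [0xD83D, 0xDD50, 0xD83D, 0xDD52]
print([f"U+{cp:04X}" for cp in iter_code_points(buffer_units)])
# ['U+1F550', 'U+1F552']
print(len(buffer_units), len(list(iter_code_points(buffer_units))))
# 4 2
```

A per-unit view counts four items for two glyphs; any insert, delete or cursor logic based on that count can split a pair and leave lone surrogates behind.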
Comment 7 Commit Notification 2016-05-29 19:29:14 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=f0e35cb2fb6f0f595d44c7a7c01ddaf60b19d642&h=libreoffice-5-2

Resolves: tdf#97839 a single character may be more than 1 utf-16 code points

It will be available in 5.2.0.1.

Comment 8 Commit Notification 2016-05-30 08:40:34 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-5-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b8c52df3452208b97dcc59d010a33de4be70101b&h=libreoffice-5-1

Resolves: tdf#97839 a single character may be more than 1 utf-16 code points

It will be available in 5.1.4.

Comment 9 V Stuart Foote 2016-05-30 20:37:50 UTC
Thanks! The edit bar is now much more stable. We get clean glyphs (when the system font covers the code point), and we get a clean "unknown" glyph for any code points--BMP or SMP--not covered by the system font, so insert or copy/paste works much better.

Still have the issue that the system font, rather than the font family selected in the character map, is being used for the edit bar.

Have opened a new bug 100148 - Special Characters dialog -- the edit bar displays in system font rather than font family char map in use
Comment 10 V Stuart Foote 2016-05-30 20:38:40 UTC
Verified fixed on Windows 10 Pro 64-bit en-US with
Version: 5.3.0.0.alpha0+
Build ID: 2ad5055145201efe5a244656a0715b391149e825
CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-05-29_22:57:40
Locale: en-US (en_US)