Bug 103468 - Hexadecimal code input with more than four digits sometimes works, sometimes not
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.2.3.1 rc
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: HarfBuzz
Blocks:
Reported: 2016-10-24 16:49 UTC by Dirk W.
Modified: 2016-11-08 13:42 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Document with Unicode characters (24.03 KB, application/vnd.oasis.opendocument.text)
2016-10-24 18:29 UTC, Dirk W.
Document with Unicode characters (11.73 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-10-24 18:30 UTC, Dirk W.

Description Dirk W. 2016-10-24 16:49:18 UTC
On the positive side, hexadecimal code input with more than four digits has worked in LibreOffice since version 5.1. On the less positive side, it sometimes works and sometimes does not. (See also Bug 103217 – PDF export of Unicode characters…)

I cannot work out or reproduce why, in the same LO Writer document (.odt), a Unicode character with more than four hex digits is sometimes displayed and sometimes not. When it is not displayed, it is also impossible to enter that character (or any other with more than four digits).

If I save such an LO Writer document (containing undisplayed Unicode characters with more than four digits) as *.docx and then open it in Word 2013, all Unicode characters are displayed correctly.
Comment 1 Regina Henschel 2016-10-24 18:00:14 UTC
Please list the code points of some examples from an upper plane that do not work in LibreOffice but are shown correctly in Word 2013.
Comment 2 Dirk W. 2016-10-24 18:29:04 UTC
Created attachment 128222
Document with Unicode characters
Comment 3 Dirk W. 2016-10-24 18:29:31 UTC
Regina, I’m not sure what you mean by “upper plane”, but here are some examples that sometimes work in LO Writer and sometimes do not:

U+1D53B	MATHEMATICAL DOUBLE-STRUCK CAPITAL D
U+1D54E	MATHEMATICAL DOUBLE-STRUCK CAPITAL W
U+1D452	MATHEMATICAL ITALIC SMALL E (“Euler’s number”)

I am also attaching two documents, Characters in Unicode.odt and Characters in Unicode.docx, both created with LO Writer.
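
For readers unsure what "upper plane" refers to, here is a minimal sketch using only the Python standard library. It shows that the three code points listed above lie outside plane 0 (the Basic Multilingual Plane) and therefore need two UTF-16 code units, a surrogate pair, instead of one. The helper name `describe` is made up for this illustration and has nothing to do with LibreOffice's code.

```python
import unicodedata

def describe(code_point):
    """Return (name, plane, UTF-16 code units) for one Unicode code point."""
    ch = chr(code_point)
    plane = code_point >> 16          # 0 = BMP, 1 = SMP, and so on
    raw = ch.encode("utf-16-be")      # two bytes per UTF-16 code unit
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    return unicodedata.name(ch), plane, units

# The three examples from the comment above.
for cp in (0x1D53B, 0x1D54E, 0x1D452):
    name, plane, units = describe(cp)
    print(f"U+{cp:X}  plane {plane}  {[hex(u) for u in units]}  {name}")
```

All three report plane 1, the Supplementary Multilingual Plane (SMP), which is why they need five hex digits.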
Comment 4 Dirk W. 2016-10-24 18:30:16 UTC
Created attachment 128225
Document with Unicode characters
Comment 5 Regina Henschel 2016-10-24 19:33:26 UTC
Thanks for the document and examples. The problem is clear now.

When you see a rectangle instead of the character, it means that the selected font has no glyph for the code point. Nevertheless, the Alt+X input method after a U+1nnnn still works.

In my Word 2010, the .docx file shows the same rectangles as the .odt file does in LibreOffice.

It may be that Word 2013 does some font substitution. So you need to figure out which font Word 2013 is actually using and set that font in LibreOffice too.

So I think this is not really a bug. But it might be an enhancement request: LibreOffice could implement a "find a font on my system for this code point" feature that can be used at least on demand.
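
The "find a font for this code point" idea can be sketched roughly as follows. This only illustrates the lookup and is not how LibreOffice or Word do it. The font names and coverage sets are hypothetical placeholders; a real tool would read each installed font's character-to-glyph mapping instead.

```python
def fonts_covering(code_point, coverage):
    """Return the names of all fonts whose coverage set contains code_point."""
    return sorted(name for name, cps in coverage.items() if code_point in cps)

# Hypothetical coverage data, invented for this example only.
coverage = {
    "FontA": set(range(0x20, 0x7F)),                                 # ASCII only
    "FontB": set(range(0x20, 0x7F)) | set(range(0x1D400, 0x1D800)),  # plus math alphanumerics
}

print(fonts_covering(0x1D53B, coverage))  # ['FontB']
```

An empty result would correspond to the rectangle case: no font in the list has a glyph for the code point.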
Comment 6 Dirk W. 2016-10-24 19:54:55 UTC
Regina, thanks for your detailed comment.

The font selected in the documents, Segoe UI Symbol, has glyphs for these code points, and the font was never changed. Sometimes the Unicode character is displayed correctly in LO Writer, sometimes not; in Word 2013 it is always displayed correctly.

Something doesn’t add up and further improvements should be made (See also Bug 103217 – PDF export of Unicode characters…)
Comment 7 Regina Henschel 2016-10-24 20:09:18 UTC
I'm no font expert and cannot verify whether 'Segoe UI Symbol' really has the needed glyphs. So someone else has to look into what goes wrong here. Perhaps it matters whether it is a TrueType (ttf) or OpenType (otf) font?

The GNU FreeFont "FreeSerif" has these double-struck glyphs: http://ftp.gnu.org/gnu/freefont/. It is part of the .zip file. For my Windows 7 I chose the .ttf variant. With that font I see the glyphs.
Comment 8 Dirk W. 2016-10-24 20:52:40 UTC
The file name of the “Segoe UI Symbol” font is seguisym.ttf, from Win 10 Pro (Build 14393). Details: Version 6.22, OpenType layout, digitally signed, TrueType outlines.

Although this font was (definitely) already installed, I installed it again. Now the Unicode characters with more than four digits are displayed again (but for how long?), and creating a PDF from LO Writer still doesn’t work.
Comment 9 Dirk W. 2016-10-25 18:47:56 UTC
To rule out anything that could have to do with the fonts, I did two things:

First, I deleted (via a boot stick) nearly all fonts (except Arial, Courier New, Microsoft Sans Serif, Symbol, Tahoma and Times New Roman). Afterwards I used CCleaner to clean the registry of all stale entries for the deleted fonts. After a restart I installed all the fonts that I had set aside separately, including all Segoe UI fonts (incl. Segoe UI Symbol, Segoe UI Historic and Segoe UI Emoji).

Result: The Unicode characters (with more than four digits) are not displayed in my document (Characters in Unicode.odt), and it is also not possible to enter Unicode characters (with more than four digits).

Second, I installed “Arial Unicode MS” (Version 1.01 from 2012/09/29), which is an “extended version of Monotype’s Arial” and “contains glyphs for all code points within The Unicode Standard, Version 2.1”.

Result: Even if I replace “Segoe UI Symbol” in my document with “Arial Unicode MS”, the Unicode characters (with more than four digits) are not displayed, and it is also not possible to enter Unicode characters (with more than four digits).

Summary: hexadecimal code input with more than four digits sometimes works, sometimes not. Yesterday evening it worked again; today, after a very clean reinstall of my fonts, it doesn’t. Something doesn’t add up and further improvements should be made (See also Bug 103217 – PDF export of Unicode characters…).
Comment 10 Dirk W. 2016-10-27 14:30:07 UTC
To rule out everything else, I went further: I formatted my C:\ partition and did a completely clean install.

Result: The bug persists.
Comment 11 V Stuart Foote 2016-11-08 13:30:44 UTC
Issues with font fallback are resolved and fixed, targeted for 5.3.0, where the new HarfBuzz-based text layout for bug 89870 is active by default.

If a suitable font is installed on the system, characters from the SMP (code points with more than four hex digits) will show a glyph.
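
As a rough illustration of what font fallback means here, the sketch below splits a string into runs and assigns each run the first font in a fallback chain that covers its characters. It is a conceptual sketch only, not the HarfBuzz-based layout code in LibreOffice. The function names, font names and coverage sets are all hypothetical.

```python
from itertools import groupby

def pick_font(ch, chain, coverage):
    """Return the first font in the chain that covers ch, or None."""
    for name in chain:
        if ord(ch) in coverage[name]:
            return name
    return None

def font_runs(text, chain, coverage):
    """Split text into (font, substring) runs using the fallback chain."""
    keyed = ((pick_font(ch, chain, coverage), ch) for ch in text)
    return [(font, "".join(ch for _, ch in group))
            for font, group in groupby(keyed, key=lambda pair: pair[0])]

# Hypothetical coverage data, invented for this example only.
coverage = {
    "FontA": set(range(0x20, 0x7F)),       # ASCII only
    "FontB": set(range(0x1D400, 0x1D800)), # math alphanumerics only
}

for font, run in font_runs("D = \U0001D53B", ["FontA", "FontB"], coverage):
    print(font, repr(run))
```

A run whose font is `None` is the case that ends up as a rectangle: nothing in the chain has a glyph for it.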
Comment 12 Dirk W. 2016-11-08 13:42:12 UTC
Ok, thank you for this good news.