Bug 116731 - Unicode pictographs are no longer shown
Summary: Unicode pictographs are no longer shown
Status: RESOLVED NOTOURBUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 6.0.2.1 release
Hardware: All Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks:
 
Reported: 2018-04-01 19:13 UTC by Alberto Salvia Novella
Modified: 2021-03-05 15:28 UTC

See Also:
Crash report or crash signature:


Description Alberto Salvia Novella 2018-04-01 19:13:54 UTC
Description:
Unicode pictographs are no longer shown

Steps to Reproduce:
In any kind of document paste a pictograph character:
http://unicode.org/charts/nameslist/c_1F300.html

Actual Results:  
The character isn't shown.

Expected Results:
The character should be shown.


Reproducible: Always


User Profile Reset: Yes


OpenGL enabled: Yes

Additional Info:
- It used to work.
- It works in non LibreOffice apps.
- The source the character is copied from doesn't affect the result.
- Disabling hardware acceleration or font anti-aliasing does nothing.
- Likely a result of the transition to color pictographs.


User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36
Comment 1 V Stuart Foote 2018-04-01 23:18:11 UTC
Cannot confirm on Windows builds of current master.

In the Special Character dialog, set a font with SMP coverage (e.g. Emoji One Color, Symbola), then pick glyphs and place them onto the document canvas. There is still no color glyph support or experimental Emoji toolbar (bug 105689) on Windows, but any SMP outline glyphs are fine.

🙈🙉🙊


Version: 6.1.0.0.alpha0+ (x64)
Build ID: 655b9054bc265de377c3dc411e2ef40cdfd16dce
CPU threads: 4; OS: Windows 10.0; UI render: GL; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-03-27_03:21:52
Locale: en-US (en_US); Calc: CL
Comment 2 Alberto Salvia Novella 2018-04-02 03:48:26 UTC
My operating system is Manjaro Deepin, a rolling-release Linux distribution that ships the most up-to-date non-beta software:

https://manjaro.org/community-editions/
Comment 3 Xisco Faulí 2018-04-03 10:51:19 UTC
It seems to affect only Linux.

Regression introduced by:

author	Michael Stahl <mstahl@redhat.com>	2017-09-07 23:01:26 +0200
committer	Michael Stahl <mstahl@redhat.com>	2017-09-07 23:22:11 +0200
commit fc670f637d4271246691904fd649358ce2e7be59 (patch)
tree 0eee10cd701f0479d4ed8ca7287defefef6af29e
parent 554a79d793ee9546f71802643b79001749c3c695 (diff)
svtools: HTML import: don't put lone surrogates in OUString
The bytes "ed b3 b5" in fdo67610-1.doc (which, as the name indicates,
is an HTML file) are converted to the lone UTF-16 surrogate "dcf5",
which is inserted into SwTextNode and causes asserts later on.

The actual encoding of the HTML document is probably GBK (at least
VIM doesn't display any missing characters with that), but
because it doesn't contain any indication of its encoding
it's apparently imported as UTF-8; the ImplConvertUtf8ToUnicode()
thinking a surrogate code point is valid even if the JSON-compatible
mode RTL_TEXTENCODING_JAVA_UTF8 is not specified is a bit of a
surprise.

Bisected with: bibisect-linux64-6.0

Adding Cc: to Michael Stahl
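
The byte-level claim in the quoted commit message can be checked outside LibreOffice. The following is a minimal Python 3 sketch, not the LibreOffice code: the helper names decode_lenient and decode_strict are hypothetical, and Python's "surrogatepass" error handler is only a stand-in for a decoder that accepts surrogate code points, which is the behaviour the commit message attributes to ImplConvertUtf8ToUnicode().

```python
from typing import Optional

# The three bytes quoted in the commit message.
raw = bytes.fromhex("edb3b5")


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8 while letting surrogate code points through.

    Stand-in for a permissive converter; this is not the LibreOffice API.
    """
    return data.decode("utf-8", errors="surrogatepass")


def decode_strict(data: bytes) -> Optional[str]:
    """Decode UTF-8 strictly; return None if the input is not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


lone = decode_lenient(raw)
# Low (trailing) surrogates occupy U+DC00..U+DFFF.
is_low_surrogate = 0xDC00 <= ord(lone) <= 0xDFFF

# Print only the code point number; the lone surrogate itself cannot be
# encoded for output.
print(f"U+{ord(lone):04X}", is_low_surrogate)
print(decode_strict(raw))
```

The permissive path yields the single code unit dcf5 named in the commit message, while a strict UTF-8 decoder rejects the same bytes.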
Comment 4 QA Administrators 2019-04-04 03:04:08 UTC Comment hidden (obsolete)
Comment 5 Alberto Salvia Novella 2019-04-09 11:18:04 UTC
Still present in version 6.2.2.2.
Comment 6 Julien Nabet 2020-01-19 12:44:44 UTC
I'm not sure I understand the bug here.
On a Debian x86-64 PC with master sources updated today + gtk3 rendering, I could copy and paste:

🙈🙉🙊
or
🎃
🏆
in a brand new file in Writer.
Same for kf5 and gen rendering.

Same on the LO Debian package 6.3.4 with gtk3 rendering.

What did I miss?
Comment 7 Alberto Salvia Novella 2020-01-19 22:00:19 UTC
Screencast:
https://youtu.be/eXqBSrJ6Uzk
Comment 8 V Stuart Foote 2020-01-19 22:42:15 UTC
(In reply to Alberto Salvia Novella from comment #7)
> Screencast:
> https://youtu.be/eXqBSrJ6Uzk

@Alberto,

Notice in your screencast that, at the end of your paste, the cursor positioned in the 3 glyphs shows Liberation Mono. Of course, that font has no coverage of the Unicode SMP glyphs.

Position the cursor after each placeholder and press <Alt>+X to toggle from glyph to Unicode code point. You will probably see the correct values U+1f648 U+1f649 U+1f64a for those glyphs, meaning the correct Unicode is being pasted but your OS/DE has faulty fallback replacement.

The workaround is simple: select the 3 placeholder glyphs, then apply a font that covers these SMP glyphs. The Emoji One Color font delivered with LibreOffice covers those symbols from the Supplementary Multilingual Plane and should be available.
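
The code points that <Alt>+X should reveal can also be checked outside LibreOffice. This is a minimal Python 3 sketch; the helper name describe is hypothetical. It lists, for each of the three characters from comment 1, the code point, the Unicode plane (plane 1 is the SMP), and the UTF-16 code units.

```python
# The three characters pasted in comment 1, written as escapes so the
# example does not depend on the editor's font coverage.
text = "\U0001F648\U0001F649\U0001F64A"


def describe(ch):
    """Return (code point label, plane number, UTF-16 code units in hex)."""
    cp = ord(ch)
    plane = cp >> 16  # plane 0 is the BMP, plane 1 is the SMP
    utf16 = ch.encode("utf-16-be")
    units = [
        f"{int.from_bytes(utf16[i:i + 2], 'big'):04X}"
        for i in range(0, len(utf16), 2)
    ]
    return f"U+{cp:04X}", plane, units


for ch in text:
    print(describe(ch))
```

Each character sits in plane 1 and needs a surrogate pair in UTF-16, so a font limited to the Basic Multilingual Plane cannot render it and the system has to fall back to another font.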
Comment 9 Alberto Salvia Novella 2020-01-19 23:09:38 UTC
Result:
https://youtu.be/czPrSYIxhrk
Comment 10 Kimkarshian 2021-03-05 14:17:59 UTC
Hey, I am trying to add Unicode pictographs to the site https://jcpenneyassociatekiosk.live/ . The problem is that I am able to create them, but they are not showing. How can I solve this?
Comment 11 Alberto Salvia Novella 2021-03-05 15:28:19 UTC
You should install:
https://gitlab.com/es20490446e/emoji.conf

together with an emoji font, such as noto-emoji or, even better, twemoji.