Description:
LibreOffice Draw is unable to properly display emojis from at least some PDF files. If you open a file containing emojis, they are displayed as different-colored replacement characters (�). If you export the file as a PDF again and then open the exported file in another viewer, the emojis also show up as replacement characters.

Steps to Reproduce:
Open a certain PDF file in LibreOffice Draw. I will attach a file that reproduces the bug.

Actual Results:
Emoji are displayed as a bunch of colored Unicode Replacement Characters. Like this, but in different colors: ����.

Expected Results:
Emoji are displayed as faces, animals, etc., in the same manner as when the PDF is opened in another viewer.

Reproducible: Always

User Profile Reset: No

OpenGL enabled: Yes

Additional Info:
Version: 7.3.6.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Ubuntu package version: 1:7.3.6-0ubuntu0.22.04.1
Calc: threaded
Created attachment 183041 [details] I wrote a small HTML file that displays a Web page with emojis. I used it to create emoji-test.pdf by opening the HTML file in Firefox and printing it to a file.
Created attachment 183042 [details] Then I printed the HTML file to PDF in Firefox, creating this file.
Created attachment 183043 [details] How the file looks when I open it in Draw
Created attachment 183044 [details] How the file looks when I open it in Atril PDF reader (Correct rendering)
Unfortunately, the HTML file itself got corrupted when I uploaded it. Looks like I hit an encoding bug while trying to report an encoding bug.
Created attachment 183045 [details] What happens when I open emoji-test.pdf in Draw, then export it as a PDF again. Now it looks wrong in any viewer.
I was able to reproduce this on another computer with the LibreOffice profile reset. It was also running LO 7.3.6.2 on Ubuntu. This eliminates the config files as a cause.
I can confirm the problem in:

Version: 7.5.0.0.alpha0+ / LibreOffice Community
Build ID: 55ee3ede2bb0211e895053ed3a54bb1c99cc94ca
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: kf5 (cairo+xcb)
Locale: ru-RU (ru_RU.UTF-8); UI: en-US
Calc: threaded

Okular opens the PDF file correctly.
I am quite sure this is a font substitution issue in the sdext.pdfimport filter. What font are you using for the emoji? The PDF says it uses CairoFont-0-0 and CairoFont-1-1. When opened in Draw, the font name is italic due to bug 143095. I don't have CairoFont installed, so I don't know whether the emoji comes back when you manually set the font to CairoFont.
Created attachment 183256 [details] HTML source of emoji test, gzipped to avoid text encoding issues
(In reply to Kevin Suo from comment #9)
> I am quite sure this is a font substitution issue in the sdext.pdfimport
> filter. What font are you using for the emoji? The pdf says it uses
> CairoFont-0-0 and CairoFont-1-1. When open in Draw, the font name is italic
> due to bug 143095. I don't have CairoFont installed so I don't know whether
> the emoji comes back when you manually set the font to CairoFont.

Are you saying we need to see what happens when we specify the font within the PDF, or when I change the font after opening it?

Changing the font of the text after I open it doesn't help.

I then explored changing the font of the PDF by specifying it in the HTML. This has strange behavior. If I switch the HTML to a serif font, the PDF I make by printing it in Firefox also has a serif font when I open it in my default PDF viewer (Atril). When I open it in Draw, however, it looks the same as before (sans-serif).
This sounds like an encoding issue: possibly the PDF importer is mishandling surrogate pairs, so they end up converted to the replacement character, which is often used to signal encoding errors.
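To illustrate the kind of failure this hypothesis describes, here is a minimal Python sketch. It is not the importer's actual code: `naive_utf16_to_text` is a hypothetical, deliberately buggy converter that treats every 16-bit UTF-16 code unit as a complete character, so each half of a surrogate pair becomes U+FFFD.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def naive_utf16_to_text(data: bytes) -> str:
    """Hypothetical buggy converter: one 16-bit unit = one character."""
    out = []
    for i in range(0, len(data) - 1, 2):
        unit = int.from_bytes(data[i:i + 2], "big")
        # A lone surrogate (U+D800..U+DFFF) is not a valid character on its
        # own, so a converter that ignores pairing can only replace it.
        out.append(REPLACEMENT if 0xD800 <= unit <= 0xDFFF else chr(unit))
    return "".join(out)


# "A" followed by U+1F600 (a face emoji), encoded as UTF-16BE.
sample = "A\U0001F600".encode("utf-16-be")

print(naive_utf16_to_text(sample) == "A\ufffd\ufffd")    # True: emoji lost
print(sample.decode("utf-16-be") == "A\U0001F600")       # True: emoji kept
```

A characters outside the Basic Multilingual Plane always needs two code units in UTF-16, which is why emoji are a common place for this class of bug to surface.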
The PDF is actually broken: the ToUnicode CMap of the emoji font embedded in the PDF maps everything to U+FFFD (�), so the text representation is unrecoverable.
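A check along these lines can be sketched in Python. This is only an illustration under stated assumptions: it expects the ToUnicode CMap stream to have already been extracted and decompressed into text by some other tool, it only reads `beginbfchar` blocks (not `beginbfrange`), and the two sample CMaps are made up for the example rather than taken from the attached PDF. The function names are hypothetical.

```python
import re

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def tounicode_targets(cmap_text: str) -> list[str]:
    """Return the decoded destination of every bfchar entry in the CMap."""
    targets = []
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        pairs = re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block)
        for _src, dst in pairs:
            # ToUnicode destinations are UTF-16BE code units written as hex.
            targets.append(bytes.fromhex(dst).decode("utf-16-be", errors="replace"))
    return targets


def maps_everything_to_replacement(cmap_text: str) -> bool:
    """True if the CMap has entries and all of them point at U+FFFD."""
    targets = tounicode_targets(cmap_text)
    return bool(targets) and all(t == REPLACEMENT for t in targets)


# Made-up samples (assumptions, not the contents of emoji-test.pdf).
BROKEN_CMAP = """
2 beginbfchar
<0001> <FFFD>
<0002> <FFFD>
endbfchar
"""
HEALTHY_CMAP = """
1 beginbfchar
<0001> <D83DDE00>
endbfchar
"""

print(maps_everything_to_replacement(BROKEN_CMAP))   # True
print(maps_everything_to_replacement(HEALTHY_CMAP))  # False
```

When every glyph code points at U+FFFD, a viewer can still draw the embedded glyphs, but any consumer that relies on the text layer has no way to tell which emoji was meant.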
Is this a bug in Firefox or the font used?
(In reply to Thomas Szymczak from comment #14)
> Is this a bug in Firefox or the font used?

Most likely Firefox.