Bug 163000 - Wrong UTF-8 encoded Chinese characters (Skia)
Summary: Wrong UTF-8 encoded Chinese characters (Skia)
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 24.8.1.2 release
Hardware: All macOS (All)
Importance: medium normal
Assignee: Patrick (volunteer)
URL:
Whiteboard: target:25.2.0 target:24.2.7 target:24...
Keywords:
Depends on:
Blocks: Skia-macOS
Reported: 2024-09-17 02:31 UTC by elian
Modified: 2024-10-02 21:26 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Sample Calc document with Chinese UTF-8 display bug (273.76 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-09-17 13:37 UTC, elian
Snapshot on macOS Sequoia (1.31 MB, image/png)
2024-09-17 21:15 UTC, Patrick (volunteer)
Snapshot of TextEdit toolbar (20.14 KB, image/png)
2024-09-17 22:19 UTC, elian
Adding Liberation Sans to system solves tab display, but dialog box still incorrect (159.60 KB, image/png)
2024-09-25 18:58 UTC, elian
Displays correctly when Skia is disabled. (791.07 KB, image/png)
2024-09-26 03:50 UTC, Ken Chou
Displays incorrectly when Skia is enabled. (854.88 KB, image/png)
2024-09-26 03:50 UTC, Ken Chou
Skia enabled, with forced software rendering selected (351.00 KB, image/png)
2024-09-26 17:41 UTC, elian
PingFang SC draws wrong glyphs running Skia with "這是中文 Hello" text in Calc cell (56.18 KB, image/png)
2024-09-26 20:35 UTC, Patrick (volunteer)
Calc document for testing PingFang SC font (11.23 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-10-02 21:23 UTC, Patrick (volunteer)
Calc document for testing DejaVu Sans font (11.46 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-10-02 21:24 UTC, Patrick (volunteer)
Calc document for testing Hiragino Sans W3 font (11.42 KB, application/vnd.oasis.opendocument.spreadsheet)
2024-10-02 21:25 UTC, Patrick (volunteer)

Description elian 2024-09-17 02:31:41 UTC
Description:
Tabs do not display Chinese correctly anymore. When I edit the tab that is displaying incorrect characters, and copy the string to Writer, the correct characters are displayed.

This could be a general UTF-8 bug since the wrong characters are displayed regardless of whether I input traditional or simplified Chinese characters.

Steps to Reproduce:
1. Enter a mixed English and Chinese string for a tab name.
2. Save, and observe that the characters are incorrect.
3. Edit tab name, select string, copy to a blank writer document.
4. Observe that the characters are correct.

Actual Results:
Wrong characters displayed.

Expected Results:
Calc tab names should be displaying the characters that were entered.


Reproducible: Always


User Profile Reset: No

Additional Info:
Version: 24.8.1.2 (AARCH64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 10; OS: macOS 15.0; UI render: Skia/Raster; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 1 Xisco Faulí 2024-09-17 07:29:52 UTC
Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. 
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
(Please note that the attachment will be public, remove any sensitive information before attaching it. 
See https://wiki.documentfoundation.org/QA/FAQ#How_can_I_eliminate_confidential_data_from_a_sample_document.3F for help on how to do so.)
Comment 2 Ming Hua 2024-09-17 12:36:49 UTC
Cannot reproduce on Windows 11 with
Version: 24.8.1.2 (X86_64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 12; OS: Windows 11 X86_64 (10.0 build 22631); UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: en-US
Calc: CL threaded

I could rename the sheet to Chinese names and the tab displays the Chinese characters just fine.

I suspect this is locale related (as the reporter was using en-US locale) and/or macOS specific.
Comment 3 elian 2024-09-17 13:37:44 UTC
Created attachment 196506 [details]
Sample Calc document with Chinese UTF-8 display bug

I have also embedded snapshots of the dialog box immediately after Chinese text entry and resulting tab.
Comment 4 elian 2024-09-17 13:42:36 UTC
@Ming Hua Yes, my locale is set to en_US.UTF-8. This display bug is new, though I'm not sure when it was introduced. Prior versions of LO were just fine.

The use case is obvious... UTF-8 should display correctly regardless of locale setting, as long as it is a UTF-8 locale. In my case, I work with the English UI, but enter Chinese sometimes.
Comment 5 Patrick (volunteer) 2024-09-17 21:15:01 UTC
Created attachment 196516 [details]
Snapshot on macOS Sequoia
Comment 6 Patrick (volunteer) 2024-09-17 21:24:38 UTC
(In reply to Patrick (volunteer) from comment #5)
> Created attachment 196516 [details]
> Snapshot on macOS Sequoia

I cannot reproduce this on my Silicon Mac running macOS Sequoia (see snapshot in attachment #196516 [details]).

AFAICT, the image of the Rename dialog in attachment #196506 [details] appears to be using a different font than I see in the Rename dialog so maybe the bug isn't due to UTF-8 encoding, but is due to a bug in LibreOffice's text layout.

Not sure how to determine which font LibreOffice is using for the Rename dialog and tabs. If you copy the text in the Rename dialog to a new, empty document in the TextEdit application and then put the cursor in the middle of the pasted text, what font does TextEdit show in its toolbar area? On my machine, TextEdit sets the font to Hiragino Sans W3.
Comment 7 elian 2024-09-17 22:19:39 UTC
Created attachment 196518 [details]
Snapshot of TextEdit toolbar

Snapshot of TextEdit toolbar area, MacOS Sequoia
Comment 8 elian 2024-09-17 22:21:36 UTC
@Patrick, my TextEdit toolbar area shows PingFang TC. Interesting that we show different fonts.
Comment 9 Patrick (volunteer) 2024-09-17 23:14:44 UTC
(In reply to elian from comment #8)
> @Patrick, my TextEdit toolbar area shows PingFang TC. Interesting that we
> show different fonts.

I agree. We both are running the same version of LibreOffice on Sequoia on Mac Silicon yet we see different fonts for Rename dialog and tabs. So I wonder what is different between our machines?

Are we using the same macOS Sequoia beta version? I am running the latest available macOS Sequoia Public Beta with build number 24A335. Are you running a different Public beta or Developer beta build?

BTW, here is a copy of my LibreOffice > About info:

Version: 24.8.1.2 (AARCH64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 8; OS: macOS 15.0; UI render: Skia/Metal; VCL: osx
Locale: en-CA (en_CA.UTF-8); UI: en-US
Calc: threaded
Comment 10 elian 2024-09-18 00:24:00 UTC
Patrick, my About info says:

Version: 24.8.1.2 (AARCH64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 10; OS: macOS 15.0; UI render: Skia/Raster; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

So same build, but different locales.
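
When comparing About info between machines, a small script can list every field that differs rather than relying on a visual scan. The following is only a rough sketch in Python; the function names are made up for illustration and it assumes the "key: value" layout, with ";" between fields on one line, seen in the blocks pasted in this report.

```python
from __future__ import annotations


def parse_about_info(text: str) -> dict[str, str]:
    """Split a pasted About block into a dict of field name -> value."""
    fields: dict[str, str] = {}
    for line in text.strip().splitlines():
        # A single line may carry several "key: value" pairs separated by ';'.
        for part in line.split(";"):
            key, sep, value = part.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
    return fields


def diff_about_info(a: str, b: str) -> dict[str, tuple[str | None, str | None]]:
    """Return only the fields whose values differ between two About blocks."""
    fa, fb = parse_about_info(a), parse_about_info(b)
    return {
        key: (fa.get(key), fb.get(key))
        for key in sorted(fa.keys() | fb.keys())
        if fa.get(key) != fb.get(key)
    }


# The two About blocks quoted in comments 9 and 10.
ABOUT_COMMENT_10 = """
Version: 24.8.1.2 (AARCH64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 10; OS: macOS 15.0; UI render: Skia/Raster; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
"""

ABOUT_COMMENT_9 = """
Version: 24.8.1.2 (AARCH64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 8; OS: macOS 15.0; UI render: Skia/Metal; VCL: osx
Locale: en-CA (en_CA.UTF-8); UI: en-US
Calc: threaded
"""

if __name__ == "__main__":
    for field, (mine, theirs) in diff_about_info(ABOUT_COMMENT_10, ABOUT_COMMENT_9).items():
        print(f"{field}: {mine!r} vs {theirs!r}")
```

For these two blocks the differing fields are the CPU thread count, the locale, and the UI render mode (Skia/Raster on one machine, Skia/Metal on the other).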
Comment 11 Ming Hua 2024-09-18 02:53:43 UTC
(In reply to elian from comment #0)
> Reproducible: Always
> 
> User Profile Reset: No
> 
@elian, please first make sure you can reproduce this bug in safe mode.

Safe mode can be accessed via menu "Help > Restart in Safe Mode..."
Comment 12 elian 2024-09-18 04:03:01 UTC
@Ming Hua the bug does NOT appear in safe mode. I do not have any extensions installed. I have tried resetting to factory settings, no joy.
Comment 13 Patrick (volunteer) 2024-09-18 21:14:47 UTC
(In reply to elian from comment #12)
> @Ming Hua the bug does NOT appear in safe mode. I do not have any extensions
> installed. I have tried resetting to factory settings, no joy.

This is just a wild guess, but I wonder if you have a duplicate or corrupted font on your machine.

If you launch the /Applications/Font Book application, select "All Fonts" in the left sidebar, and select the File > Resolve Duplicates menu item, are any fonts marked as duplicates? If yes, do you see any change if you deactivate the duplicate fonts and restart LibreOffice?

Also, if you select "All Fonts", click on a font, select the Edit > Select All menu item, and then select the File > Validate Selection menu item, are any fonts listed as having problems? If yes, do you see any change if you deactivate the fonts with problems and restart LibreOffice?
Comment 14 elian 2024-09-19 00:11:13 UTC
@Patrick, I don't see any font issues at all, using your suggested steps.

So I think this appears to be some kind of bug that crops up when not in safe mode. I'll try to dig down a bit more to see if I can isolate the source of the problem.
Comment 15 elian 2024-09-25 18:50:58 UTC
I think the problem is with LO's font substitution system. The default font in Calc is Liberation Sans, but I did not have it installed on my Mac. So it substituted something else.

On a hunch, I downloaded and installed Liberation Sans into my system. This seems to have solved the problem!
Comment 16 elian 2024-09-25 18:58:19 UTC
Created attachment 196693 [details]
Adding Liberation Sans to system solves tab display, but dialog box still incorrect

I spoke too soon. Adding Liberation Sans to my system does solve the tab display problem, but the dialog box is still wrong. Snapshot shows the state of the tab and dialog box.
Comment 17 Ken Chou 2024-09-26 03:50:18 UTC
Created attachment 196705 [details]
Displays correctly when Skia is disabled.
Comment 18 Ken Chou 2024-09-26 03:50:45 UTC
Created attachment 196706 [details]
Displays incorrectly when Skia is enabled.
Comment 19 Ken Chou 2024-09-26 03:55:37 UTC
I have the same issue with Chinese UTF-8 characters displaying as garbled text. 
I discovered that it seems to be caused by Skia:
 - When I disable Skia, the issue disappears.  attachment: https://bugs.documentfoundation.org/attachment.cgi?id=196705
 - When I enable Skia, the issue reappears. attachment: https://bugs.documentfoundation.org/attachment.cgi?id=196706
Please see the attached file.


My Version info:

Version: 24.8.1.2 (AARCH64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 10; OS: macOS 15.0; UI render: default; VCL: osx
Locale: zh-CN (zh_Hans.UTF-8); UI: en-US
Calc: threaded


Note the UI render information.
Comment 20 elian 2024-09-26 04:11:11 UTC
Ken, you are right. Disabling Skia rendering stops the UTF-8 display issue, not just on the tabs, but also in the cells of a spreadsheet. That's it!
Comment 21 Ming Hua 2024-09-26 04:34:33 UTC
Set as NEW according to comment 19.
Comment 22 Patrick (volunteer) 2024-09-26 13:47:45 UTC
What happens if you check both of the following Skia checkboxes and restart?:

- Use Skia for all rendering
- Force Skia software rendering

Checking the second checkbox runs Skia in a software rendering mode called Skia/Raster, whereas checking only the first checkbox runs Skia on your machine's GPU in a mode called Skia/Metal.

Do you see the bug when running Skia/Raster (i.e. both checkboxes checked)? Or do you only see the bug with Skia/Metal?
Comment 23 elian 2024-09-26 17:41:10 UTC
Created attachment 196729 [details]
Skia enabled, with forced software rendering selected

No change to rendering error when Skia is enabled with forced software rendering.

Version: 24.8.1.2 (AARCH64) / LibreOffice Community
Build ID: 87fa9aec1a63e70835390b81c40bb8993f1d4ff6
CPU threads: 10; OS: macOS 15.0; UI render: Skia/Raster; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 24 Patrick (volunteer) 2024-09-26 20:35:56 UTC
Created attachment 196733 [details]
PingFang SC draws wrong glyphs running Skia with "這是中文 Hello" text in Calc cell

I found that if I change the font to PingFang SC in cell A2 in attachment #196506 [details], I see the cell's "這是中文 Hello" text become mangled. Like the other commenters found, PingFang SC font only does this when running with Skia/Metal or Skia/Raster. If Skia is disabled, the PingFang SC draws the text correctly.

So now the question is why Skia cannot handle drawing the PingFang SC glyphs. Since that font draws correctly when Skia is disabled, the font is likely not the problem and there is a bug somewhere deep in the Skia source code.

When I have some spare time, I will see if I can find a possible cause for this bug.
Comment 25 Patrick (volunteer) 2024-09-26 20:59:44 UTC
(In reply to Patrick (volunteer) from comment #24)
> Created attachment 196733 [details]
> PingFang SC draws wrong glyphs running Skia with "這是中文 Hello" text in Calc
> cell
> 
> I found that if I change the font to PingFang SC in cell A2 in attachment
> #196506 [details], I see the cell's "這是中文 Hello" text become mangled. Like
> the other commenters found, PingFang SC font only does this when running
> with Skia/Metal or Skia/Raster. If Skia is disabled, the PingFang SC draws
> the text correctly.

Note for myself: PingFang TC has the same bug but PingFang HK does not. Also, when exporting to PDF, any PingFang SC or TC text is not shown in the exported PDF.
Comment 26 Patrick (volunteer) 2024-09-28 20:59:36 UTC
I think I have found the root cause of this bug: there are duplicate PingFang fonts bundled with macOS Sequoia. One set of PingFang fonts is a downloaded Type 3 bitmap font file (i.e. the PingFang fonts shown in the Font Book application) and the other set is a pre-installed TrueType font.

So what I think is happening is that LibreOffice is getting the PingFang SC/TC/HK/MO glyph numbers for a given text from the Type 3 bitmap font but then when using Skia or exporting to PDF, LibreOffice tries to extract the bitmaps for those glyph numbers from the TrueType font.

Not sure how to debug and fix this yet. But for now, here are the details of the two font files:

Type 3 bitmap font file:
    /System/Library/AssetsV2/com_apple_MobileAsset_Font7/3419f2a427639ad8c8e139149a287865a90fa17e.asset/AssetData/PingFang.ttc
    Font Descriptor Downloadable/ed: 1 1
    Font Descriptor Format Enum: 1

TrueType font file:
    /System/Library/PrivateFrameworks/FontServices.framework/Resources/Reserved/PingFangUI.ttc
    Font Descriptor Downloadable/ed: 0 0
    Font Descriptor Format Enum: 3
Comment 27 Patrick (volunteer) 2024-09-29 12:32:52 UTC
I have a fix for this bug in the following patch. I am just waiting for the patch to be reviewed:

https://gerrit.libreoffice.org/c/core/+/174160

I found that macOS Sequoia added the following new system font that contains an alternate set of TrueType PingFang fonts. Unfortunately, LibreOffice cannot handle this font when drawing using Skia or when exporting to PDF. macOS Sequoia already has a separate set of bitmap PingFang fonts so skip this new font so that the bitmap fonts are used:

/System/Library/PrivateFrameworks/FontServices.framework/Resources/Reserved/PingFangUI.ttc
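
For readers who want to see the idea of the path-based check in isolation, here is a minimal sketch in Python. It is not the code in the Gerrit change (LibreOffice is C++), and the function names are hypothetical; the only details taken from this report are the two font paths.

```python
from pathlib import PurePosixPath

# System "reserved fonts" folder named in this report. Hard-coding it is an
# assumption of this sketch; the location could change in later macOS releases.
RESERVED_FONT_DIR = PurePosixPath(
    "/System/Library/PrivateFrameworks/FontServices.framework/Resources/Reserved"
)


def is_reserved_system_font(font_path: str) -> bool:
    """Return True if the font file lives anywhere under the reserved folder."""
    return RESERVED_FONT_DIR in PurePosixPath(font_path).parents


def usable_fonts(font_paths: list[str]) -> list[str]:
    """Drop reserved fonts so that other copies of the same family are used."""
    return [p for p in font_paths if not is_reserved_system_font(p)]


if __name__ == "__main__":
    candidates = [
        "/System/Library/PrivateFrameworks/FontServices.framework/Resources/Reserved/PingFangUI.ttc",
        "/System/Library/AssetsV2/com_apple_MobileAsset_Font7/3419f2a427639ad8c8e139149a287865a90fa17e.asset/AssetData/PingFang.ttc",
    ]
    # Only the AssetsV2 copy of PingFang should remain.
    print(usable_fonts(candidates))
```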
Comment 28 Commit Notification 2024-09-29 20:57:47 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/8f3e84133628c420b7cc9896d6e92e2d66eae0b2

tdf#163000 don't add any fonts in the system "reserved fonts" folder

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 29 Patrick (volunteer) 2024-09-29 20:59:56 UTC
I have committed a fix for this bug. The fix should be in tomorrow's (30 September 2024) nightly master builds:

https://dev-builds.libreoffice.org/daily/master/current.html

Note for macOS testers: the nightly master build installer does not overwrite any LibreOffice official versions. Instead, it will be installed as a separate application called "LibreOfficeDev" in the /Applications folder.

Because this is a "test" build, you will need to do the following steps before you launch the LibreOfficeDev application:

1. Go to the Finder and navigate to the /Applications/Utilities folder
2. Launch the "Terminal" application
3. Paste the following command in the Terminal application window and press the Return key to execute the command:

   xattr -d com.apple.quarantine /Applications/LibreOfficeDev.app
Comment 30 Commit Notification 2024-09-30 01:25:16 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "libreoffice-24-2":

https://git.libreoffice.org/core/commit/1b83c52c0de927e6eb5c6f03eecc6e477c8a36ea

tdf#163000 don't add any fonts in the system "reserved fonts" folder

It will be available in 24.2.7.

Comment 31 ⁨خالد حسني⁩ 2024-09-30 10:46:08 UTC
This font does not have glyf, CFF, or CFF2 tables (the three glyph outline tables in OpenType). It seems to have a new and undocumented hvgl table. I guess when Skia is used it does the rendering on its own and it can’t handle this table, while when it is not used CoreText/CoreGraphics are used and they know how to handle it. See: https://gitlab.freedesktop.org/freetype/freetype/-/issues/1281

I suggest checking for this table specifically instead of hard coding file path as that might change.
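
To illustrate what such a table check involves, here is a rough Python sketch that reads the table directory of a font file and looks for the hvgl tag. It assumes the standard OpenType layout (a 12-byte offset table followed by 16-byte table records, and a "ttcf" header with per-font offsets for collections). The function names are hypothetical, it does no validation of truncated or malformed files, and it is not meant to mirror any actual LibreOffice code.

```python
from __future__ import annotations

import struct


def _table_tags(data: bytes, offset: int) -> set[bytes]:
    """Collect the 4-byte table tags of the font whose offset table starts at `offset`."""
    # Offset table: uint32 sfntVersion, uint16 numTables, then three more uint16 fields.
    (num_tables,) = struct.unpack_from(">H", data, offset + 4)
    tags = set()
    for i in range(num_tables):
        # Each table record is 16 bytes: tag, checksum, offset, length.
        record = offset + 12 + 16 * i
        tags.add(data[record:record + 4])
    return tags


def font_table_tags(data: bytes) -> list[set[bytes]]:
    """Return one set of table tags per font (several for a .ttc collection)."""
    if data[:4] == b"ttcf":
        # TTC header: tag, uint16 major, uint16 minor, uint32 numFonts, then offsets.
        (num_fonts,) = struct.unpack_from(">I", data, 8)
        offsets = struct.unpack_from(f">{num_fonts}I", data, 12)
        return [_table_tags(data, off) for off in offsets]
    return [_table_tags(data, 0)]


def has_hvgl_table(data: bytes) -> bool:
    """True if any font in the file declares an 'hvgl' table."""
    return any(b"hvgl" in tags for tags in font_table_tags(data))


if __name__ == "__main__":
    import sys

    # Usage: python check_hvgl.py /path/to/font.ttc
    with open(sys.argv[1], "rb") as f:
        print(has_hvgl_table(f.read()))
```

Checking the table directory only needs the first few hundred bytes of each font, so a check like this does not depend on where the file is installed.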
Comment 32 Patrick (volunteer) 2024-09-30 13:16:24 UTC
(In reply to ⁨خالد حسني⁩ from comment #31)
> This font does not have glyf, CFF, or CFF2 tables (the three glyph outline
> tables in OpenType). It seems to have a new and undocumented hvgl table. I
> guess when skia is used it does the rendering on its own and it can’t handle
> this table, while when it is not used CoreText/CoreGraphics are used and
> they know how to handle it. See:
> https://gitlab.freedesktop.org/freetype/freetype/-/issues/1281
> 
> I suggest checking for this table specifically instead of hard coding file
> path as that might change.

Thank you for the analysis. I'll work on a patch that checks whether the hvgl table exists in the font instead of checking the path of the font.

A side benefit of your approach is that it should also detect any problematic fonts included in iOS 18 without me needing to find and hardcode the path for such fonts.
Comment 33 Patrick (volunteer) 2024-10-01 00:07:32 UTC
OK. I have uploaded a patch that implements Khaled's suggestion:

https://gerrit.libreoffice.org/c/core/+/174305
Comment 34 Commit Notification 2024-10-02 12:55:10 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/a4e9584c554ea018691b2c97d38cce3d83f8ea9a

tdf#163000 don't add any fonts with an 'hvgl' font table

It will be available in 25.2.0.

Comment 35 Patrick (volunteer) 2024-10-02 13:04:00 UTC
I have committed my rewritten fix for this bug. The fix should be in tomorrow's (03 October 2024) nightly master builds:

https://dev-builds.libreoffice.org/daily/master/current.html

Note for macOS testers: the nightly master build installer does not overwrite any LibreOffice official versions. Instead, it will be installed as a separate application called "LibreOfficeDev" in the /Applications folder.

Because this is a "test" build, you will need to do the following steps before you launch the LibreOfficeDev application:

1. Go to the Finder and navigate to the /Applications/Utilities folder
2. Launch the "Terminal" application
3. Paste the following command in the Terminal application window and press the Return key to execute the command:

   xattr -d com.apple.quarantine /Applications/LibreOfficeDev.app
Comment 36 Commit Notification 2024-10-02 14:00:20 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/core/commit/a6c02c2bdebc196e5e7113aecbfd8d2debf4bb06

tdf#163000 don't add any fonts with an 'hvgl' font table

It will be available in 24.8.3.

Comment 37 Commit Notification 2024-10-02 15:48:35 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "libreoffice-24-2":

https://git.libreoffice.org/core/commit/2a30aa03b63cf1598a16f3fa165f06cafc9ec6fa

tdf#163000 don't add any fonts with an 'hvgl' font table

It will be available in 24.2.7.

Comment 38 Patrick (volunteer) 2024-10-02 21:23:54 UTC
Created attachment 196856 [details]
Calc document for testing PingFang SC font
Comment 39 Patrick (volunteer) 2024-10-02 21:24:21 UTC
Created attachment 196857 [details]
Calc document for testing DejaVu Sans font
Comment 40 Patrick (volunteer) 2024-10-02 21:25:24 UTC
Created attachment 196858 [details]
Calc document for testing Hiragino Sans W3 font
Comment 41 Patrick (volunteer) 2024-10-02 21:26:33 UTC
Just for completeness, I have attached the files that I used to test and debug this bug.