Bug 167364 - Khitan, Tangut and Nüshu should be treated as Asian texts
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 25.8.0.0 alpha0+
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: CJK
Reported: 2025-07-03 09:23 UTC by Volga
Modified: 2025-07-03 17:07 UTC

See Also:
Crash report or crash signature:


Description Volga 2025-07-03 09:23:46 UTC
Description:
Since these scripts are already encoded in Unicode, and all of them were inspired by Chinese characters, it is reasonable to treat them as Asian text.

Unicode 9.0 describes the following addition:

Tangut is a very large siniform ideographic script. It is the first siniform ideographic script encoded after the Han (CJK) script. Its implementation requires technology support similar to that used for CJK, including very large fonts and radical/stroke input methods. Special adjustments have also been made to the Unicode Collation Algorithm to account for the introduction of another large ideographic repertoire.

Unicode 10.0 describes the following addition:

NushuSources.txt. This file contains normative information on the source references for Nüshu characters. The file format is similar to the format of the Unihan data files and TangutSources.txt. Implementations which support that format for Unihan or Tangut data should be able to add support for Nüshu data in a similar manner.

Unicode 13.0 describes the following addition:

The Khitan Small Script is a new ideographic script, encoded for the first time in Version 13.0.0. This is the fourth ideographic script (after Han, Tangut, and Nushu) to use the range notation in UnicodeData.txt. 

So it is clear that these scripts should be handled like other CJK text; at the very least, they should be covered by the Asian Text Font properties.
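
To illustrate the request, here is a minimal sketch of a block-based check that puts all three scripts in the Asian class. It is not LibreOffice's actual code: the names ScriptClass, BlockRange, kSiniformBlocks and classifySiniform are hypothetical, and the block ranges are quoted from the Unicode block assignments as I understand them, so they should be checked against Blocks.txt before being relied on.

```cpp
// Hypothetical two-way classification, used only for this illustration.
enum class ScriptClass { Asian, Other };

// One contiguous Unicode block (inclusive bounds).
struct BlockRange {
    char32_t first;
    char32_t last;
    const char* name;
};

// Assumed block ranges for the siniform scripts named in this report.
static const BlockRange kSiniformBlocks[] = {
    {0x17000, 0x187FF, "Tangut"},
    {0x18800, 0x18AFF, "Tangut Components"},
    {0x18B00, 0x18CFF, "Khitan Small Script"},
    {0x1B170, 0x1B2FF, "Nushu"},
};

// Returns Asian when the code point falls in one of the blocks above,
// Other for everything else (this sketch does not classify Han, Kana, etc.).
ScriptClass classifySiniform(char32_t cp)
{
    for (const BlockRange& r : kSiniformBlocks) {
        if (cp >= r.first && cp <= r.last)
            return ScriptClass::Asian;
    }
    return ScriptClass::Other;
}
```

With a check of this kind, text in any of the three scripts would pick up the Asian font settings rather than the Western ones.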

Steps to Reproduce:
N/A

Actual Results:
N/A

Expected Results:
N/A


Reproducible: Always


User Profile Reset: No

Additional Info:
https://www.unicode.org/versions/enumeratedversions.html
https://en.wikipedia.org/wiki/Chinese_family_of_scripts
Comment 1 V Stuart Foote 2025-07-03 17:07:00 UTC
Didn't this get a tweak to set the Tangut and Khitan scripts to ODF ScriptType::ASIAN[1][2] as of the 7.6 release?

So only Nüshu needs the same.

Or is there something more involved, such as TBRL, TBLR, or some BT oddity, that needs handling in the edit engines/doc shells?

=-ref-=
[1] https://opengrok.libreoffice.org/xref/core/i18nutil/source/utility/unicode.cxx?r=58a7c6ccfd3fa590460dba1ecbdef4483dcd5e08#284
[2] https://gerrit.libreoffice.org/c/core/+/153372