Description:
isCJKIVSCharacter needs to support CJK Unified Ideographs Extension blocks C to H for Unicode 15

I'm curious about CJK characters. (I'm Korean, but I have studied and can speak a little Japanese and Mandarin Chinese.)

After I contributed support for Unicode 15's CJK Unified Ideographs Extension H to GNOME Characters, I checked which CJK Unified Ideographs extension blocks LibreOffice lists.
(GNOME Characters commit link: https://gitlab.gnome.org/GNOME/gnome-characters/-/commit/daef901e34d731d6d8fe8a1f966ea9f1f04e3a2f )

However, LibreOffice doesn't support CJK Unified Ideographs Extension blocks C to H. It only supports CJK Unified Ideographs and Extension blocks A and B.

In Unicode 15, the CJK Unified Ideographs block ranges are:
CJK Unified Ideographs: 4E00–9FFF
CJK Unified Ideographs Extension A: 3400–4DBF
CJK Unified Ideographs Extension B: 20000–2A6DF
CJK Unified Ideographs Extension C: 2A700–2B73F
CJK Unified Ideographs Extension D: 2B740–2B81F
CJK Unified Ideographs Extension E: 2B820–2CEAF
CJK Unified Ideographs Extension F: 2CEB0–2EBEF
CJK Unified Ideographs Extension G: 30000–3134F
CJK Unified Ideographs Extension H: 31350–323AF
Ref: https://www.unicode.org/versions/Unicode15.0.0/ch18.pdf

I installed the IPAmj Font Character Finder extension in LibreOffice.
Link: https://extensions.libreoffice.org/en/extensions/show/1077
Japanese MJ character information table Ver.066.01:
https://moji.or.jp/mojikiban/mjlist/
https://moji.or.jp/mojikiban/font/

IPAmj Font Character Finder can look up characters in the CJK Unified Ideographs Extension B or higher ranges, such as "𫟘󠄂" (U+2B7D8), which is located in CJK Unified Ideographs Extension D.
https://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=%F0%AB%9F%98

So, for LibreOffice's CJK users, we need to support CJK Unified Ideographs Extension blocks C to H for Unicode 15.

Steps to Reproduce:
1. Install the IPAmj Font Character Finder extension in LibreOffice. Link: https://extensions.libreoffice.org/en/extensions/show/1077
2. Input 'MJ060164' as the MJ code or '2B7D8' as the UCS code.
3. If the selected CJK font supports CJK Unified Ideographs Extension blocks C to H, it can show.

Actual Results:
If the selected CJK font supports CJK Unified Ideographs Extension blocks C to H, it can show.

Expected Results:
If the selected CJK font supports CJK Unified Ideographs Extension blocks C to H, it can show.

Reproducible: Always

User Profile Reset: No

Additional Info:
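For illustration, the block ranges listed in the report can be turned into a simple table lookup. This is only a minimal sketch: `CjkBlock`, `kCjkUnifiedBlocks`, `cjkUnifiedBlockName` and `isCjkUnifiedIdeograph` are hypothetical names and do not reflect how isCJKIVSCharacter is actually written in LibreOffice.

```cpp
#include <string>

// Hypothetical helper for illustration only; this is NOT the actual
// isCJKIVSCharacter from include/i18nutil/unicode.hxx.
struct CjkBlock {
    char32_t first;   // first code point of the block
    char32_t last;    // last code point of the block (inclusive)
    const char* name; // short label used in this sketch
};

// Ranges copied from the Unicode 15 list in the report.
static const CjkBlock kCjkUnifiedBlocks[] = {
    { 0x4E00,  0x9FFF,  "CJK Unified Ideographs" },
    { 0x3400,  0x4DBF,  "Extension A" },
    { 0x20000, 0x2A6DF, "Extension B" },
    { 0x2A700, 0x2B73F, "Extension C" },
    { 0x2B740, 0x2B81F, "Extension D" },
    { 0x2B820, 0x2CEAF, "Extension E" },
    { 0x2CEB0, 0x2EBEF, "Extension F" },
    { 0x30000, 0x3134F, "Extension G" },
    { 0x31350, 0x323AF, "Extension H" },
};

// Returns the label of the block containing cp, or an empty string
// if cp lies outside every listed block.
inline std::string cjkUnifiedBlockName(char32_t cp) {
    for (const CjkBlock& block : kCjkUnifiedBlocks) {
        if (cp >= block.first && cp <= block.last)
            return block.name;
    }
    return std::string();
}

// True if cp falls in any of the listed CJK Unified Ideographs blocks.
inline bool isCjkUnifiedIdeograph(char32_t cp) {
    return !cjkUnifiedBlockName(cp).empty();
}
```

With this table, U+2B7D8 from the reproduction steps resolves to Extension D, while a check limited to the base block and Extensions A and B would reject it.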
@Khaled, *,

Not sure what is meant by "support" here.

I assume that if LO receives the Unicode code point and a font with coverage is selected, it renders to the LO document canvas, and that the document can be saved and printed the same way.

So is this an IME question or an inability to render? Does our Special Character dialog render a chart of these codepoints?

Or is this more simply an issue against the "IPAmj Font Charactor Finder" extension [1]?

=-ref-=
[1] https://extensions.libreoffice.org/en/extensions/show/1077
(In reply to V Stuart Foote from comment #1)
> =-ref-=
> [1] https://extensions.libreoffice.org/en/extensions/show/1077

This is from the extension's page (via Google Translate):

Description
The IPAmj font is one of the fonts developed and distributed by the Information-technology Promotion Agency (IPA), an Independent Administrative Institution. It is a huge font set containing approximately 60,000 characters, including variant characters, used for personal names in municipalities throughout Japan.

This extension can search IPAmj fonts by various items, including variant characters (IVS), and paste them into documents via the clipboard.

- The IPAmj Mincho font must be installed on the system.
- The document must be formatted with the IPAmj Mincho font.

The MJ character information list included in this extension is distributed by the following organizations, and CC-BY-SA is applied.

IPA Information-technology Promotion Agency
https://mojikiban.ipa.go.jp/1311.html
Character Information Technology Promotion Council
https://moji.or.jp/mojikiban/mjlist/
I read the commit log for the related source code (include/i18nutil/unicode.hxx).
https://git.libreoffice.org/core/+/c1399e497191f295b9c3db95d126ff6a4fa5891d%5E%21

```
(e.g., later versions of Unicode have added CJK Extension C--F code blocks, which the current implementation of isCJKIVSCharacter does not reflect)
```

Unicode 15 is now the current version, and IVS characters already exist in the CJK Extension C-H blocks.

In the LibreOffice source (include/i18nutil/unicode.hxx), IVS is only supported up to the CJK Extension B block. However, the extensions no longer stop at Extension B; as of Unicode 15 they go up to Extension H.

So, for LibreOffice's CJK users, we need to support CJK Unified Ideographs Extension blocks C to H for Unicode 15.

The 'IVS Test' repo (https://github.com/adobe-fonts/ivs-test ) describes its coverage as follows:

```
This font supports all current and future CJK Unified Ideographs by covering entire blocks: U+3400 through U+4DBF (Extension A), U+4E00 through U+9FFF (URO), U+FA0E, U+FA0F, U+FA11, U+FA13, U+FA14, U+FA1F, U+FA21, U+FA23, U+FA24, U+FA27 through U+FA29 (CJK Unified Ideographs in the CJK Compatibility Ideographs block), U+20000 through U+2A6DF (Extension B), U+2A700 through U+2F7FF (Extension C, Extension D, Extension E, Extension F and beyond), U+2FA20 through U+2FFFD (the end of Plane 2), and U+30000 through U+3FFFD (Extension G and the remainder of Plane 3).
```
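As a rough sketch of what a forward-compatible check could look like, the code below treats a (base, selector) pair as an IVS candidate when the base lies in the ranges quoted from the IVS Test README and the selector is in the Variation Selectors Supplement (U+E0100 through U+E01EF, the range UTS #37 uses for ideographic variation sequences). All function names here are hypothetical, and this is not the actual LibreOffice implementation.

```cpp
// Hypothetical sketch; the names below are illustrative, not LibreOffice API.

inline bool inRange(char32_t cp, char32_t first, char32_t last) {
    return cp >= first && cp <= last;
}

// Base-character ranges as quoted from the adobe-fonts/ivs-test README.
inline bool isIdeographicBase(char32_t cp) {
    switch (cp) {
        // CJK Unified Ideographs inside the CJK Compatibility Ideographs block.
        case 0xFA0E: case 0xFA0F: case 0xFA11: case 0xFA13:
        case 0xFA14: case 0xFA1F: case 0xFA21: case 0xFA23:
        case 0xFA24: case 0xFA27: case 0xFA28: case 0xFA29:
            return true;
        default:
            break;
    }
    return inRange(cp, 0x3400, 0x4DBF)     // Extension A
        || inRange(cp, 0x4E00, 0x9FFF)     // URO
        || inRange(cp, 0x20000, 0x2A6DF)   // Extension B
        || inRange(cp, 0x2A700, 0x2F7FF)   // Extensions C to F and beyond
        || inRange(cp, 0x2FA20, 0x2FFFD)   // up to the end of Plane 2
        || inRange(cp, 0x30000, 0x3FFFD);  // Extension G and the rest of Plane 3
}

// Variation Selectors Supplement (VS17 to VS256).
inline bool isIdeographicVariationSelector(char32_t cp) {
    return inRange(cp, 0xE0100, 0xE01EF);
}

// True if the pair has the shape of an ideographic variation sequence.
// It does not check whether the sequence is actually registered in the IVD.
inline bool isIvsCandidate(char32_t base, char32_t selector) {
    return isIdeographicBase(base) && isIdeographicVariationSelector(selector);
}
```

Covering whole ranges in this way trades precision for longevity: unassigned code points are accepted as bases, but the check would not need updating each time a new extension block is added.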
The links below are the Unicode IVS/IVD description and data.

Description: https://www.unicode.org/reports/tr37/
Data: https://www.unicode.org/ivd/data/2022-09-13/

The Ideographic Variation Database consists of two data files. The first, IVD_Collections.txt, records the registered collections. The second, IVD_Sequences.txt, records the registered sequences.
https://www.unicode.org/ivd/data/2022-09-13/IVD_Collections.txt
https://www.unicode.org/ivd/data/2022-09-13/IVD_Sequences.txt

Korean IVS: KRName collection
https://www.unicode.org/ivd/data/2022-09-13/IVD_Charts_KRName.pdf

Japanese IVS: Moji_Joho collection
https://www.unicode.org/ivd/data/2022-09-13/IVD_Charts_Moji_Joho.pdf

For example, "𫟘󠄂" (U+2B7D8) is located in CJK Unified Ideographs Extension D.
https://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=%F0%AB%9F%98
It is also contained in the Moji_Joho collection document.
https://www.unicode.org/ivd/data/2022-09-13/IVD_Charts_Moji_Joho.pdf
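To check which base characters the registered sequences actually use, IVD_Sequences.txt can be read line by line. The sketch below assumes the layout described in UTS #37: three semicolon-separated fields (the base and selector code points in hex separated by a space, then the collection identifier, then the sequence identifier), with `#` starting a comment line. `IvdEntry` and `parseIvdSequenceLine` are hypothetical names introduced only for this example.

```cpp
#include <sstream>
#include <string>

// One registered sequence from IVD_Sequences.txt (hypothetical type).
struct IvdEntry {
    char32_t base;          // base ideograph
    char32_t selector;      // variation selector
    std::string collection; // e.g. a collection name such as Moji_Joho
    std::string identifier; // sequence identifier within the collection
};

// Strips leading and trailing ASCII whitespace.
static std::string trimAscii(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(" \t\r\n");
    if (first == std::string::npos)
        return std::string();
    const std::string::size_type last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

// Parses one line; returns false for blank lines, comments, and
// lines that do not match the assumed three-field layout.
bool parseIvdSequenceLine(const std::string& line, IvdEntry& out) {
    const std::string text = trimAscii(line);
    if (text.empty() || text[0] == '#')
        return false;

    const std::string::size_type semi1 = text.find(';');
    if (semi1 == std::string::npos)
        return false;
    const std::string::size_type semi2 = text.find(';', semi1 + 1);
    if (semi2 == std::string::npos)
        return false;

    // First field: two hexadecimal code points separated by a space.
    std::istringstream sequence(text.substr(0, semi1));
    unsigned long base = 0;
    unsigned long selector = 0;
    if (!(sequence >> std::hex >> base >> selector))
        return false;

    out.base = static_cast<char32_t>(base);
    out.selector = static_cast<char32_t>(selector);
    out.collection = trimAscii(text.substr(semi1 + 1, semi2 - semi1 - 1));
    out.identifier = trimAscii(text.substr(semi2 + 1));
    return true;
}
```

Feeding every parsed `base` into a block-range check would show how many registered sequences fall outside the base block and Extensions A and B.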
To be honest, I have a hard time understanding what the issue is here. Please give clear steps on how to reproduce the issue and what the expected result is, preferably attaching ODF files and screenshots if applicable.
@ DaeHyun Sung

1) Is the following part correct? They look the same.

>Actual Results: If the selected CJK font supports CJK Unified Ideographs Extension Block C to H, it can show.
>If the selected CJK font supports CJK Unified Ideographs Extension Block C to H, It can show.
>Expected Results: If the selected CJK font supports CJK Unified Ideographs Extension Block C to H, it can show.
>If the selected CJK font supports CJK Unified Ideographs Extension Block C to H, It can show.

2)
```
sudo apt install fonts-ipamj-mincho
```
Have you performed the above?