Description: The ordering of Korean Hangul syllables in i18npool/source/collator/data/ko_charset.txt is wrong, and some Hangul syllables are missing from the file.
Steps to Reproduce: Inspect the Hangul syllable rules in i18npool/source/collator/data/ko_charset.txt.
Actual Results: <가<각 <간<갇<갈<갉<갊<감<갑<값<갓<갔 <강<갖<갗<같<갚<갛<개<객<갠<갤 <갬<갭<갯<갰<갱<갸<갹<갼<걀<걋
Expected Results: <가<각<갂<갃<간<갅<갆<갇<갈<갉 <갊<갋<갌<갍<갎<갏<감<갑<값<갓 <갔<강<갖<갗<갘<같<갚<갛<개<객 <갞<갟<갠<갡<갢<갣<갤<갥<갦<갧 <갨<갩<갪<갫<갬<갭<갮<갯<갰<갱 <갲<갳<갴<갵<갶<갷<갸<갹<갺<갻 <갼<갽<갾<갿<걀<걁<걂<걃<걄<걅 <걆<걇<걈<걉<걊<걋
Reproducible: Always
User Profile Reset: No
Additional Info: The Hangul syllable ordering is already specified by the Unicode code chart for the Hangul Syllables block (range AC00–D7AF): https://unicode.org/charts/PDF/UAC00.pdf The file also does not include the Korean Hangul jamo (alphabet), and some Korean Hanja are missing.
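The gap between the actual and expected results can be checked mechanically. The following is a minimal sketch, assuming the expected order is plain Unicode code point order and that the rule text uses the `<`-separated form quoted in this report. The helper names `parse_rule`, `expected_syllables` and `missing_syllables` are hypothetical and are not part of LibreOffice or ICU.

```python
# Sketch: list the Hangul syllables that a collation rule excerpt skips,
# assuming the expected order is plain Unicode code point order.

def parse_rule(rule: str) -> list[str]:
    """Split a '<X<Y <Z' style rule excerpt into the characters it lists."""
    return [tok.strip() for tok in rule.split("<") if tok.strip()]

def expected_syllables(first: str, last: str) -> list[str]:
    """Every code point from first to last, inclusive, in code point order."""
    return [chr(cp) for cp in range(ord(first), ord(last) + 1)]

def missing_syllables(rule: str) -> list[str]:
    """Characters between the first and last listed entry that the rule omits."""
    listed = parse_rule(rule)
    present = set(listed)
    return [s for s in expected_syllables(listed[0], listed[-1]) if s not in present]

if __name__ == "__main__":
    # Excerpt copied from the "Actual Results" field of this report.
    actual = ("<가<각 <간<갇<갈<갉<갊<감<갑<값<갓<갔 <강<갖<갗<같<갚<갛<개<객<갠<갤 "
              "<갬<갭<갯<갰<갱<갸<갹<갼<걀<걋")
    missing = missing_syllables(actual)
    print(len(missing), "syllables missing, starting with", "".join(missing[:4]))
```

Running the same check over the whole file would show how many syllables outside this excerpt are affected.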
See also https://git.libreoffice.org/core/+/2d843bb104a3091a2ff2c7b4d5655f5fb1393a47 Looks like the file is only used for ICU < 53
I submitted a patch with the corrected Korean Hangul syllable ordering: https://gerrit.libreoffice.org/c/core/+/87018 However, it fixes only the Hangul Syllables range (AC00–D7AF, https://unicode.org/charts/PDF/UAC00.pdf). The file still does not include the Korean Hangul jamo (alphabet), and some Korean Hanja are still missing.
(In reply to Mike Kaganski from comment #1)
> See also
> https://git.libreoffice.org/core/+/2d843bb104a3091a2ff2c7b4d5655f5fb1393a47
>
> Looks like the file is only used for ICU < 53

I think the Korean collator text file used by the ICU < 53 code path originates from the KS X 1001 specification (formerly named KS C 5601). KS X 1001 supports only 2350 Hangul syllables, whereas Unicode and ICU have supported all 11172 Hangul syllables since Unicode 2.0. Users of builds with ICU < 53 also work with Unicode text, which can contain any of the 11172 syllables, yet this file covers only 2350 of them. So, in my opinion, the text file has to be changed for Korean users.
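The figure of 11172 follows from the way Unicode composes Hangul syllables out of 19 initial consonants, 21 vowels and 28 final positions (27 final consonants plus "no final"). The sketch below reproduces that arithmetic. The constant names follow the Unicode Standard's Hangul composition algorithm, while the `compose` helper is only illustrative. Note that the block is allocated as AC00–D7AF, but the assigned syllables end at D7A3.

```python
# Sketch of the Hangul syllable arithmetic from the Unicode Standard.
S_BASE = 0xAC00   # first precomposed syllable, U+AC00
L_COUNT = 19      # initial consonants
V_COUNT = 21      # vowels
T_COUNT = 28      # final consonants, including "no final consonant"

S_COUNT = L_COUNT * V_COUNT * T_COUNT   # number of precomposed syllables
S_LAST = S_BASE + S_COUNT - 1           # last assigned syllable

def compose(l_index: int, v_index: int, t_index: int) -> str:
    """Return the syllable for zero-based initial, vowel and final indices."""
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

KS_X_1001_SYLLABLES = 2350  # figure quoted in this report

if __name__ == "__main__":
    print(S_COUNT, hex(S_LAST))
    print(f"KS X 1001 covers about {KS_X_1001_SYLLABLES / S_COUNT:.0%} of them")
```

Because code point order is derived from the (initial, vowel, final) indices, iterating over U+AC00 through U+D7A3 visits the syllables in the order shown in the code chart.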
But as Mike said, in builds against ICU 53 or later the file is not used anymore, and we hope that ICU handles things correctly by now. The change may make sense when building against ICU 52 or earlier. If the file is also to be used with later and current ICU versions, it would need additional work. See i18npool/Library_collator_data.mk and the commit message of https://gerrit.libreoffice.org/plugins/gitiles/core/+/2d843bb104a3091a2ff2c7b4d5655f5fb1393a47%5E%21/
DaeHyun Sung committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/b3363960f97dcb7eaa10dfa708d71198a345924c fix Korean Hangul Syllable Character order tdf#130067 It will be available in 7.0.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
The issue is fixed, so this is resolved. https://git.libreoffice.org/core/commit/b3363960f97dcb7eaa10dfa708d71198a345924c fix Korean Hangul Syllable Character order tdf#130067 It will be available in 7.0.0.