126157 – Request for new Koreanic languages


Summary: Request for new Koreanic languages

Status:	RESOLVED NOTABUG

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Localization
Version: (earliest affected)	unspecified
Hardware:	All All

Importance:	medium enhancement
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	CJK CJK-Korean

Reported:	2019-06-29 12:01 UTC by DaeHyun Sung
Modified:	2020-07-04 11:26 UTC
CC List:	3 users

See Also:
Crash report or crash signature:


Description DaeHyun Sung 2019-06-29 12:01:37 UTC

Description:
LibreOffice currently supports only one Korean language code, ko-KR.


Korean is an official language in both Koreas (North and South) and in the Yanbian Korean Autonomous Prefecture [연변조선족자치주, 延边朝鲜族自治州/延邊朝鮮族自治州] and Changbai Korean Autonomous County [장백 조선족 자치현, 长白朝鲜族自治县/長白朝鮮族自治縣] of Jilin Province, China.



The standard regulators of the Korean language are listed below.

1. South Korea[대한민국, Republic of Korea]
국립국어원(國立國語院, National Institute of the Korean Language)
Homepage: https://www.korean.go.kr/

2. North Korea[조선민주주의인민공화국, Democratic People's Republic of Korea]
사회과학원 어학연구소(社會科學院 語學研究所, The Language Research Institute, Academy of Social Science)

3. Mainland China[People's Republic of China]
중국조선어규범위원회(中国朝鲜语规范委员会, China Korean Language Regulatory Commission)

The Korean language codes are listed below.

1. South Korea[대한민국, Republic of Korea]: ko-KR
2. North Korea[조선민주주의인민공화국, Democratic People's Republic of Korea]: ko-KP
3. Mainland China[People's Republic of China]: ko-CN

"ko-KR": "Korean - Republic of Korea"
"ko-KP": "Korean - Democratic People's Republic of Korea"
"ko-CN": "Korean - China"
Reference
Adobe CJK Type Blog
"GB 12052-89: PRC Standard For Korean" https://blogs.adobe.com/CCJKType/2014/12/gb12052.html

I think LibreOffice should support not only "ko-KR" (South Korea) but also "ko-KP" (North Korea) and "ko-CN" (Mainland China).
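
As a rough illustration of the request, the three tags can be split into a language subtag and a region subtag: they share the language subtag "ko" and differ only in the region. This is a minimal sketch, not LibreOffice code; the names KOREAN_TAGS and split_tag are made up for the example, and it handles only simple language-region tags like the ones listed in this report.

```python
# Hypothetical illustration; not taken from the LibreOffice source.
KOREAN_TAGS = {
    "ko-KR": "Korean - Republic of Korea",
    "ko-KP": "Korean - Democratic People's Republic of Korea",
    "ko-CN": "Korean - China",
}


def split_tag(tag):
    """Split a simple language-region tag such as "ko-KR" into (language, region)."""
    language, _, region = tag.partition("-")
    return language.lower(), region.upper()


for tag, name in KOREAN_TAGS.items():
    language, region = split_tag(tag)
    print(f"{tag}: language={language} region={region} ({name})")
```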






Actual Results:
Only the Korean language code "ko-KR" (South Korea) is supported.

Expected Results:
The Korean language codes "ko-KP" (North Korea) and "ko-CN" (Mainland China) are supported in addition to "ko-KR" (South Korea).


Reproducible: Always


User Profile Reset: No



Additional Info:

Comment 1 DaeHyun Sung 2019-06-29 13:42:49 UTC

An example of the "ko-KP" code:
The office suite of North Korea's Linux distribution, "RedStar OS" (붉은 별 운영체제), is based on OpenOffice.

I think it has added the Korean language code "ko-KP".

Link: https://www.fastcompany.com/3036046/what-its-like-to-use-north-koreas-red-star-os

Comment 2 DaeHyun Sung 2019-06-29 14:15:04 UTC

Microsoft products have already reserved the Korean language code "ko-KP".


https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-lcid/a9eac961-e77d-41a6-90a5-ce1a8b0cdb9c

Korean
1. Language tag: "ko"
Language ID: "0x0012"

2. Language tag: "ko-KR"
Location (or type): Korea
Language ID: "0x0412"

3. Language tag: "ko-KP"
Location (or type): North Korea
Language ID: "0x1000"

Comment 3 Julien Nabet 2019-11-10 16:21:14 UTC

To find the code that would need to change, I ran these commands:
$ git grep -n ko_KR
i18npool/Library_localedata_others.mk:81:       CustomTarget/i18npool/localedata/localedata_ko_KR \
i18npool/source/localedata/localedata.cxx:197:    { "ko_KR",  lcl_DATA_OTHERS },
lingucomponent/source/spellcheck/macosxspell/macspellimp.mm:204:                postspdict.push_back( @"ko_KR" );
sal/osl/unx/nlsupport.cxx:261:    { "5601",           RTL_TEXTENCODING_EUC_KR         }, /* ko_KR.EUC */
sal/osl/unx/nlsupport.cxx:343:    { "EUC-KR",                     RTL_TEXTENCODING_EUC_KR },      /* locale: ko_KR.euckr */
sal/osl/unx/nlsupport.cxx:656:    { "ko_KR.EUC",    RTL_TEXTENCODING_EUC_KR      },
sc/source/filter/oox/numberformatsbuffer.cxx:1539:static const BuiltinFormat spBuiltinFormats_ko_KR[] =
sc/source/filter/oox/numberformatsbuffer.cxx:1776:    { "ko-KR",  "*CJK",     spBuiltinFormats_ko_KR  },  // Korean, South Korea

$ git grep -n ko-KR
connectivity/source/drivers/hsqldb/HDriver.cxx:715:                "ko-KR", "Korean",
extras/CustomTarget_autocorr.mk:36:     ko:ko-KR \
extras/Package_autocorr.mk:36:  acor_ko-KR.dat \
i18nlangtag/source/isolang/MS-LCID.lst:165:0x0412 ko-KR
i18nlangtag/source/languagetag/languagetag.cxx:915:                // ko-KR) which was corrected in canonicalize() hence also in
offapi/com/sun/star/linguistic2/XNumberText.idl:64:        <li>ko-KR : South-Korean</li>
sc/source/filter/oox/numberformatsbuffer.cxx:1776:    { "ko-KR",  "*CJK",     spBuiltinFormats_ko_KR  },  // Korean, South Korea

$ find . -name "*ko_KR*"
./i18npool/source/localedata/data/ko_KR.xml

This may not be sufficient, but it could be a start.
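
The two greps above could also be combined into one pass that matches both spellings (ko_KR and ko-KR). The following is only a sketch in Python, assuming it is run from the root of a checkout; find_locale_mentions is a name made up here, and unlike git grep it walks every file on disk, including untracked ones and build output.

```python
import os
import re


def find_locale_mentions(root, language="ko", region="KR"):
    """Yield (path, line_number, line) for lines containing e.g. ko_KR or ko-KR."""
    pattern = re.compile(rf"{re.escape(language)}[_-]{re.escape(region)}")
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the Git metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # Unreadable file; skip it.
```

Calling list(find_locale_mentions(".")) collects the hits for ko_KR/ko-KR; passing region="KP" would look for the North Korean variant instead.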

Comment 4 Eike Rathke 2019-12-12 14:44:17 UTC

We already have a ko-KP mapping with the user-defined MS-LCID 0x8012.
(@Julien: the internal tool i18nlangtag/source/isolang/langid.pl reports known language/locale assignments, mappings, and support in the relevant places. It can be invoked, for example, as
SRC_ROOT=$(pwd) ./i18nlangtag/source/isolang/langid.pl ko-KP
or without arguments to get help.)
We just do not have a UI entry in the language list, because there is no UI translation, no locale data file, and no language tool that uses it.


(In reply to DaeHyun Sung from comment #2)
> Microsoft products already reserved Korean language code, "ko-KP".
> 3. Language tag: "ko-KP"
> Location (or type): North Korea
> Language ID: "0x1000"
No, they don't. LCID 0x1000 isn't an actual locale code; it's a "do it yourself, we don't care" code. See https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-lcid/926e694f-1797-4418-a922-343d1c5e91a6 (and maybe http://archives.miloush.net/michkap/archive/2015/03/04/8770668856267196302.html)
LCID 0x1000 in particular is not suitable for document interchange, because any content tagged with it may be treated completely arbitrarily, depending on what the machine processing the document takes 0x1000 to mean.
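
To make the difference between these values concrete, here is a small sketch that splits a 16-bit language ID into its primary language ID (low 10 bits) and sublanguage ID (high 6 bits), which is the layout Windows uses for language identifiers. The function name split_langid is made up for this illustration. It shows that 0x0412 and 0x8012 both carry the Korean primary ID 0x12, while 0x1000 has a primary ID of zero and so carries no language information by itself.

```python
def split_langid(langid):
    """Split a 16-bit language ID into (primary language ID, sublanguage ID)."""
    primary = langid & 0x3FF          # low 10 bits
    sublang = (langid >> 10) & 0x3F   # high 6 bits
    return primary, sublang


for langid in (0x0012, 0x0412, 0x8012, 0x1000):
    primary, sublang = split_langid(langid)
    print(f"0x{langid:04X}: primary=0x{primary:02X} sublang=0x{sublang:02X}")
# 0x0012: primary=0x12 sublang=0x00
# 0x0412: primary=0x12 sublang=0x01
# 0x8012: primary=0x12 sublang=0x20
# 0x1000: primary=0x00 sublang=0x04
```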

Comment 5 DaeHyun Sung 2020-07-04 11:26:47 UTC

(In reply to Eike Rathke from comment #4)
> We already have a ko-KP mapping with user defined MS-LCID 0x8012.
> (@Julien: there is the internal tool i18nlangtag/source/isolang/langid.pl to
> find out about known language/locale assignments, mappings and support in
> relevant places, the invocation could be, for example,
> SRC_ROOT=$(pwd) ./i18nlangtag/source/isolang/langid.pl ko-KP
> or invoke without arguments to get help).
> Just we do not have an UI entry for the language list as there is no UI
> translation and no locale data file or any language tools using it.
> 
> 
> (In reply to DaeHyun Sung from comment #2)
> > Microsoft products already reserved Korean language code, "ko-KP".
> > 3. Language tag: "ko-KP"
> > Location (or type): North Korea
> > Language ID: "0x1000"
> No, they don't. LCID 0x1000 isn't an actual locale code, it's some "do it
> yourself, we don't care" code. See
> https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-lcid/
> 926e694f-1797-4418-a922-343d1c5e91a6 (and maybe
> http://archives.miloush.net/michkap/archive/2015/03/04/8770668856267196302.
> html)
> LCID 0x1000 specifically is not suitable for document interchange as any
> content attributed by that may be treated completely arbitrary, depending on
> what the machine the document is processed on thinks 0x1000 might be or not.

I understand.
It is difficult for me, as a South Korean, to confirm the contribution activities of North Korean people, so I am changing the status to RESOLVED NOTABUG.


By the way, I submitted changes to the LibreOffice core repository to include USCRIPT_JAMO for Korean:
change USCRIPT_JAMO language value for Unicode
https://gerrit.libreoffice.org/c/core/+/93914
Hardcode script for "Noto" CJK fonts & add USCRIPT_JAMO
https://gerrit.libreoffice.org/c/core/+/97344