Created attachment 79782: patch for the core/ directory
Dear all,
I would like to add Tibetan to the list of available languages (I'm currently writing a hunspell dictionary). I made a git patch for the core/ directory (attached) that adds the locales and a few other small things.
I ran into one problem: in the files
oovbaapi/ooo/vba/word/WdLanguageID.idl
oovbaapi/ooo/vba/office/MsoLanguageID.idl
the constant associated with Tibetan has the value 1105, while it is 2121 in
l10ntools/source/ulfconv/msi-encodinglist.txt
Is that expected?
Also, I would like to be able to select Tibetan in the list under Tools->Options->Languages Settings->Languages->CTL. How should I add it? I wasn't able to find out how.
As with my previous bug 64926, I could not test it...
Thank you,
-- Elie
1105 is for China; 2121 is for Bhutan.
Oh, ok then, thank you! Do you think someone will review the patches I sent?
Hi Elie - thanks for your work - this is great :-) The best way to get patches merged is to mail the developer list or push them to gerrit, but this is almost as good. Andras - any chance you could look into this? Thanks for contributing!
Thank you! Thinking about this again, I remember that I modified /i18npool/source/nativenumber/data/numberchar.h by adding lines for Tibetan, but I don't really know how this file is used, so the change might well have no effect... It should be easy to improve for people who know the code, though. I've started https://github.com/eroux/tibetan-spellchecker; the .dic is dumb right now, but tomorrow or Monday it will get a very complete list of words. Once it is more stable, I'll file a bug report for its inclusion (probably in a few weeks).
Elie Roux committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=0114e1c85bc42fd3bd2a3d0aa33f77f67093b66b fdo#64977 Adding Tibetan Language Support The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Fridrich Strba committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=fceb821c473112727520c0952607f8377b62f417 Revert "fdo#64977 Adding Tibetan Language Support"
Created attachment 80507: corrected bo_CN.xml file. This is the corrected bo_CN.xml file (see comments).
Created attachment 80508: corrected bo_IN.xml file
Created attachment 80509: bo_charset.txt file (same as dzongkha)
Dear all,
I just saw that the patch was accepted and then reverted. Re-reading it, I realize there were several mistakes in it:
- in the locale files, some of the CURRENCY formats were wrong
- in the locale files, the weekdays followed the Dzongkha calendar, whereas they should be shifted by one
- the patch for numberchar.h was stupid
So I have attached the corrected files. Should I propose them on gerrit?
Also, as the bo_charset.txt file is a copy of dz_charset, is there a way to just make the locale files point to dz_charset?
Thank you,
-- Elie
Do I understand correctly that you submitted the new version to gerrit as <https://gerrit.libreoffice.org/#/c/4197/>?
Absolutely, I'm sorry I didn't link to it from here! Thank you,
one for Eike to track I think :-)
The numbers under oovbaapi/ are a very limited subset used only by the VBA compatibility API. The actual MS-LangIDs used by LibreOffice are in include/i18nlangtag/lang.h, and the mappings to ISO codes are in i18nlangtag/source/isolang/isolang.cxx
See also https://wiki.documentfoundation.org/LibreOffice_Localization_Guide/Adding_a_New_Language_or_Locale
The bo 2121 in l10ntools/source/ulfconv/msi-encodinglist.txt seems to be wrong and should probably be 1105 for bo-CN; for bo-BT it could be 2129. 0x0849 (2121) is actually not related to Tibetan at all but is assigned to Tamil Sri Lanka, ta-LK. However, there is some confusion in the MS assignments of "Tibetan" in their IDs: 0x0851 (2129) is used for both Tibetan_Bhutan and Dzongkha, see https://issues.apache.org/ooo/show_bug.cgi?id=40713 and https://issues.apache.org/ooo/show_bug.cgi?id=53497 (unfortunately not reachable at the time of writing, so I couldn't look up details).
To support both bo-CN and bo-IN we need to add a LangID for bo-IN. I'll try to sort things out and prepare things so that your patch can be applied.
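For reference, the decimal and hexadecimal values mentioned in the comment above can be cross-checked with a short script. This is only an illustrative sketch: `split_langid` is a made-up helper name, not anything from the LibreOffice sources, and the code assumes the usual Windows LANGID layout (low 10 bits for the primary language, upper 6 bits for the sublanguage).

```python
# Illustrative sketch only; split_langid is a hypothetical helper, not LibreOffice code.
# Assumption: Windows LANGID layout, low 10 bits = primary language,
# upper 6 bits = sublanguage.
from typing import Tuple


def split_langid(langid: int) -> Tuple[int, int]:
    """Return (primary_language, sublanguage) for a 16-bit LANGID."""
    return langid & 0x3FF, langid >> 10


# Decimal values quoted in this thread.
for value in (1105, 2121, 2129):
    primary, sub = split_langid(value)
    print(f"{value} = 0x{value:04X}: primary=0x{primary:02X}, sublanguage=0x{sub:02X}")
```

Under that assumption, 1105 (0x0451) and 2129 (0x0851) share the primary language 0x51 and differ only in the sublanguage, whereas 2121 (0x0849) has a different primary language, 0x49, which is consistent with the remark that 2121 is not a Tibetan ID at all.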
Oh, thank you *very much* for the pointer to the documentation, that's what I was missing! There is an error in the doc: i18nlangtag/inc/i18nlangtag/lang.h should be replaced by include/i18nlangtag/lang.h. Also, taking a quick look, I wonder whether Tibetan and Dzongkha should appear in MsLangId::needsSequenceChecking of mslangid.cxx... What does it do exactly? Apart from this, the only file that seems to need changes is langtab.src... but shouldn't Dzongkha be added to this file too? Should I provide the patch for langtab.src? Thank you!
Taking a closer look: the only change langtab.src would need is to replace
< "Tibetan (PR China)" ; LANGUAGE_TIBETAN ; > ;
by
< "Tibetan" ; LANGUAGE_TIBETAN ; > ;
as Tibetan is also spoken in India (several regions including Sikkim, Ladakh and Zanskar), Nepal, Bhutan, etc.
Also, why isn't Tibetan in the list of Complex scripts in Options->Languages Parameters->Languages? How is this list generated?
"Also, taking a look quickly, I wonder if Tibetan and Dzongkha should appear in MsLangId::needsSequenceChecking of mslangid.cxx?... what does it do exactly?" Probably not. Sequence checking is where "invalid" character combinations are rejected when the user inputs them. Thai and Khmer are the classic cases. Best to default to avoiding marking a language as needing sequence checking unless there's a strong reason otherwise.
(In reply to comment #15)
> There is an error in the doc:
> i18nlangtag/inc/i18nlangtag/lang.h
> should be replaced by
> include/i18nlangtag/lang.h
Already corrected, thanks to Andras :-) The header files were recently moved.
> Apart from this, the only file that seems to need changes is langtab.src...
> but shouldn't Dzongkha be added to this file too?
It is there, line 216:
< "Dzongkha" ; LANGUAGE_DZONGKHA ; > ;
(In reply to comment #16)
> the only change langtab.src would need is to replace
> < "Tibetan (PR China)" ; LANGUAGE_TIBETAN ; > ;
> by
> < "Tibetan" ; LANGUAGE_TIBETAN ; > ;
No, that entry needs to stay as is, but an additional entry will be needed:
< "Tibetan (India)" ; LANGUAGE_TIBETAN_INDIA ; > ;
The LANGUAGE_TIBETAN_INDIA constant needs to be added to lang.h first, along with the ISO code mapping; that is what I referred to earlier as needing to be sorted out, and I'll do that.
> Also, why isn't Tibetan in the list of Complex scripts in Options->Languages
> Parameters->Languages?
The languages listed there for the default document language list boxes appear only if they are fully supported, i.e. locale data exists. Currently you can see Tibetan only in the character attribution dialog, e.g. in Writer under Format->Character->Font, in the CTL Font list.
> How is this list generated?
Classification is obtained from MsLangId::getScriptType() in i18nlangtag/source/isolang/mslangid.cxx
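The rule described in the comment above (a language shows up in a default document language list box only if it has the right script classification and locale data exists) can be modelled roughly as follows. This is a toy sketch, not LibreOffice code: the `ScriptType` enum, the `SCRIPT_TYPE` table, its contents and `default_language_choices` are all assumptions made for illustration, with the table merely standing in for what MsLangId::getScriptType() returns.

```python
# Toy model only; every name and table entry here is an assumption for illustration.
from enum import Enum


class ScriptType(Enum):
    LATIN = 1
    ASIAN = 2
    COMPLEX = 3


# Hypothetical stand-in for the classification done by MsLangId::getScriptType().
SCRIPT_TYPE = {
    "bo-CN": ScriptType.COMPLEX,
    "dz-BT": ScriptType.COMPLEX,
    "th-TH": ScriptType.COMPLEX,
    "en-US": ScriptType.LATIN,
}


def default_language_choices(script, locales_with_data):
    """List languages of the given script type that also have locale data."""
    return sorted(
        tag
        for tag, script_type in SCRIPT_TYPE.items()
        if script_type is script and tag in locales_with_data
    )


# Without Tibetan locale data, bo-CN is classified as complex but not offered.
before = default_language_choices(ScriptType.COMPLEX, {"dz-BT", "th-TH", "en-US"})
# Once locale data for bo-CN exists, it appears in the list.
after = default_language_choices(ScriptType.COMPLEX, {"bo-CN", "dz-BT", "th-TH", "en-US"})
print(before, after)
```

In this model, adding the bo_CN locale file is what moves Tibetan into the list; the script classification alone is not enough.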
http://cgit.freedesktop.org/libreoffice/core/commit/?id=ad3105a2933aff80b8fd471d32c0846440a508c5 Adds the necessary LangID and mapping for bo-IN, wrong fdo# in commit summary though (number of the OOo issue mentioned above) so it didn't show up automatically in this bug, don't worry ...
Elie Roux committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=c56f9b76693d0b7f43234afb58796338dcd52489 fdo#64977 Adding Tibetan Language Support