Bug 81714 - Other: The new "Language Tag" feature in LO 4.3 works only with non-CTL/CJK scripts
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Localization
Version: 4.3.0.3 rc (earliest affected)
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords:
Depends on:
Blocks: RTL-CTL
Reported: 2014-07-24 12:05 UTC by EricP
Modified: 2018-10-01 08:17 UTC (History)

See Also:
Crash report or crash signature:


Attachments
ZIP containing test ODT and screenshots from LOv4304. (145.85 KB, application/zip)
2014-08-24 08:43 UTC, Owen Genat (retired)
Screenshots demonstrating red underlines on Mnong text marked as Khmer (132.24 KB, application/zip)
2014-08-25 08:18 UTC, EricP

Description EricP 2014-07-24 12:05:53 UTC
The new "Language Tag" feature introduced in LibreOffice 4.3 does not work with CTL scripts.

For example, Central Mnong [cmo] is written in Vietnam with a Latin script and in Cambodia with a Khmer-based script.

I can mark a selection of text in Latin script as cmo-Latn-VN and it appears to work fine. An attempt to mark a section of text in Khmer script as cmo-Khmr-KH has no effect. Instead it remains stubbornly marked as "Khmer."
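For readers unfamiliar with the tag layout, here is a minimal C++ sketch of how tags such as cmo-Latn-VN and cmo-Khmr-KH break down into language, script, and region subtags. It assumes only the simple language[-Script][-REGION] shape used in this report; `TagParts` and `splitTag` are hypothetical names made up for illustration and are not LibreOffice APIs.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper type for illustration; not part of LibreOffice.
struct TagParts {
    std::string language;
    std::string script;  // empty if the tag has no script subtag
    std::string region;  // empty if the tag has no region subtag
};

// True if s is non-empty and consists only of ASCII letters.
static bool allAlpha(const std::string& s) {
    for (unsigned char c : s)
        if (!std::isalpha(c)) return false;
    return !s.empty();
}

// True if s is non-empty and consists only of ASCII digits.
static bool allDigit(const std::string& s) {
    for (unsigned char c : s)
        if (!std::isdigit(c)) return false;
    return !s.empty();
}

// Splits a simple language[-Script][-REGION] tag.
// Subtag roles are inferred from length and character class:
// a script is 4 letters, a region is 2 letters or 3 digits.
TagParts splitTag(const std::string& tag) {
    TagParts parts;
    std::istringstream in(tag);
    std::string sub;
    if (!std::getline(in, parts.language, '-'))
        return parts;  // empty input: all fields stay empty
    while (std::getline(in, sub, '-')) {
        if (parts.script.empty() && parts.region.empty()
            && sub.size() == 4 && allAlpha(sub)) {
            parts.script = sub;
        } else if (parts.region.empty()
                   && ((sub.size() == 2 && allAlpha(sub))
                       || (sub.size() == 3 && allDigit(sub)))) {
            parts.region = sub;
        } else {
            break;  // variants, extensions, etc. are ignored in this sketch
        }
    }
    return parts;
}
```

With this sketch, `splitTag("cmo-Khmr-KH")` gives language `cmo`, script `Khmr`, and region `KH`, while `splitTag("cmo-Latn-VN")` gives script `Latn` and region `VN`. The script subtag is the only part that distinguishes the CTL case from the Latin one.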

The release notes may indicate that this feature does not yet support CTL languages, though the wording is somewhat ambiguous:
https://wiki.documentfoundation.org/ReleaseNotes/4.3#Adding_a_new_language_tag

Operating System: All
Version: 4.3.0.3 rc
Comment 1 Owen Genat (retired) 2014-08-24 08:43:58 UTC
Created attachment 105187 [details]
ZIP containing test ODT and screenshots from LOv4304.

Confirmed under GNU/Linux using:

- v4.3.0.4 Build ID: 62ad5818884a2fc2e5780dd45466868d41009ec0
- v4.4.0.0.alpha0+ Build ID: e379401618268ed7f7f5885a36b90e1f4f6cd4af TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-08-18_05:51:03

As I understand this report, the BCP47 language tag (cmo-Khmr-KH) for Khmer script is not reflected in the status bar for the current selection (it shows "Khmer" instead). For Latin script marked "cmo-Latn-VN", the tag is reflected in the status bar. Refer to the attached screenshots.

The SIL Khmer Mondulkiri font used in the sample is available at:

http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=mondulkiri
Comment 2 Owen Genat (retired) 2014-08-24 08:45:24 UTC
As per comment 1, status set to NEW.
Comment 3 EricP 2014-08-25 08:16:30 UTC
Yes, Owen, you understood my report correctly, which encouraged me to investigate things a bit further (OS X 10.9.4 / LO 4.3.0.4).

It appears that for any text written in a CTL script, the status bar always displays the name of whichever language is selected in the CTL Font section of the Character Format dialog.

So, if a section of text in Khmer script is marked as cmo-Khmr-KH, but the CTL Font is specified as Arabic, then the status bar indicates “Arabic”.

It does not seem that this is simply an issue with the status bar. If a Khmer spell checker is installed, then it will show red underlines beneath Central Mnong text marked as cmo-Khmr-KH but with Khmer selected as the CTL Font (Khmer.png). If, however, Arabic (for which no spelling dictionary is installed) is selected, then red underlines are not displayed (Arabic.png).
Comment 4 EricP 2014-08-25 08:18:32 UTC
Created attachment 105223 [details]
Screenshots demonstrating red underlines on Mnong text marked as Khmer
Comment 5 martin_hosken 2014-11-07 09:30:18 UTC
I think what is needed is to change cui/source/inc/chardlg.hxx such that

  SvxLanguageBox*     m_pEastFontLanguageLB;
  SvxLanguageBox*     m_pCTLFontLanguageLB;

become:

  SvxLanguageComboBox*     m_pEastFontLanguageLB;
  SvxLanguageComboBox*     m_pCTLFontLanguageLB;
Comment 6 Eike Rathke 2014-11-07 12:09:03 UTC
There is not yet any way to flag an arbitrary language tag as CTL or CJK. The existing predefined CTL/CJK tags, or rather their corresponding LCID values, occur in various switch cases where they are handled differently. Merely changing the mentioned language boxes to SvxLanguageComboBox will not help. This needs further implementation.
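To illustrate the kind of limitation described in this comment, here is a hedged sketch of a switch over predefined LCID values. It is not LibreOffice's actual code: `LcidScript`, `scriptForLcid`, and `kMadeUpTagId` are hypothetical names, and only the LCID constants themselves (for example 0x0453 for km-KH) are real Microsoft values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification for illustration; not a LibreOffice type.
enum class LcidScript { Western, Asian, Complex };

// Placeholder ID standing in for a user-typed tag such as cmo-Khmr-KH.
// The value is made up; it matters only that it is not listed in the switch.
constexpr std::uint16_t kMadeUpTagId = 0xFFEE;

// Classifies a language by matching its LCID against hard-coded cases.
LcidScript scriptForLcid(std::uint16_t lcid) {
    switch (lcid) {
        case 0x0411:  // ja-JP
        case 0x0412:  // ko-KR
        case 0x0804:  // zh-CN
            return LcidScript::Asian;
        case 0x0401:  // ar-SA
        case 0x040D:  // he-IL
        case 0x041E:  // th-TH
        case 0x0453:  // km-KH
            return LcidScript::Complex;
        default:
            // Anything not listed above is treated as Western.
            return LcidScript::Western;
    }
}
```

In this sketch `scriptForLcid(0x0453)` is classified as Complex, but `scriptForLcid(kMadeUpTagId)` falls through to Western, because a switch of this shape has no way to learn that a new tag belongs to a CTL script.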
Comment 7 martin_hosken 2016-10-21 14:07:52 UTC
Where is the flagging needed?

I don't see anywhere in the code that uses MSLangId to decide whether text is CTL or not. All calls to getScriptType seem to be to break iterators now, and they all use Unicode code points instead. Of course if we did have such instances, perhaps we could move the call over to the languageTag instead and then we could use the script component and give a useful answer there too.
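A rough sketch of what "use the script component" could look like is given below. It assumes the script subtag has already been extracted from the tag and is in canonical ISO 15924 title case. `ScriptClass`, `classifyScript`, and the small lookup table are hypothetical and incomplete, and the grouping of scripts into Western, Asian, and Complex is an assumption for illustration rather than LibreOffice's actual mapping.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical classification for illustration; not a LibreOffice type.
enum class ScriptClass { Western, Asian, Complex, Unknown };

// Maps an ISO 15924 script subtag (e.g. "Khmr") to a script class.
// The table is a small illustrative sample, not an exhaustive list.
ScriptClass classifyScript(const std::string& script) {
    static const std::unordered_map<std::string, ScriptClass> table = {
        {"Latn", ScriptClass::Western},
        {"Cyrl", ScriptClass::Western},
        {"Grek", ScriptClass::Western},
        {"Hani", ScriptClass::Asian},
        {"Hans", ScriptClass::Asian},
        {"Hant", ScriptClass::Asian},
        {"Jpan", ScriptClass::Asian},
        {"Kore", ScriptClass::Asian},
        {"Arab", ScriptClass::Complex},
        {"Hebr", ScriptClass::Complex},
        {"Thai", ScriptClass::Complex},
        {"Khmr", ScriptClass::Complex},
        {"Deva", ScriptClass::Complex},
    };
    auto it = table.find(script);
    return it == table.end() ? ScriptClass::Unknown : it->second;
}
```

Under these assumptions, `classifyScript("Khmr")` returns Complex and `classifyScript("Latn")` returns Western, so cmo-Khmr-KH and cmo-Latn-VN would be told apart by their script subtags alone, with no LCID involved.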
Comment 8 Eyal Rozenberg 2018-09-30 21:41:41 UTC
>  mark ... cmo-Latn-VN ... cmo-Khmr-KH ...

None of these strings are options available in the "Language" combo box (as of v6.2.0.0-alpha). How are we supposed to (try and) make these markings?
Comment 9 EricP 2018-10-01 08:17:07 UTC
(In reply to Eyal Rozenberg from comment #8) 
> None of these strings are options available in the "Language" combo box (as
> of v6.2.0.0-alpha). How are we supposed to (try and) make these markings?

You should be able to type directly into the "Language" combo box, as shown at 
https://wiki.documentfoundation.org/ReleaseNotes/4.3#Adding_a_new_language_tag.

This bug still exists in LO v6.0.