Writing systems in LibreOffice and ODF are divided into three disjoint script categories. During layout and rendering, LibreOffice must determine which category each character belongs to in order to apply the correct formatting.

We currently assign characters to script types using a hard-coded algorithm. A recurring issue is that our algorithm sometimes guesses wrong. This usually happens with characters like punctuation, which may be used for different purposes across languages within the same document. We currently don't have any way for users to override our algorithm in these cases.

While it wouldn't solve all problems with script assignment, a good start would be to support the style:script-type attribute. Per section 20.358 of the ODF 1.3 specification, "[c]onsumers that can determine script types of Unicode characters may also evaluate the attribute and overwrite the script type they determine for certain character with the value of the attribute".

Although this attribute was introduced in ODF 1.0, it was never implemented. Doing so would create a workaround for a significant number of our script assignment issues, and may also unblock some OOXML interop (e.g. w:hint).
Isn't an unimplemented ODF feature a bug rather than an enhancement?
A couple of notes for the benefit of people who don't know what style:script-type is (like myself until very recently) and may be confused.

Note 1: In the context of this bug and related bugs (but not everywhere), here is some relevant term rewriting:

When people say     they mean the same as if they had said
---------------------------------------------------------------
language            script (i.e. a distinctive writing system, based on a
                    repertoire of specific elements or symbols, or that
                    repertoire itself; brief Wikipedia definition)
written language    script
language group      script type
script group        script type
script category     script type

(so, we're not talking about languages in the usual sense of the word)

Note 2: Several aspects of character styles and direct formatting (DF) are specific to one of the script groups mentioned by Jonathan in comment #0. An example would be the font family. There are three ODF attributes for that: fo:font-family, style:font-family-asian and style:font-family-complex, and we can in fact see all three of their values in the "Format > Character..." dialog (assuming full RTL-CTL and CJK support has been enabled). But then, which of these font families is actually used for a given character? The one for the script group which LO determines the character belongs to.

And this is where style:script-type comes in. It has one possible value for each of the three script groups: latin, complex (i.e. RTL-CTL), asian (and a fourth value we won't get into here). If it is set, LO should treat the relevant text as being in that script group and apply the script-group-specific attributes to it; if style:script-type is not set, LO can fall back to the heuristic algorithm it uses now.

Except, like Jonathan says, this is not what we do. We simply ignore style:script-type and always go for the heuristic. This is not formally a bug, since the spec says that we _may_ use it if we want to; it's not a hard requirement. But it's quite problematic, as can be deduced by reading bug 148257.
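To make the intended lookup concrete, here is a minimal Python sketch of the behavior described above. It is not LibreOffice code: guess_script_group() is a toy stand-in for the real heuristic, font_family_for() is a hypothetical helper, and the key names (latin, asian, complex) are assumed to match the attribute's values in the ODF specification. Any other value is assumed to fall back to the heuristic.

```python
# Script group -> the ODF attribute holding the font family for that group.
FONT_ATTR = {
    "latin": "fo:font-family",
    "asian": "style:font-family-asian",
    "complex": "style:font-family-complex",
}


def guess_script_group(ch):
    """Toy stand-in for the heuristic; NOT LibreOffice's actual algorithm."""
    cp = ord(ch)
    if 0x0590 <= cp <= 0x08FF:  # Hebrew, Arabic and neighbouring blocks
        return "complex"
    if (0x3040 <= cp <= 0x30FF      # Hiragana, Katakana
            or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs
            or 0xAC00 <= cp <= 0xD7A3):  # Hangul syllables
        return "asian"
    return "latin"


def font_family_for(ch, props, script_type=None):
    """Pick the font family for ch (hypothetical helper).

    If script_type names one of the three script groups, it overrides the
    guess; otherwise (unset, or any other value) the heuristic decides.
    """
    if script_type in FONT_ATTR:
        group = script_type
    else:
        group = guess_script_group(ch)
    return props[FONT_ATTR[group]]


# Example character properties (font names are illustrative only).
props = {
    "fo:font-family": "Liberation Serif",
    "style:font-family-asian": "Noto Serif CJK SC",
    "style:font-family-complex": "Noto Sans Hebrew",
}
```

With these properties, font_family_for("!", props) takes the Latin font because the toy heuristic classifies punctuation as latin, while font_family_for("!", props, "complex") takes the complex-script font. That second call is the override which is currently missing.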
Bug 148257 is about the user being able to set the script; and this attribute "clinches" it, since we can already set, albeit in a crooked fashion, the choice of script within each of the script groups; if we also set the script group - we've set the language.
(In reply to Jonathan Clark from comment #0)
> Although this attribute was introduced in ODF 1.0, it was never implemented.
> Doing so would create a workaround for a significant number of our script
> assignment issues, and may also unblock some OOXML interop (e.g. w:hint).

This also needs some research into current implementations for such characters in LibreOffice, and into some documents published by the W3C Internationalization (I18n) Activity (https://www.w3.org/International/) and the Unicode Consortium.
(In reply to Volga from comment #3)

I _think_ I disagree with your comment, but perhaps I'm just misunderstanding it.

> This also needs some research into current implementations for such
> characters in LibreOffice

What do you mean by "implementations of characters"?

> and into some documents published by the W3C
> Internationalization (I18n) Activity (https://www.w3.org/International/) and
> the Unicode Consortium.

Which documents? And why would we need to consult these documents before adding support for script-type?