Bug 151906 - If documents have a "document language", allow viewing and setting it
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 7.4.1.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL: https://docs.oasis-open.org/office/Op...
Whiteboard:
Keywords:
Depends on:
Blocks: Styles Languages
Reported: 2022-11-04 16:24 UTC by Eyal Rozenberg
Modified: 2023-12-06 23:19 UTC
Description Eyal Rozenberg 2022-11-04 16:24:00 UTC
Do (Writer) documents have a "document language"?

* If you consider the LO options, Language Settings > Languages, you'll notice you can set a "default language" for each of the supported language groups (Western, RTL-CTL, Asian).
* ... but there is no setting of the default language group.
* If you look for a language setting in the document properties or default page style - you won't find it.

I don't know what the ODF spec defines, but assuming that a default document language is defined, there needs to be UI both to view it and to set it. And that means both the language and the choice of language group.

If a document does _not_ have a language property, then the options dialog should clarify what exactly the user si
Comment 1 Eyal Rozenberg 2022-11-04 16:44:40 UTC
If a document does _not_ have a language property, then the options dialog should clarify what exactly setting the "default language" means.
Comment 2 m_a_riosv 2022-11-05 01:35:42 UTC
https://help.libreoffice.org/latest/en-US/text/shared/guide/language_select.html?DbPAR=SHARED#bm_id3083278

In fact, the document language is the default from which the Default Paragraph Style and Character Style inherit their font language. Changing the document language changes their font language too, unless another language has been selected by hand.

The status bar shows the language at the cursor position.
Comment 3 Regina Henschel 2022-11-05 02:17:53 UTC
(In reply to Eyal Rozenberg from comment #0)
> Do (Writer) documents have a "document language"?
...
> 
> I don't know what the ODF spec defines

Relevant attributes in ODF 1.3 are: fo:language (20.202), fo:country (20.188), fo:script (20.222) for Latin scripts, style:country-asian (20.256), style:language-asian (20.302), style:script-asian (20.356) for East Asian scripts and style:country-complex (20.257), style:language-complex (20.303), style:script-complex (20.357) for RTL scripts. In addition the style:script-type (20.358) attribute may be used.

These attributes belong to a style:text-properties (16.29.29) element. Such an element is primarily used in a <style:style> (16.2) element of family text, which corresponds to character styles in the UI. It can also be used in a <style:style> element of other family types; those are consulted if a property is not defined in the character style. As a last step, it can be used in a <style:default-style>, which is consulted if the property was not found in any of the other elements mentioned.
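
The fallback order described in this paragraph can be pictured as a chain of lookups. The following is only a minimal sketch: the three dictionaries and their values are hypothetical stand-ins for parsed style:text-properties elements, and real ODF style resolution (parent styles, automatic styles) has more steps than are shown here.

```python
from collections import ChainMap

# Hypothetical property sets, ordered from most to least specific.
character_style = {}  # <style:style style:family="text">
paragraph_style = {"fo:language": "de", "fo:country": "AT"}  # <style:style> of another family
default_style = {  # <style:default-style>
    "fo:language": "en",
    "fo:country": "US",
    "style:language-complex": "he",
    "style:country-complex": "IL",
}

# ChainMap searches its mappings in order and returns the first match.
effective = ChainMap(character_style, paragraph_style, default_style)


def resolve(prop):
    """Return the first value found along the fallback chain, or None."""
    return effective.get(prop)
```

Here `resolve("fo:language")` is answered by the paragraph style, while `resolve("style:language-complex")` falls through to the default style.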

If you set a language in Tools>Options, that setting goes into the style:text-properties elements in the <style:default-style> elements. So you can consider using the <style:default-style> elements as a way to set a "document" language. If you do not set a dedicated language in Tools>Options, but use the "Default - ..." item, then no language is written to the <style:default-style> elements and it depends on the environment of the LibreOffice installation in which the document is opened, which language is considered as "document" language.
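
To check whether a given file actually carries these attributes, one could read them from the styles.xml part of the ODF package. This is a rough sketch using only the Python standard library; the function names are made up for illustration, and it assumes a zipped .odt package rather than a flat ODF file.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}

# Language and country attributes for the Western, Asian and complex groups.
LANGUAGE_ATTRS = [
    ("fo", "language"), ("fo", "country"),
    ("style", "language-asian"), ("style", "country-asian"),
    ("style", "language-complex"), ("style", "country-complex"),
]


def default_style_languages(styles_xml):
    """Map each default-style family to the language attributes it carries."""
    root = ET.fromstring(styles_xml)
    result = {}
    for default in root.iter("{%s}default-style" % NS["style"]):
        family = default.get("{%s}family" % NS["style"])
        props = default.find("style:text-properties", NS)
        if props is None:
            continue  # this default style has no text properties at all
        found = {}
        for prefix, name in LANGUAGE_ATTRS:
            value = props.get("{%s}%s" % (NS[prefix], name))
            if value is not None:
                found["%s:%s" % (prefix, name)] = value
        result[family] = found
    return result


def languages_in_odt(path):
    """Read styles.xml from a zipped ODF package and inspect its default styles."""
    with zipfile.ZipFile(path) as odt:
        return default_style_languages(odt.read("styles.xml"))
```

An empty dictionary for a family would suggest that no dedicated language was written for it, which matches the "Default - ..." case described above.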
Comment 4 Eyal Rozenberg 2022-11-05 09:13:27 UTC
(In reply to Regina Henschel from comment #3)
> (In reply to Eyal Rozenberg from comment #0)
> > Do (Writer) documents have a "document language"?
> ...
> > 
> > I don't know what the ODF spec defines
> 
> Relevant attributes in ODF 1.3 are: fo:language (20.202), fo:country
> (20.188), fo:script(20.222) for Latin scripts, style:country-asian (20.256),
> style:language-asian (20.302), style:script-asian (20.356) for East Asian
> scripts and style:country-complex (20.257), style:language-complex (20.303),
> style:script-complex (20.357) for RTL scripts. In addition the
> style:script-type (20.358) attribute may be used.

... so, no attribute saying which of the language groups is "chosen".

> These attributes belong to a style:text-properties (16.29.29) element. Such
> element is primarily used in a <style:style> (16.2) element of family text.
> Such corresponds to character styles in the UI. In addition it can be used
> in a <style:style> element of other family types. Those are used, if a
> property is not defined in the character style. And as last step it can be
> used in a <style:default-style>. That is used if the property was not found
> in the other mentioned elements.
> 
> If you set a language in Tools>Options, that setting goes into the
> style:text-properties elements in the <style:default-style> elements. So you
> can consider using the <style:default-style> elements as a way to set a
> "document" language. If you do not set a dedicated language in
> Tools>Options, but use the "Default - ..." item, then no language is written
> to the <style:default-style> elements and it depends on the environment of
> the LibreOffice installation in which the document is opened, which language
> is considered as "document" language.

But - once a document has been created, with its <style:default-style>, the settings in Tools | Options no longer affect it, right?

Also, in the UI, if we edit the "Default Paragraph Style"'s character aspects, I'm guessing that will affect the document's <style:default-style>?
Comment 5 m_a_riosv 2022-11-05 10:32:02 UTC
(In reply to Eyal Rozenberg from comment #4)
> (In reply to Regina Henschel from comment #3)
> > (In reply to Eyal Rozenberg from comment #0)
> > .....
> 
> But - once a document has been created, with its <style:default-style>, the
> settings in Tools | Options no longer affect it, right?

Have you tested? It does for me.
Comment 6 Eyal Rozenberg 2022-11-05 11:30:02 UTC
(In reply to m.a.riosv from comment #5)
> Have you tested? It does for me.

Yes, you're right. Filed bug 151918.
Comment 7 Regina Henschel 2022-11-05 12:19:49 UTC
(In reply to Eyal Rozenberg from comment #4)
> (In reply to Regina Henschel from comment #3)
> > Relevant attributes in ODF 1.3 are: fo:language (20.202), fo:country
> > (20.188), fo:script(20.222) for Latin scripts, style:country-asian (20.256),
> > style:language-asian (20.302), style:script-asian (20.356) for East Asian
> > scripts and style:country-complex (20.257), style:language-complex (20.303),
> > style:script-complex (20.357) for RTL scripts. In addition the
> > style:script-type (20.358) attribute may be used.
> 
> ... so, no attribute saying which of the language groups is "chosen".

The chosen script type depends on the Unicode code points of the characters. LibreOffice uses the ICU library, https://icu.unicode.org/.
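
As an illustration of the idea that the script of each character selects the language attribute, here is a deliberately crude sketch. It is not the ICU algorithm and not what LibreOffice does internally: it guesses the group from the Unicode character name, and the prefix lists are an incomplete, hypothetical selection.

```python
import unicodedata

# Incomplete, illustrative prefix lists; real script detection is far richer.
ASIAN_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "HANGUL")
COMPLEX_PREFIXES = ("HEBREW", "ARABIC", "DEVANAGARI", "THAI")

LANGUAGE_ATTRIBUTE = {
    "latin": "fo:language",
    "asian": "style:language-asian",
    "complex": "style:language-complex",
}


def script_group(ch):
    """Guess the script group of one character from its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith(ASIAN_PREFIXES):
        return "asian"
    if name.startswith(COMPLEX_PREFIXES):
        return "complex"
    return "latin"  # everything else is treated as Western in this sketch


def language_attribute_for(ch):
    """Name the ODF attribute that would supply the language for this character."""
    return LANGUAGE_ATTRIBUTE[script_group(ch)]
```

With this heuristic, a Latin letter maps to fo:language, a CJK ideograph to style:language-asian, and a Hebrew or Arabic letter to style:language-complex.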

> 
> Also, in the UI, if we edit the "Default Paragraph Style"'s character
> aspects, I'm guessing that will affect the document's <style:default-style>?

No, it affects the paragraph style "Standard". But when you use Tools>Language>'For All Text', that affects the <style:default-style> elements.
Comment 8 Heiko Tietze 2022-11-09 09:41:40 UTC
Per Tools > Options you define some attributes for the "Default Paragraph Style" (and derived PS). I see no issue with that.
Comment 9 Eyal Rozenberg 2022-11-22 22:50:16 UTC
(In reply to Heiko Tietze from comment #8)
> Per Tools > Options you define some attributes for the "Default Paragraph
> Style" (and derived PS). I see no issue with that.

I don't understand your comment.

In bug 151020, you wrote that:

> the "document language" coming from tools > options is 
> applied to the Default PS and subordinate PS, and should
> be used for the category labels.

so, in particular, you claim that there is such a thing as a document language. But like I said in my opening comment - such a setting cannot be found in the LO UI.

So is there, or isn't there, a document language?
Comment 10 Heiko Tietze 2022-11-23 07:50:18 UTC
There is no viewable Document Language in terms of an attribute that goes into the document. You define the font name and size via T>O for the Default PS (unless you load a template). The workflow is straight-forward, and showing settings in T>O in the UI is pointless. => NAB
Comment 11 Eyal Rozenberg 2022-11-23 08:14:36 UTC
(In reply to Heiko Tietze from comment #10)
> There is no viewable Document Language in terms of an attribute that goes
> into the document. 

Is there a non-viewable document language? If not, I'll close this issue and ask you to clarify your comment on the other bug.
Comment 12 Regina Henschel 2022-11-23 13:07:56 UTC
There exists not 1 language but 3 languages, one for script LATIN, one for script ASIAN and one for script COMPLEX.
Which language is chosen depends on the script of the Unicode character.

The attributes are
fo:language 20.202
style:language-asian 20.302
style:language-complex 20.303

Because the language attributes may be ignored unless a country is specified at the same time, you also need
fo:country 20.188
style:country-asian 20.256
style:country-complex 20.257
LibreOffice has the language bundled with the country. You can distinguish between German(German) and German(Austria), for example.

These go into the
<style:text-properties> 16.29.29
within a
<style:default-style> 16.4
It is allowed for several values of attribute family, e.g. for family="paragraph".
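
To make the structure concrete, the element described above could be assembled like this. It is a sketch only: the helper name and the example languages are invented for illustration, and it builds a standalone element rather than editing a real styles.xml.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO_NS = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"

# Serialize with the conventional prefixes instead of ns0/ns1.
ET.register_namespace("style", STYLE_NS)
ET.register_namespace("fo", FO_NS)


def build_default_style(family, western, asian, complex_):
    """Build a <style:default-style>; each language argument is a
    (language, country) pair such as ("de", "AT")."""
    default = ET.Element(
        f"{{{STYLE_NS}}}default-style",
        {f"{{{STYLE_NS}}}family": family},
    )
    props = ET.SubElement(default, f"{{{STYLE_NS}}}text-properties")
    pairs = {
        f"{{{FO_NS}}}language": western[0],
        f"{{{FO_NS}}}country": western[1],
        f"{{{STYLE_NS}}}language-asian": asian[0],
        f"{{{STYLE_NS}}}country-asian": asian[1],
        f"{{{STYLE_NS}}}language-complex": complex_[0],
        f"{{{STYLE_NS}}}country-complex": complex_[1],
    }
    for name, value in pairs.items():
        props.set(name, value)
    return default


element = build_default_style("paragraph", ("de", "AT"), ("ja", "JP"), ("he", "IL"))
print(ET.tostring(element, encoding="unicode"))
```

The same helper could be called once per family value that a document uses.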

ODF 1.3 recommends writing <style:default-style> elements for every family that is actually used in the document. But it is not mandatory, and LibreOffice does not write them in all cases.

These six attributes are set in Tools > Options. You can force writing the default styles by touching the fields, which means changing the content and maybe changing it back. If you want to be sure that the default styles are written, you should use a document template which has them.

I can think of an option "Always store default language settings to document". Having them in the document resolves ambiguity which might arise in world-wide exchange of documents.
Comment 13 V Stuart Foote 2022-11-23 16:51:33 UTC
There is utility to establishing ISO 15836 compliant dc:language in the meta.xml for the ODF.
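
For reference, whether a given file already carries a dc:language value can be checked by reading the meta.xml part of the package. A minimal sketch, assuming a zipped ODF package; the function names are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def dc_language(meta_xml):
    """Return the text of <dc:language> in meta.xml, or None if absent or empty."""
    root = ET.fromstring(meta_xml)  # root is <office:document-meta>
    element = root.find("office:meta/dc:language", NS)
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()


def dc_language_in_odt(path):
    """Read meta.xml from a zipped ODF package and return its dc:language."""
    with zipfile.ZipFile(path) as odt:
        return dc_language(odt.read("meta.xml"))
```

A result of None means the package does not state a document-level language in its metadata.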

IMHO best location in UI would be on one of the tabs of the Properties dialog (File -> Properties).

The initial value on creation should be taken from the user profile (Tools -> Options -> Language Settings -> Languages), or from the template if set: either from the locale or from the user's UI language. It should not be drawn from the Western/CTL/CJK settings by default.

Exposing it as Dublin Core in the Properties dialog should make it possible both to identify the value and to edit it.

A change made in the Properties dialog should then propagate into the appropriate styles.xml.

The language widget on the Status bar should continue to refer to the current paragraph or selection, while the dc: entry in Properties refers to the document as a whole.

The See Also bug 39937 is germane to the need for, and the workflow of, setting a language for the document for accessibility.
Comment 14 V Stuart Foote 2022-11-23 17:09:00 UTC
(In reply to V Stuart Foote from comment #13)
> There is utility to establishing ISO 15836 compliant dc:language in the
> meta.xml for the ODF.

See ODF 1.3 4.2.3.15
Comment 15 Eyal Rozenberg 2022-11-25 15:03:09 UTC
(In reply to Regina Henschel from comment #12)
> There exists not 1 language but 3 languages, one for script LATIN, one for
> script ASIAN and one for script COMPLEX.
> Which language is chosen depends on the script of the Unicode character.


(In reply to V Stuart Foote from comment #13)
> There is utility to establishing ISO 15836 compliant dc:language in the
> meta.xml for the ODF.

So, there is such a thing as a document language, but we don't respect it or set it and the UI doesn't refer to it. And it's not what Heiko was talking about in bug 151020. And bug 151020 doesn't make sense, because we have no known "document language" to choose for the captions. Great...

I officially declare this language business to be a hot mess.
Comment 16 Dieter 2023-12-06 08:22:01 UTC
(In reply to Regina Henschel from comment #12)
> I can think of an option "Always store default language settings to
> document". Having them in the document resolves ambiguity which might arise
> in world-wide exchange of documents.

(In reply to V Stuart Foote from comment #13)
> There is utility to establishing ISO 15836 compliant dc:language in the
> meta.xml for the ODF.
> The see also bug 39937 is germane as to need/utility and workflow of setting
> a language for the document for accessibility.

Heiko, I'm not sure that I've understood all the details, but if you take into account the comments mentioned above, isn't it a valid enhancement request?
Comment 17 Heiko Tietze 2023-12-06 10:21:17 UTC
(In reply to Dieter from comment #16)
> if you take into account the comments mentioned above, isn't it a valid enhancement request?
Apparently it is possible, but to me the above is tech-talk, and I see no use case that is solved by such a setting. It would rather add confusion. But anyway, the majority has spoken.
Comment 18 Eyal Rozenberg 2023-12-06 21:35:30 UTC
(In reply to Heiko Tietze from comment #17)
> Apparently it is possible, but to me the above is tech-talk, and I see
> no use case that is solved by such a setting. It would rather add confusion.
> But anyway, the majority has spoken.

Well, I'm not sure what the majority has said, actually. Remember this bug is "If X then Y", not "Y".

There is some tension between LO being a suite of apps people use for their daily tasks, and being an ODF file editor - and this attribute of documents may be a point at which this tension is felt:

* There officially is such a thing as an ODF document's language, so an ODF editor should show you what it is and let you set it.

* It's not clear whether a "document language" is a useful notion to burden the user with, so perhaps the UI should gloss over it.

Let's try to resolve this tension somehow. Maybe... by examining the rationale in the ODF standard for the "document language" property.
Comment 19 Eyal Rozenberg 2023-12-06 21:36:53 UTC
(In reply to V Stuart Foote from comment #14)
> See ODF 1.3 4.2.3.15

I looked at ODF 1.3, and there is no § 4.2.3.15 there!

https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html

Stuart?
Comment 20 V Stuart Foote 2023-12-06 22:59:14 UTC
(In reply to Eyal Rozenberg from comment #19)
> ...
> I looked at ODF 1.3, and there is no § 4.2.3.15 there!


It's there in the link given, "4.3.2.15 <dc:language>"
Comment 21 Eyal Rozenberg 2023-12-06 23:19:36 UTC
(In reply to V Stuart Foote from comment #20)
> It's there in the link given, "4.3.2.15 <dc:language>"

Your earlier comment said 4.2.3.15. :-(

Anyway, the section doesn't provide any sort of rationale, nor do other references to dc:language. Too bad.