Bug 119723 - Terms Used for Formatting Marks are different to the Terms Used in Unicode Standard or Wikipedia
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation
Version: 3.4.5 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Depends on:
Blocks: Formatting-Mark Help-Changes-Features
Reported: 2018-09-06 11:54 UTC by Harald Koester
Modified: 2019-08-14 13:11 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:
Regression By:

ODT inserting formatting characters and a screenshot with German LO GUI (27.91 KB, application/vnd.oasis.opendocument.text)
2019-08-03 18:44 UTC, Svante Schubert

Description Harald Koester 2018-09-06 11:54:17 UTC
This bug report is a continuation of bug 45588. I will close 45588 when this new bug is confirmed. Bug 107447 is a subset of this bug, so I will mark 107447 as a duplicate of this bug.

There are 7 different formatting marks which can be inserted into documents via the menu (Insert > Formatting Mark). A few of the terms used in that menu differ from those in the Unicode Standard.

Unicode:                        Used in LibreOffice:
No-break space (U+00A0)         Non-breaking space (not OK)
Non-breaking hyphen (U+2011)    Non-breaking hyphen (OK)
Soft hyphen (U+00AD)            Soft hyphen (OK)
Zero width non-joiner (U+200C)  No-width optional break* (not OK)
Zero width joiner (U+200D)      No-width no break (not OK)
Left-to-right mark (U+200E)     Left-to-right mark (OK)
Right-to-left mark (U+200F)     Right-to-left mark (OK)

(*) The character “No-width optional break” (U+FEFF) is deprecated in the Unicode standard.
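
The official character names in the table can be checked against the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `official_name` and the list `CODE_POINTS` are illustrative only and not part of LibreOffice.

```python
import unicodedata

# Code points discussed in this report. Note that the Right-to-left mark
# is U+200F; U+200E is the Left-to-right mark.
CODE_POINTS = [0x00A0, 0x2011, 0x00AD, 0x200C, 0x200D, 0x200E, 0x200F, 0xFEFF]


def official_name(code_point: int) -> str:
    """Return the Unicode character name for a code point."""
    return unicodedata.name(chr(code_point))


if __name__ == "__main__":
    for cp in CODE_POINTS:
        # Print each code point in U+XXXX notation next to its official name.
        print(f"U+{cp:04X}  {official_name(cp)}")
```

The names are returned in upper case, as they appear in the Unicode Standard, so a comparison with menu labels has to ignore case.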

Expected: Use of Unicode terms. Also in options dialogue (Writer menu: Tools > Options… > LibreOffice Writer > Formatting Aids) the correct terms should be used.

Official translations of the Unicode Standard into other languages don't exist, so it is unclear which terms should be used in localisations of LibreOffice. I think the best solution is to use the terms used in Wikipedia, where they exist.

In German these terms are used in the Wikipedia:

Unicode:                        German:
No-break space (U+00A0)         Geschütztes Leerzeichen (OK)
Non-breaking hyphen (U+2011)    Geschützter Bindestrich (OK)
Soft hyphen (U+00AD)            Weiches Trennzeichen (OK)
Zero width non-joiner (U+200C)  Bindehemmer (not OK)
Zero width joiner (U+200D)      Breitenloser Verbinder (not OK)
Left-to-right mark (U+200E)     Links-nach-rechts-Zeichen (not OK)
Right-to-left mark (U+200F)     Rechts-nach-links-Zeichen (not OK)

Expected: The German localisation should use the Wikipedia terms in the menu (Einfügen > Formatierungszeichen).

The Help should also use the Unicode terms or, for localisations, the terms in Wikipedia. I found the following help pages where the terms are not correct.

English help:

(1) Page “Inserting Non-breaking Spaces, Hyphens and Soft Hyphens”

(2) Page “Formatting Aids”

(3) Page “Formatting Mark”

(4) Page “Formatting aids”

German help: 

Page “Formatierungszeichen”
Comment 1 Harald Koester 2018-09-06 11:58:12 UTC
*** Bug 107447 has been marked as a duplicate of this bug. ***
Comment 2 خالد حسني 2018-09-07 12:04:20 UTC

*** This bug has been marked as a duplicate of bug 107447 ***
Comment 3 Harald Koester 2018-09-10 20:55:44 UTC
According to Khaled Hosny in bug 107447, LibreOffice does not use the characters 'Zero width non-joiner' and 'Zero width joiner'. The correct mapping is:

Unicode:                        Used in LibreOffice:
...                             ... 
ZERO WIDTH SPACE (U+200B)       No-width optional break* (not OK)
WORD JOINER (U+2060)            No-width no break (not OK)
...                             ...

The respective German terms are:

...                             ... 
ZERO WIDTH SPACE (U+200B)       Breitenloses Leerzeichen (not OK)
WORD JOINER (U+2060)            Wortverbinder (not OK)
...                             ...
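
The OK / not OK judgement for the English labels can be reproduced mechanically with the corrected code points. This is only a sketch: the dictionary `MENU_LABELS` repeats the labels quoted in this report, and the function names are hypothetical, not LibreOffice code.

```python
import unicodedata

# English menu labels as quoted in this report, paired with the code points
# from the corrected mapping (U+200B and U+2060 instead of U+200C and U+200D).
MENU_LABELS = {
    "Non-breaking space": 0x00A0,
    "Non-breaking hyphen": 0x2011,
    "Soft hyphen": 0x00AD,
    "No-width optional break": 0x200B,
    "No-width no break": 0x2060,
    "Left-to-right mark": 0x200E,
    "Right-to-left mark": 0x200F,
}


def matches_unicode_name(label: str, code_point: int) -> bool:
    """True if the label equals the official Unicode name, ignoring case."""
    return label.upper() == unicodedata.name(chr(code_point))


def check_labels(labels: dict) -> dict:
    """Map each label to True (OK) or False (not OK)."""
    return {label: matches_unicode_name(label, cp) for label, cp in labels.items()}


if __name__ == "__main__":
    for label, ok in check_labels(MENU_LABELS).items():
        print(f"{label:<26}{'OK' if ok else 'not OK'}")
```

A check like this only covers the English labels, because the Unicode Character Database contains no German names.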


(a) It may be wise to add the old terms in brackets after the new terms, since AFAIK Microsoft uses the same old terms 'No-width optional break' and 'No-width no break'.

(b) In the Special Characters dialogue, none of the terms used as the caption of a character's preview are localised. I'm not sure whether these captions should be localised, because thousands of terms would have to be translated.
Comment 4 Harald Koester 2018-09-10 21:08:10 UTC
*** Bug 107447 has been marked as a duplicate of this bug. ***
Comment 5 Harald Koester 2018-09-11 10:40:02 UTC
I found some more pages in the help where wrong terms are used. This is currently the complete list:

English pages:

(a) Page “Inserting Non-breaking Spaces, Hyphens and Soft Hyphens”
Wrong: Non-breaking space, correct: No-Break Space

(b) Page “Formatting Aids”
Wrong: Non-breaking spaces, correct: No-Break Space

(c) Page “Formatting Mark”
Wrong: Non-breaking space, No-width optional break, No-width no break
Correct: No-Break Space, Zero Width Space, Word Joiner

(d) Page “Preventing Hyphenation of Specific Words”
Wrong: No-width no break. Correct: Word Joiner

(e) Page “Hyphenation”:
Inserting a Soft Hyphen is described as manual hyphenation.

German pages:

(f) Page “Ausschalten der Silbentrennung für bestimmte Wörter”:
Wrong: Verbindungszeichen ohne Breite. Correct according to Wikipedia: Wortverbinder

(g) Page “Formatierungszeichen”
Wrong: Leerzeichen ohne Breite, Verbindungszeichen ohne Breite, Links nach rechts-Markierung, Rechts nach links Markierung
Correct according to Wikipedia: Breitenloses Leerzeichen, Wortverbinder, Links-nach-rechts-Zeichen, Rechts-nach-links-Zeichen
Comment 6 Svante Schubert 2019-08-03 18:44:21 UTC
Created attachment 153120
ODT inserting formatting characters and a screenshot with German LO GUI

    • 'zero-width' should not be laid out like a comma
    • 'zero-width' should be named instead in German 'Breitenloses Leerzeichen'
    • 'word joiner' should be named instead in German 'Nullbreite'

by LO Version: (x64)
Build-ID: 5896ab1714085361c45cf540f76f60673dd96a72
CPU threads: 8; OS: Windows 10.0; UI render: default;
Locale: de-DE (de_DE); Calc: CL

Screenshots within the document! :-)

Comment 7 Harald Koester 2019-08-14 12:32:53 UTC
(In reply to Svante Schubert from comment #6)

>     • 'zero-width' should not be laid out like a comma
>     • 'zero-width' should be named instead in German 'Breitenloses
> Leerzeichen'
I suppose you mean the formatting mark "Zero Width Space" (U+200B). In this case the term "Breitenloses Leerzeichen" is correct. Currently LibreOffice (version 6.3.0) uses the term "No-width Optional Break" in English and "Weicher Umbruch ohne Breite" in German. Neither is correct.

>     • 'word joiner' should be named instead in German 'Nullbreite'
For the "Word Joiner" (U+2060) LibreOffice currently (version 6.3.0) uses the terms "No-width no Break" in English and "Verbindungszeichen ohne Breite" in German. In the German Wikipedia there is no separate article for the Word Joiner. But in 
the term "Wortverbinder" is used. Hence I proposed this term in comment 3. On the Internet I did not find the term "Nullbreite" as a translation of "Word Joiner". Hence I still propose to use the term "Wortverbinder".