Bug 124108 - auto-correction of typographic quotation marks and apostrophes broken for fr_CI
Summary: auto-correction of typographic quotation marks and apostrophes broken for fr_CI
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Localization
Version: 4.3.0.4 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:6.4.0 target:6.3.1
Keywords:
Depends on:
Blocks: AutoCorrect-Complete
 
Reported: 2019-03-15 20:40 UTC by sommerluk
Modified: 2019-10-04 13:05 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Description sommerluk 2019-03-15 20:40:21 UTC
Description:
Auto-correction of typographic quotation marks and apostrophes is broken for fr_CI

Steps to Reproduce:
In the character properties, set the text language to “French (Ivory Coast)”. Then, type this in your document:

"C'est un test."

Actual Results:
No auto-correction at all. The typed text remains unchanged.

Expected Results:
An auto-correction identical to the auto-correction of “French (France)”: typographic (French) quotation marks are used together with a no-break space, and a typographic apostrophe is used. The text is changed to:

« C’est un test. »


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 V Stuart Foote 2019-03-15 21:04:52 UTC
I'm not set up for this locale, so I'm not confirming. But looking at the source, the Côte d'Ivoire fr_CI localedata [1] seems a bit thin and is modeled on Burkina Faso fr_BF [2] for some reason. With no Markers stanza of its own, where should it pick up the French guillemets? I'd assume from the "fr" LangID, but according to the OP that doesn't seem to happen.

=-ref-=

[1] https://opengrok.libreoffice.org/xref/core/i18npool/source/localedata/data/fr_CI.xml

[2] https://opengrok.libreoffice.org/xref/core/i18npool/source/localedata/data/fr_BF.xml
Comment 2 Xisco Faulí 2019-04-16 09:21:34 UTC
*** Bug 122640 has been marked as a duplicate of this bug. ***
Comment 3 Xisco Faulí 2019-04-16 09:24:41 UTC
@Eike, any opinion here ?
Comment 4 Xisco Faulí 2019-07-05 10:23:31 UTC
@Sophie, I thought you might be interested in this issue...
Comment 5 Julien Nabet 2019-08-09 14:56:42 UTC
I just wonder if it's OK to have unoid="generic" together with ref=...
I mean, reading the content of these files, it seems that when there's unoid="generic" you must declare the details of LC_CTYPE, and when there's ref=... you rely entirely on the reference (except if you added some replace functions).

grepping code, I got 7 cases:
fr_BJ.xml:22:  <LC_CTYPE unoid="generic" ref="fr_BF" />
fr_CI.xml:22:  <LC_CTYPE unoid="generic" ref="fr_BF" />
fr_ML.xml:22:  <LC_CTYPE unoid="generic" ref="fr_BF" />
fr_NE.xml:22:  <LC_CTYPE unoid="generic" ref="fr_BF" />
fr_SN.xml:22:  <LC_CTYPE unoid="generic" ref="fr_BF" />
fr_TG.xml:22:  <LC_CTYPE unoid="generic" ref="fr_BF" />
pap_BQ.xml:22:  <LC_CTYPE unoid="generic" ref="pap_CW" />

Eike: any thoughts?
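
For reference, the grep above can be reproduced with a short script. This is only a sketch: the function name `find_ctype_refs` is made up for illustration, and the directory is assumed to be a checkout of i18npool/source/localedata/data. It lists every locale file whose <LC_CTYPE> element carries a ref= attribute.

```python
import re
from pathlib import Path

# Matches an <LC_CTYPE ...> start tag that carries a ref="..." attribute.
LC_CTYPE_REF = re.compile(r'<LC_CTYPE\b[^>]*\bref="([^"]+)"')


def find_ctype_refs(data_dir):
    """Return {file name: referenced locale} for files whose LC_CTYPE uses ref=."""
    refs = {}
    for path in sorted(Path(data_dir).glob("*.xml")):
        match = LC_CTYPE_REF.search(path.read_text(encoding="utf-8"))
        if match:
            refs[path.name] = match.group(1)
    return refs


if __name__ == "__main__":
    # Hypothetical path; adjust it to your own checkout.
    for name, ref in find_ctype_refs("i18npool/source/localedata/data").items():
        print(f"{name}: LC_CTYPE ref={ref}")
```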
Comment 6 Eike Rathke 2019-08-12 17:14:01 UTC
The unoid="generic" is irrelevant and ignored. The only thing important here is the ref="fr_BF", which was added with commit f8408481819795517b2fae82b644f82b93ffd963 as the original commit 03ee242870502c41b1db8af50a580bf24bdb3d1f for bug 79348 was a complete duplication of fr_BF data.

fr_BF.xml in <LC_CTYPE> defines

    <Markers>
      <QuotationStart>'</QuotationStart>
      <QuotationEnd>'</QuotationEnd>
      <DoubleQuotationStart>"</DoubleQuotationStart>
      <DoubleQuotationEnd>"</DoubleQuotationEnd>
    </Markers>

which are U+0027 APOSTROPHE and U+0022 QUOTATION MARK, so it's no wonder there's no replacement. It would probably be worth investigating whether all those French locale data files with ref="fr_BF" should use something different as quotation marks. If they all use the same different characters, change that in fr_BF.xml; otherwise individual changes are needed.

As I have no idea what the language habits in those countries are I leave the definition up to someone familiar with it.
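
To illustrate why those markers produce no visible change, here is a toy model of quote replacement. It is not LibreOffice's actual AutoCorrect code: the function `autocorrect_quotes` and its apostrophe heuristic are assumptions made for this sketch. The point is only that replacing U+0022 with U+0022 and U+0027 with U+0027 leaves the text as typed.

```python
def autocorrect_quotes(text, markers):
    """Replace typewriter quotes with the locale's markers (toy model)."""
    out = []
    double_open = False
    for i, ch in enumerate(text):
        if ch == '"':
            # Alternate between the opening and the closing double quote.
            out.append(markers["DoubleQuotationEnd"] if double_open
                       else markers["DoubleQuotationStart"])
            double_open = not double_open
        elif ch == "'":
            # Assumption: a quote right after a letter or digit is an
            # apostrophe or closing quote, otherwise an opening quote.
            after_word = i > 0 and text[i - 1].isalnum()
            out.append(markers["QuotationEnd"] if after_word
                       else markers["QuotationStart"])
        else:
            out.append(ch)
    return "".join(out)


# Markers as defined in fr_BF.xml before the fix (plain ASCII quotes).
OLD_FR_BF = {
    "QuotationStart": "'", "QuotationEnd": "'",
    "DoubleQuotationStart": '"', "DoubleQuotationEnd": '"',
}
# Typographic markers: U+2018, U+2019, U+00AB, U+00BB.
TYPOGRAPHIC = {
    "QuotationStart": "\u2018", "QuotationEnd": "\u2019",
    "DoubleQuotationStart": "\u00ab", "DoubleQuotationEnd": "\u00bb",
}

typed = "\"C'est un test.\""
print(autocorrect_quotes(typed, OLD_FR_BF) == typed)  # identity replacement
print(autocorrect_quotes(typed, TYPOGRAPHIC))
```

Note that this model only swaps single code points; it inserts no no-break spaces.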
Comment 7 Julien Nabet 2019-08-12 20:37:32 UTC
Thank you Eike for your feedback.

David/sommerluk: do all these countries:
- fr_BF (Burkina Faso)
- fr_BJ (Bénin)
- fr_CI (Côte d'Ivoire)
- fr_ML (Mali)
- fr_NE (Niger)
- fr_SN (Sénégal)
- fr_TG (Togo)
have same quotes/double quotes, I mean:
‘ :for QuotationStart and QuotationEnd
« : for DoubleQuotationStart
» : for DoubleQuotationEnd
?
Replacing these items in i18npool/source/localedata/data/fr_BF.xml is quick to do, but I wouldn't like to cause confusion.
Comment 8 sophie 2019-08-13 07:43:10 UTC
I'll investigate on my side too.
Comment 9 sommerluk 2019-08-13 16:25:45 UTC
To open a double quotation mark use the following sequence:

Either:
– U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
– U+00A0 NO-BREAK SPACE [NBSP]

Or:
– U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
– U+202F NARROW NO-BREAK SPACE [NNBSP]

To close a double quotation mark use the following sequence:

Either:
– U+00A0 NO-BREAK SPACE [NBSP]
– U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK

Or:
– U+202F NARROW NO-BREAK SPACE [NNBSP]
– U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK

This should be valid in all the countries you have listed.
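
The opening and closing sequences listed above can be written down as a small helper. This is a minimal sketch; the name `french_quote` and the `narrow` parameter are invented here for illustration.

```python
NBSP = "\u00a0"        # NO-BREAK SPACE
NNBSP = "\u202f"       # NARROW NO-BREAK SPACE
OPEN_QUOTE = "\u00ab"   # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
CLOSE_QUOTE = "\u00bb"  # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK


def french_quote(text, narrow=False):
    """Wrap text in guillemets with a no-break space on the inside."""
    space = NNBSP if narrow else NBSP
    return f"{OPEN_QUOTE}{space}{text}{space}{CLOSE_QUOTE}"


# Show the code points of the opening sequence for both variants.
for narrow in (False, True):
    quoted = french_quote("C\u2019est un test.", narrow=narrow)
    print([f"U+{ord(ch):04X}" for ch in quoted[:2]])
```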

> ‘ :for QuotationStart and QuotationEnd

If this is about second level citations or quotation marks, then: No, this is wrong. There are different opinions about what second level quotation marks should look like in French typography, but none of these opinions is about using U+0027 APOSTROPHE. It might be either “like this” or ‹ like this ›, see also https://fr.wikipedia.org/wiki/Guillemet#Double_ou_triple_niveau_de_citation. Also LibreOffice’s fr_FR.xml seems a little bit strange in this sense.

Anyway, for the normal apostrophe usage, U+2019 RIGHT SINGLE QUOTATION MARK is always the right choice.
Comment 10 Julien Nabet 2019-08-13 18:28:01 UTC
>...
> Either:
> – U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
> – U+00A0 NO-BREAK SPACE [NBSP]
>...
Just for information, it seems the no-break spaces aren't managed by the quoted xml files.

Also, I don't think LO manages second level quotations but since I'm not an expert at all, I may be wrong.
Comment 11 sommerluk 2019-08-13 18:48:57 UTC
> Just for information, it seems the no-break spaces aren't managed by the quoted xml files.

Yes, apparently. For fr_FR it seems that the XML file has simply U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK and U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK, nevertheless when using LibreOffice Writer a U+00A0 NO-BREAK SPACE [NBSP] is correctly added at the interior (after opening and before closing quotation mark).
Comment 12 Eike Rathke 2019-08-14 17:59:54 UTC
fr_FR locale data has

 ‘ U+2018 LEFT SINGLE QUOTATION MARK
 ’ U+2019 RIGHT SINGLE QUOTATION MARK
 « U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
 » U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK

which have been in use since 2005. Given comment 9 I'll use the same for fr_BF and its derived locales.
Comment 13 Eike Rathke 2019-08-14 18:11:34 UTC
On the other hand, https://en.wikipedia.org/wiki/Quotation_mark#French (same as the French site but better readable for me ;-) for single quotes lists

 ‹ U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
 › U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK

so I wonder why French speakers didn't request those so far.
Comment 14 Eike Rathke 2019-08-14 18:29:23 UTC
Significantly (now listed with fr_BF as well) we use the same for all fr_* locales where defined:

fr_BE.xml:      <QuotationStart>‘</QuotationStart>
fr_BE.xml:      <QuotationEnd>’</QuotationEnd>
fr_BE.xml:      <DoubleQuotationStart>«</DoubleQuotationStart>
fr_BE.xml:      <DoubleQuotationEnd>»</DoubleQuotationEnd>
fr_BF.xml:      <QuotationStart>‘</QuotationStart>
fr_BF.xml:      <QuotationEnd>’</QuotationEnd>
fr_BF.xml:      <DoubleQuotationStart>«</DoubleQuotationStart>
fr_BF.xml:      <DoubleQuotationEnd>»</DoubleQuotationEnd>
fr_CA.xml:      <QuotationStart>‘</QuotationStart>
fr_CA.xml:      <QuotationEnd>’</QuotationEnd>
fr_CA.xml:      <DoubleQuotationStart>«</DoubleQuotationStart>
fr_CA.xml:      <DoubleQuotationEnd>»</DoubleQuotationEnd>
fr_CH.xml:      <QuotationStart>‘</QuotationStart>
fr_CH.xml:      <QuotationEnd>’</QuotationEnd>
fr_CH.xml:      <DoubleQuotationStart>«</DoubleQuotationStart>
fr_CH.xml:      <DoubleQuotationEnd>»</DoubleQuotationEnd>
fr_FR.xml:      <QuotationStart>‘</QuotationStart>
fr_FR.xml:      <QuotationEnd>’</QuotationEnd>
fr_FR.xml:      <DoubleQuotationStart>«</DoubleQuotationStart>
fr_FR.xml:      <DoubleQuotationEnd>»</DoubleQuotationEnd>
fr_LU.xml:      <QuotationStart>‘</QuotationStart>
fr_LU.xml:      <QuotationEnd>’</QuotationEnd>
fr_LU.xml:      <DoubleQuotationStart>«</DoubleQuotationStart>
fr_LU.xml:      <DoubleQuotationEnd>»</DoubleQuotationEnd>
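
A listing like this can also be checked programmatically. The sketch below assumes the <Markers> layout quoted in comment 6; the helper names `read_markers` and `describe` are made up for illustration. It extracts the four markers from a locale XML string and reports their code points.

```python
import xml.etree.ElementTree as ET

MARKER_TAGS = (
    "QuotationStart", "QuotationEnd",
    "DoubleQuotationStart", "DoubleQuotationEnd",
)


def read_markers(xml_text):
    """Return the four quotation markers, or None if there is no <Markers>."""
    root = ET.fromstring(xml_text)
    markers = root.find(".//Markers")
    if markers is None:
        return None  # e.g. a locale that only has <LC_CTYPE ref="..."/>
    return {tag: markers.findtext(tag) for tag in MARKER_TAGS}


def describe(markers):
    """Map each marker to its code point, assuming one character per marker."""
    return {tag: f"U+{ord(ch):04X}" for tag, ch in markers.items()}


# Reduced sample document; real locale files contain many more elements.
sample = (
    "<Locale><LC_CTYPE><Markers>"
    "<QuotationStart>\u2018</QuotationStart>"
    "<QuotationEnd>\u2019</QuotationEnd>"
    "<DoubleQuotationStart>\u00ab</DoubleQuotationStart>"
    "<DoubleQuotationEnd>\u00bb</DoubleQuotationEnd>"
    "</Markers></LC_CTYPE></Locale>"
)
print(describe(read_markers(sample)))
```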
Comment 15 Julien Nabet 2019-08-14 18:49:27 UTC
(In reply to Eike Rathke from comment #13)
> On the other hand, https://en.wikipedia.org/wiki/Quotation_mark#French (same
> as the French site but better readable for me ;-) for single quotes lists
> 
>  ‹ U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
>  › U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
> 
> so I wonder why French speakers didn't request those so far.
I never saw this kind of quotation in a French book but I'm not a big reader so I can't tell.

Sophie: reading http://monsu.desiderio.free.fr/atelier/chevrons.html, these seem used for "Français romand".
Also I noticed that “ ” (U+201C and U+201D) don't seem to be usable in LO (but perhaps I missed them).
Comment 16 Commit Notification 2019-08-14 20:50:53 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/21bffad6bc108f3dc5eee608bf702412a5fcb530%5E%21

Resolves: tdf#124108 localized quotation marks for [fr-BF] and derived

It will be available in 6.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 17 Eike Rathke 2019-08-14 22:56:25 UTC
Pending review https://gerrit.libreoffice.org/77478 for 6-3
Comment 18 Commit Notification 2019-08-14 23:51:52 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/+/4cbc0ac530c8a9e843ba875708368109926aeccc%5E%21

Resolves: tdf#124108 localized quotation marks for [fr-BF] and derived

It will be available in 6.3.1.

Comment 19 sommerluk 2019-10-03 16:27:02 UTC
Reopening.

I’ve tested this in

Version: 6.3.2.2 (x64)
Build-ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI language: de-DE
Calc: threaded

and only half of the issue is fixed.

When typing

"C'est un test."

with locale fr_CI now the result is

«C’est un test.»

What is fixed:
– Typographic quotation marks are used
– Typographic apostrophe is used

What is still not fixed:
– After « and before » there should be a non-breaking space like in fr_FR.
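
A quick way to check a pasted result for the missing spaces is sketched below. The function name `missing_inner_spaces` is hypothetical; it accepts either U+00A0 or U+202F, following comment 9.

```python
# Either no-break space is accepted on the inside of the guillemets.
NO_BREAK_SPACES = {"\u00a0", "\u202f"}


def missing_inner_spaces(text):
    """Return (index, message) pairs for guillemets lacking an inner no-break space."""
    problems = []
    for i, ch in enumerate(text):
        if ch == "\u00ab" and text[i + 1:i + 2] not in NO_BREAK_SPACES:
            problems.append((i, "no no-break space after \u00ab"))
        elif ch == "\u00bb" and (i == 0 or text[i - 1] not in NO_BREAK_SPACES):
            problems.append((i, "no no-break space before \u00bb"))
    return problems


# The result reported above, without any inner spaces.
print(missing_inner_spaces("\u00abC\u2019est un test.\u00bb"))
# The expected result, with U+00A0 inside both guillemets.
print(missing_inner_spaces("\u00ab\u00a0C\u2019est un test.\u00a0\u00bb"))
```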
Comment 20 V Stuart Foote 2019-10-03 17:41:23 UTC
(In reply to sommerluk from comment #19)
> 
> What is still not fixed:
> – After « and before » there should be a non-breaking space like in fr_FR.

Hmm, the Autocorrect localized options pick up the LC_CTYPE from fr_FR.xml, which includes just the single code points:

U+2018 -- ‘
U+2019 -- ’
U+00AB -- «
U+00BB -- »

Neither the NBS nor the NNBS is present there, nor in the autocorrect table for fr_FR emoji.po.

Meaning if an NBS (U+00A0) or NNBS (U+202F) is being inserted it is being done for fr_FR locale in the sw edit engine. But I can't find it.

In any case, this now seems an enhancement. Please open that as a new issue, perhaps summary "Append/prepend NNBS (U+202F) to « » for correct typography of French language double quotation use" 

This should be closed fixed.