Bug 79276 - add acor-fr-XX.dat to all FR variants (CA-BE-CH...) to get the Autocorrect table populated for these locales
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: 4.2.4.2 release (earliest affected)
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA target:4.4.0 target:4.3.1 target:...
Keywords:
Depends on:
Blocks:
 
Reported: 2014-05-26 21:02 UTC by maliktunga
Modified: 2014-09-06 17:17 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Description maliktunga 2014-05-26 21:02:35 UTC
Problem description: I use the French (Canada) version of the language pack. I expect '...' to be converted to '…' and '---' to be converted to '—', just like in LibreOffice Writer on Ubuntu 14.04, but they aren't converted.

I have libreoffice-langpack-fr and autocorr-fr installed and I'm using Fedora 20.

Steps to reproduce:
1. Open LibreOffice Writer;
2. Type '...';
3. Press the Space key.

Current behavior: '...' stays as-is.

Expected behavior: '...' should be replaced with '…'

Note that this applies to the whole list of autocorr-fr autocorrections, including 'coeur', which should be replaced with 'cœur'. I have never had any such problem with the Ubuntu version.

              
Operating System: Fedora
Version: 4.2.4.2 release
Comment 1 tommy27 2014-05-27 04:57:49 UTC
did you try to manually enter those autocorrect entries in the autocorrect replacement table? is it still not working?
Comment 2 maliktunga 2014-05-27 10:42:47 UTC
(In reply to comment #1)
> did you try to manually enter those autocorrect entries in the autocorrect
> replacement table? is it still not working?

Yes, and it does work with manual entries. However, I still consider this a bug, since autocorrection should work out of the box and because the list of default entries is very long...
Comment 3 tommy27 2014-05-27 11:31:35 UTC
@gilles
so the issue you complain about is that those autocorrect entries are not already present in the default French (canada) list, ok?

what about the French (france) and other french subsets? are those entries included or not?

@sophie
could you please take a look at this?
Comment 4 maliktunga 2014-05-27 11:37:15 UTC
(In reply to comment #3)
> @gilles
> so the issue you complain about is that those autocorrect entries are not
> already present in the default French (canada) list, ok?
> 
> what about the French (france) and other french subsets? are those entries
> included or not?
> 
> @sophie
> could you please take a look at this?

Correct. I'd like the entries found in DocumentList.xml, located inside /usr/share/autocorr/acor-fr-FR.dat, to be automatically detected and working, just as in Ubuntu.

The same behaviour is found with English (USA) and French (France), which are the two others I tried. I also have installed the libreoffice-langpack-en package.
Comment 5 tommy27 2014-05-27 15:12:22 UTC
changed summary notes.
I'm asking a native French speaker for advice on whether it's worth it or not.
Comment 6 sophie 2014-05-27 15:52:32 UTC
Hi, in fact the real request is to add acor_fr-FR.dat to all variants of fr, so that the AutoCorrect replacement table is populated when you use French (Canada), French (Belgium) or French (Switzerland).
As a workaround, you can copy that file, rename the copy to acor_fr-CA.dat and put it in the same directory; it should populate the AutoCorrect table.
I'll change the summary accordingly and set the issue to New.
@tommy27 thanks for the heads-up :)
Sophie
Comment 7 sophie 2014-05-27 16:11:31 UTC
A side note: that should be true for other locales too ;) Sophie
Comment 8 Julien Nabet 2014-05-27 17:38:23 UTC
extras/source/autotext/lang/fr/acor contains this:
DocumentList.xml
META-INF/	
mimetype
SentenceExceptList.xml
WordExceptList.xml
See http://opengrok.libreoffice.org/xref/core/extras/source/autotext/lang/fr/acor/

Sophie: just to be sure, should the whole "acor" directory be copy-pasted for other locales?
Comment 9 Julien Nabet 2014-05-27 20:21:52 UTC
Andras: thought you might be interested, so put you in cc.
Comment 10 Andras Timar 2014-05-27 20:32:48 UTC
Please don't copy&paste anything. If all fr-FR autocorrections are good for other French locales, then a fallback mechanism would be better in the code, i.e. for example if fr-CA rules are not found, fall back to fr-FR.
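
The fallback idea described in this comment can be sketched in standalone C++. This is only an illustration under stated assumptions, not LibreOffice code: the function resolveAutoCorrLocale, the table kReferenceLocale and its single entry are hypothetical names invented for the example, and the set of available locales stands in for whatever rule files are actually installed.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical table: language -> locale whose rules serve as the fallback.
// The single entry is an illustrative assumption, not LibreOffice data.
const std::map<std::string, std::string> kReferenceLocale = {
    {"fr", "fr-FR"},
};

// Returns the locale whose autocorrect rules should be used for 'tag',
// given the set of locales that have rules installed.
std::optional<std::string> resolveAutoCorrLocale(
    const std::string& tag, const std::set<std::string>& available) {
    // 1. Exact match, e.g. "fr-CA" has its own rules.
    if (available.count(tag) != 0)
        return tag;
    // 2. Otherwise fall back to the reference locale of the same language.
    //    substr(0, npos) returns the whole tag when there is no '-'.
    const std::string language = tag.substr(0, tag.find('-'));
    const auto it = kReferenceLocale.find(language);
    if (it != kReferenceLocale.end() && available.count(it->second) != 0)
        return it->second;
    // 3. Nothing usable: the replacement table would stay empty.
    return std::nullopt;
}
```

With only fr-FR and en-US rules available, a request for "fr-CA" would resolve to "fr-FR", while a language missing from the table (for example "de-AT") would resolve to nothing.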
Comment 11 Julien Nabet 2014-05-27 20:43:04 UTC
Andras: thank you for your quick feedback so I won't copy anything.
About the fallback system, is opengrok.libreoffice.org/xref/core/editeng/source/misc/acorrcfg.cxx#40 a good start?
Comment 12 maliktunga 2014-05-27 20:52:30 UTC
There's something I don't understand.

Here are all the files in /usr/share/autocorr:

acor_en-AG.dat  acor_en-DK.dat  acor_en-JM.dat  acor_en-TT.dat  acor_fr-CH.dat
acor_en-AU.dat  acor_en-GB.dat  acor_en-NA.dat  acor_en-US.dat  acor_fr-FR.dat
acor_en-BS.dat  acor_en-GH.dat  acor_en-NG.dat  acor_en-ZA.dat  acor_fr-LU.dat
acor_en-BW.dat  acor_en-HK.dat  acor_en-NZ.dat  acor_en-ZW.dat  acor_fr-MC.dat
acor_en-BZ.dat  acor_en-IE.dat  acor_en-PH.dat  acor_fr-BE.dat
acor_en-CA.dat  acor_en-IN.dat  acor_en-SG.dat  acor_fr-CA.dat

All acor_en-XX.dat files -- except acor_en-GB.dat -- are links to the acor_en-GB.dat archive. The same goes for the acor_fr-XX.dat files and acor_fr-FR.dat.

The preinstalled autocorrections do not function with any language, not even British English nor France French. 

Sorry if it wasn't clear, but the preinstalled autocorrection does not work in Fedora LibreOffice. Not at all.
Comment 13 sophie 2014-05-28 11:32:51 UTC
(In reply to comment #8)
> extras/source/autotext/lang/fr/acor contains this:
> DocumentList.xml
> META-INF/	
> mimetype
> SentenceExceptList.xml
> WordExceptList.xml
> See
> http://opengrok.libreoffice.org/xref/core/extras/source/autotext/lang/fr/
> acor/
> 
> Sophi: just to be sure, should the whole "acor" directory be copy pasted for
> other locales?

I don't know what should be done, but to be more specific, if under Tools > Options > Language Settings > Languages > Locale setting, you choose for example French (Belgium), then this is the locale setting that you find in Autocorrect Options > Replacement table. In this case, the replacement list is empty, it's only populated if you choose French (France). 
In /opt/libreoffice/share/autocorr, there is only an acor_fr-FR.dat, and I don't know how to make it aware of the other variants ;) I've found that renaming the file is a workaround, but it's a bit difficult for our users. Sophie
Comment 14 sophie 2014-05-28 11:44:21 UTC
(In reply to comment #12)
> There's something I don't understand.
> 
> Here are all the files in /usr/share/autocorr:
> 
> acor_en-AG.dat  acor_en-DK.dat  acor_en-JM.dat  acor_en-TT.dat 
> acor_fr-CH.dat
> acor_en-AU.dat  acor_en-GB.dat  acor_en-NA.dat  acor_en-US.dat 
> acor_fr-FR.dat
> acor_en-BS.dat  acor_en-GH.dat  acor_en-NG.dat  acor_en-ZA.dat 
> acor_fr-LU.dat
> acor_en-BW.dat  acor_en-HK.dat  acor_en-NZ.dat  acor_en-ZW.dat 
> acor_fr-MC.dat
> acor_en-BZ.dat  acor_en-IE.dat  acor_en-PH.dat  acor_fr-BE.dat
> acor_en-CA.dat  acor_en-IN.dat  acor_en-SG.dat  acor_fr-CA.dat
> 
> All acor-en-XX.dat files -- except acor-en-GB.dat -- are links to the
> acor-en-GB.dat archive. Same thing with acor-fr-XX.dat and acor-fr-FR.dat
> 
> The preinstalled autocorrections do not function with any language, not even
> British English nor France French. 
> 
> Sorry if it wasn't clear, but the preinstalled autocorrection does not work
> in Fedora LibreOffice. Not at all.

I don't have those files in my installation; there is no acor file for fr other than acor_fr-FR.dat. However, variants do exist for NL: I have acor_nl-BE.dat and acor_nl-NL.dat.
Checked in Version: 4.3.0.0.beta1
Build ID: 2e39c7e59c8fc8b16a54c3d981dceef27fb0c07f but it's the same for 4.2.4.2 
Did your version come with your system, or is it one you downloaded from the LibreOffice website? Sophie
Comment 15 maliktunga 2014-05-28 11:45:28 UTC
(In reply to comment #13)
> (In reply to comment #8)
> > extras/source/autotext/lang/fr/acor contains this:
> > DocumentList.xml
> > META-INF/	
> > mimetype
> > SentenceExceptList.xml
> > WordExceptList.xml
> > See
> > http://opengrok.libreoffice.org/xref/core/extras/source/autotext/lang/fr/
> > acor/
> > 
> > Sophi: just to be sure, should the whole "acor" directory be copy pasted for
> > other locales?
> 
> I don't know what should be done, but to be more specific, if under Tools >
> Options > Language Settings > Languages > Locale setting, you choose for
> example French (Belgium), then this is the locale setting that you find in
> Autocorrect Options > Replacement table. In this case, the replacement list
> is empty, it's only populated if you choose French (France). 
> In /opt/libreoffice/share/autocor, there is only an acor_fr-FR.dat, I don't
> know how to tell him to be aware of the other variants ;) I've found that
> renaming the file is a work around, but it's a bit difficult for our users.
> Sophie

Wrong.

In Fedora, there is no /opt/libreoffice/

Changing the locale to French (France) in Tools > Options > Language Settings > Languages > Locale does not change anything. The replacement list in Autocorrect > Replace stays empty.

Renaming what file? And how?
Comment 16 maliktunga 2014-05-28 11:47:04 UTC
(In reply to comment #14)
> (In reply to comment #12)
> > There's something I don't understand.
> > 
> > Here are all the files in /usr/share/autocorr:
> > 
> > acor_en-AG.dat  acor_en-DK.dat  acor_en-JM.dat  acor_en-TT.dat 
> > acor_fr-CH.dat
> > acor_en-AU.dat  acor_en-GB.dat  acor_en-NA.dat  acor_en-US.dat 
> > acor_fr-FR.dat
> > acor_en-BS.dat  acor_en-GH.dat  acor_en-NG.dat  acor_en-ZA.dat 
> > acor_fr-LU.dat
> > acor_en-BW.dat  acor_en-HK.dat  acor_en-NZ.dat  acor_en-ZW.dat 
> > acor_fr-MC.dat
> > acor_en-BZ.dat  acor_en-IE.dat  acor_en-PH.dat  acor_fr-BE.dat
> > acor_en-CA.dat  acor_en-IN.dat  acor_en-SG.dat  acor_fr-CA.dat
> > 
> > All acor-en-XX.dat files -- except acor-en-GB.dat -- are links to the
> > acor-en-GB.dat archive. Same thing with acor-fr-XX.dat and acor-fr-FR.dat
> > 
> > The preinstalled autocorrections do not function with any language, not even
> > British English nor France French. 
> > 
> > Sorry if it wasn't clear, but the preinstalled autocorrection does not work
> > in Fedora LibreOffice. Not at all.
> 
> I don't have those file in my installation, there is no variant for acor
> file than acor_fr-FR.dat. However, the variant exist for NL language, I've :
> acor_nl-BE.dat and acor_nl-NL.dat. 
> Checked in Version: 4.3.0.0.beta1
> Build ID: 2e39c7e59c8fc8b16a54c3d981dceef27fb0c07f but it's the same for
> 4.2.4.2 
> Did you version came with your system or is one you downloaded from the
> LibreOffice website? Sophie

So many things are wrong with your bug analysis, sorry to be direct. I clearly mentioned that the bug only affected Fedora 20. Everything works just fine in Ubuntu.
Comment 17 sophie 2014-05-28 12:23:42 UTC
Version: 4.3.0.0.beta1
Build ID: 2e39c7e59c8fc8b16a54c3d981dceef27fb0c07f

sophie@sophie:/opt/libreofficedev4.3/share/autocorr$ ls
acor_af-ZA.dat  acor_fa-IR.dat  acor_lt-LT.dat  acor_sr-CS.dat
acor_bg-BG.dat  acor_fi-FI.dat  acor_mn-MN.dat  acor_sr-Latn-CS.dat
acor_ca-ES.dat  acor_fr-FR.dat  acor_nl-BE.dat  acor_sr-Latn-ME.dat
acor_cs-CZ.dat  acor_ga-IE.dat  acor_nl-NL.dat  acor_sr-Latn-RS.dat
acor_da-DK.dat  acor_hr-HR.dat  acor_pl-PL.dat  acor_sr-ME.dat
acor_de-DE.dat  acor_hu-HU.dat  acor_pt-BR.dat  acor_sr-RS.dat
acor_en-AU.dat  acor_is-IS.dat  acor_pt-PT.dat  acor_sv-SE.dat
acor_en-GB.dat  acor_it-IT.dat  acor_ro-RO.dat  acor_tr-TR.dat
acor_en-US.dat  acor_ja-JP.dat  acor_ru-RU.dat  acor_vi-VN.dat
acor_en-ZA.dat  acor_ko-KR.dat  acor_sk-SK.dat  acor_zh-CN.dat
acor_es-ES.dat  acor_lb-LU.dat  acor_sl-SI.dat  acor_zh-TW.dat

Works well under Debian 6 and Ubuntu 14.10 and locale fr_FR, fails with the other variants and from what I remember it was the same under Windows, this is inherited from OOo. Sophie
Comment 18 Yousuf Philips (jay) (retired) 2014-05-28 12:30:56 UTC
I installed LibO 4.2.4 through the deb gz file onto my Linux Mint box and the only acor_fr-*.dat file found in /opt/libreoffice4.2/share/autocorr was acor_fr-FR.dat. When opening the autocorrect options for 'French (Canada)', the list was blank. So I opened the terminal and copied the file as acor_fr-CA.dat, and now it works. LibO 4.3 also only has acor_fr-FR.dat.
Comment 19 tommy27 2014-05-28 12:34:24 UTC
(In reply to comment #17)
> Version: 4.3.0.0.beta1
> Build ID: 2e39c7e59c8fc8b16a54c3d981dceef27fb0c07f
> 
> ...
>
> Works well under Debian 6 and Ubuntu 14.10 and locale fr_FR, fails with the
> other variants and from what I remember it was the same under Windows, this
> is inherited from OOo. Sophie

I confirm the same acor files are under "Bin\LibreOffice 4\share\autocorr" in Windows, and there's only fr-FR.
Comment 20 Julien Nabet 2014-05-28 20:25:41 UTC
I just wonder if it's really a core LO bug (in which case we'd need a fallback system, as Andras indicated) or a packaging bug (since it works out of the box on some distributions: Debian or Debian-based ones like Ubuntu).
Comment 21 maliktunga 2014-05-28 20:45:43 UTC
I'm using the LibreOffice Writer standalone from the Fedora official repository provided out-of-the-box with Fedora 20. I'm not using the official downloaded suite since I don't want anything but LibreOffice Writer.

Version: 4.2.4.2
Build ID: 4.2.4.2-8.fc20
'This release was supplied by The Fedora Project'
Comment 22 sophie 2014-05-29 10:41:35 UTC
(In reply to comment #20)
> Just wonder if it's really a core LO bug (and we'd need a failback system as
> Andras indicated) or if it's a packaging bug (since it works out-of-the box
> on some distribs - Debian or "from Debian" like Ubuntu).

It seems that Fedora provided a fallback by adding the files with their variants in the /user/ location, whereas that directory (.config/libreofficedev/4/user/autocorr/) is empty in my installs of 4.2.4.2 and 4.3.0.0.
In my 4.1.6.1 install, this directory contains acor_fr-FR.dat, so from what I see, it seems to have changed since 4.2.x.x branches.
Comment 23 maliktunga 2014-05-29 10:50:57 UTC
(In reply to comment #22)
> (In reply to comment #20)
> > Just wonder if it's really a core LO bug (and we'd need a failback system as
> > Andras indicated) or if it's a packaging bug (since it works out-of-the box
> > on some distribs - Debian or "from Debian" like Ubuntu).
> 
> Seems that Fedora provided a fallback by adding the files with their
> variants in the /user/ location where it's an empty directory in
> .config/libreoffidev/4/user/autocor/ in my install for 4.2.4.2 and 4.3.0.0. 
> In my 4.1.6.1 install, this directory contains acor_fr-FR.dat, so from what
> I see, it seems to have changed since 4.2.x.x branches.

I removed the standalone package and installed the official suite from the LibreOffice website instead. It now works! This confirms that the problem lies with Fedora's packages, not with the official packages.
Comment 24 Julien Nabet 2014-06-01 21:43:23 UTC
Andras: I think the part to change is in SvxAutoCorrect::GetAutoCorrFileName
see http://opengrok.libreoffice.org/xref/core/editeng/source/misc/svxacorr.cxx#1888
   1888 OUString SvxAutoCorrect::GetAutoCorrFileName( const LanguageTag& rLanguageTag,
   1889                                             bool bNewFile, bool bTst ) const
   1890 {
   1891     OUString sRet, sExt( rLanguageTag.getBcp47() );
   1892 
   1893     sExt = "_" + sExt + ".dat";
   1894     if( bNewFile )
   1895         ( sRet = sUserAutoCorrFile )  += sExt;
   1896     else if( !bTst )
   1897         ( sRet = sShareAutoCorrFile )  += sExt;
   1898     else
   1899     {
   1900         // test first in the user directory - if not exist, then
   1901         ( sRet = sUserAutoCorrFile ) += sExt;
   1902         if( !FStatHelper::IsDocument( sRet ))
   1903             ( sRet = sShareAutoCorrFile ) += sExt;
   1904     }
   1905     return sRet;
   1906 }

Indeed, sExt retrieves "_fr-BE.dat" (if locale = Fr Belgium).
Some questions now:
1) How should the fallback system behave if both "fr-CA" and "fr-FR" were present (so not only "fr-FR")? I mean, if I use "fr-BE" and the fallback system takes the first "fr-XX" present, it could be any of them.
2) For a new en-XX, what would be the fallback: en-US, en-GB, or another?
3) Should we hardcode this in this function, or should we create some sort of customization file?
Comment 25 Julien Nabet 2014-06-16 20:47:56 UTC
Sophie: while discussing with Andras on IRC, he asked whether a locale-independent fr dictionary would be possible. Even though I'm French, I don't know enough about the language in the other French-speaking countries to tell if it's possible or not.
Any idea?
Comment 26 tommy27 2014-06-16 23:28:05 UTC
see this: Bug 44580 - share autocorrect replacement table for misc. language subgroups
Comment 27 Julien Nabet 2014-06-17 05:28:19 UTC
Thank you Tommy for the link :)

So for non localized, we should take the common part of all variants.
Since there's only 1 localized for "fr" and "it", we could just rename the .dat (with a git mv to not lose the file history)  then hack editeng/source/misc/svxacorr.cxx

If ok with this, I could give it a try.
Andras: are there other things to take into account?
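
The renaming idea above (one language-only file such as acor_fr.dat shared by all variants) can be sketched as an ordered list of file names to try. This is a minimal standalone illustration, not the submitted patch: the helper autoCorrFileCandidates is a hypothetical name, and the "exact tag first, then language only" order is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: autocorrect file names to try, most specific first.
// For "fr-BE" this yields "acor_fr-BE.dat" followed by "acor_fr.dat".
std::vector<std::string> autoCorrFileCandidates(const std::string& tag) {
    std::vector<std::string> names{"acor_" + tag + ".dat"};
    // Add the language-only name when the tag carries a region or script.
    const std::string::size_type dash = tag.find('-');
    if (dash != std::string::npos)
        names.push_back("acor_" + tag.substr(0, dash) + ".dat");
    return names;
}
```

Cutting at the first '-' also drops a script subtag, so a tag like "sr-Latn-CS" would fall back straight to "acor_sr.dat"; whether that is desirable is a separate question that this sketch does not settle.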
Comment 28 sophie 2014-06-17 07:51:18 UTC
Hi Julien, Andras, there are no spelling differences in the other FR locales, only differences in meaning (savoir vs pouvoir in Belgium, for example). So there could be a single file shared by all, if this is what you mean. Be aware that Eike has added several locales for FR in 4.3: [fr-CI], [fr-ML], [fr-SN], [fr-BJ], [fr-NE], [fr-TG] - Sophie
Comment 29 Julien Nabet 2014-06-18 19:19:21 UTC
Andras: as you may have seen, I submitted a patch for review, see https://gerrit.libreoffice.org/#/c/9825/1
As I noted in its comment, I consider it a draft, since I don't know how to test it well (and don't have a full understanding of the places which call the "GetAutoCorrFileName" method).
I hope I didn't miss too many things :-(
Comment 30 Commit Notification 2014-07-03 12:45:17 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=82f291d2f7630938ce6ca740f904cab07d1ff90d

Resolves fdo#79276 Add fallback system for autocorrection of French variants



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 31 Commit Notification 2014-07-05 13:47:01 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a5f36fd02ad88af4bab74a074e676cf239e15d14&h=libreoffice-4-3

Resolves fdo#79276 Add fallback system for autocorrection of French variants


It will be available in LibreOffice 4.3.1.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 32 Commit Notification 2014-07-05 14:14:59 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-4-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1f407caeb3f6e2605973458a545547ed82e7da1c&h=libreoffice-4-2

Resolves fdo#79276 Add fallback system for autocorrection of French variants


It will be available in LibreOffice 4.2.6.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 33 tommy27 2014-08-23 06:14:53 UTC
Julien, your patch works fine for French languages.
I wonder if the same thing could be applied to the German and Spanish languages as well, since they are currently in the same situation French was.

There's a single autocorrect list for each, acor_de-DE.dat and acor_es-ES.dat, which is limited to the Germany and Spain variants, while multiple other variants are left uncovered.
Comment 34 Julien Nabet 2014-08-23 08:52:36 UTC
Tommy27: indeed it could be applied the same way.
There are 2 locations to take into account:
1) extras/CustomTarget_autocorr.mk
eg:
- fr:fr-FR \
+ fr:fr \

2) extras/Package_autocorr.mk
- acor_fr-FR.dat \
+ acor_fr.dat \

BTW, the Italian language is in the same situation: there is only acor_it-IT.dat, whereas there are Italian (Italy) and Italian (Switzerland).

I created fdo#82985 for this (put in See Also)
Comment 35 Yousuf Philips (jay) (retired) 2014-09-05 20:33:14 UTC
Should a fix be applied to all languages and not just French? Someone on Twitter just said he has to set his locale to English (US) because English (India) doesn't have a check mark on it.
Comment 36 Julien Nabet 2014-09-05 20:39:33 UTC
(In reply to comment #35)
> should a fix be used for all languages and not just french, as someone on
> twitter just said he has to set his to english-US because english-IN doesnt
> have a check mark on it.
Jay, it could be done quite easily since there's now a generic mechanism, see http://cgit.freedesktop.org/libreoffice/core/commit/?id=a4f411ba62d4fd7fd4a61d1c9d326488d5e84ff5 for fdo#82985
The only problem is what is the reference English: gb? us? other?

The first thing you should do is submit a new bug report; IMHO there's no need to add more comments to this one :-)
Comment 37 Yousuf Philips (jay) (retired) 2014-09-06 13:52:32 UTC
(In reply to comment #36)
> Jay, it could be done quite easily since there's now a generic mechanism,
> see
> http://cgit.freedesktop.org/libreoffice/core/commit/
> ?id=a4f411ba62d4fd7fd4a61d1c9d326488d5e84ff5 for fdo#82985

Glad to hear that it isn't a hard thing to fix.

> The only problem is what is the reference English: gb? us? other?

English US would be the default, as that's the internal language used by LibO.

> The first thing you should do is submit a new bugtracker, IMHO no need to
> add more comments to this one :-)

New bug created (bug 83561). Thanks Julien. :D