Bug 96787 - AutoCorrect: After Removal of Replacement Entry the Replacement Itself is still Performed.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: 4.4.7.2 release (earliest affected)
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Julien Nabet
URL:
Whiteboard: target:7.2.0 target:7.1.0.2 target:7.0.5
Keywords: bibisectRequest, needUITest, regression
Depends on:
Blocks: AutoCorrect-Complete
Reported: 2015-12-29 12:54 UTC by Benjamin Quest
Modified: 2021-02-24 20:01 UTC (History)

Description Benjamin Quest 2015-12-29 12:54:11 UTC
Hi,
I do want to remove the autocorrection of the (c) to the copyright symbol. 

Therefore I went to Autocorrection settings, replacement table and removed the *.(c) entry from "the list".

--> "the list" in this case is from EVERY list, i.e. Deutsch (Deutschland), Deutsch (Belgien), Deutsch (...). So, none of the lists has the entry *.(c) 

I verified that the language I am actually writing in is set to Deutsch (Deutschland)(Character, paragraph, whole document, does not matter).

STILL: every time I type (c) it gets autocorrected to the copyright symbol.

Version: 5.0.4.2
Build-ID: 2b9802c1994aa0b7dc6079e128979269cf95bc78
Locale: de-DE (de_DE.UTF-8)

Google suggests that the replacement table has to match the language actually used (or set) while typing. Apparently this does not work for me.

I have to write (c) all the time (and do not mean the copyright symbol), so I am currently pressing Ctrl+Z all the time ... :-(

Oddly, if I add the copyright symbol to the replacement list and have it autocorrected to (c), that works. So instead of pressing Ctrl+Z I can hit Del and Space, and the autocorrected copyright symbol is autocorrected back into just (c).
Comment 1 tommy27 2015-12-30 08:26:30 UTC
Did you check the (all) autocorrect list?  It's on top of the autocorrect language list
Comment 2 tommy27 2015-12-30 08:28:05 UTC
Please tell your linux distro and version
Comment 3 Benjamin Quest 2015-12-30 10:37:16 UTC
Hi,
yes, I tried the [all] list on top of the language list. It was empty, so there was nothing to remove.

The Linux distribution is Linux Mint 17.2 KDE (i.e. Ubuntu 14.04 base). KDE System Settings is set to Deutschland; the preferred language (and also the only one) is Deutsch.

locale yields:

:~ > locale
LANG=de_DE.UTF-8
LANGUAGE=
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=

The odd thing is, LO uses the replacement table if I define own autocorrections (as the {copyright symbol} to {(c)} one ...).
Comment 4 tommy27 2015-12-30 19:25:32 UTC
did you try removing the *.(c) entry from english autocorrect lists as well?

I remember a bug (I cannot recall the exact number right now) where there was an unwanted english i to I autocorrect even in non-english documents
Comment 5 Benjamin Quest 2015-12-30 20:49:50 UTC
OK I have removed the autocorrection entry for *.(c) or *.(C) from all English dictionaries. --> Problem persists

Went on and deleted the corresponding entries in all French dictionaries: --> Problem persists

I also checked the "Standard - Deutsch (Deutschland)" which is in the list under the letter 'S' (where *D*eutsch would not right away be expected ;-) in that list I had already removed the *.(c) entry and inserted the reverse autocorrection for testing.

I am not sure whether continuing with the Spanish/Italian ones is helpful?

Copyright autocorrection seems "hardcoded".
Comment 6 tommy27 2015-12-30 21:45:33 UTC
mmmh... did you try resetting the user profile?
https://wiki.documentfoundation.org/UserProfile
Comment 7 Benjamin Quest 2015-12-30 22:30:22 UTC
This is what I did now:
Closed LO
Went to /home/<username>/.config/libreoffice/4/
Renamed the folder /user to /user_bckp
Started LO (fresh /user folder is generated) now with default settings.
Verified that Language for the paragraph and character is set to Deutsch.
Went to Autocorrection list to remove .*(c) (acknowledge with OK)

And ... :drumroll: ... problem persists.

Closed LO and switched back to user backup folder.
Comment 8 tommy27 2015-12-31 07:50:48 UTC
please go under Tools/Language Setting/Languages and tell your setup about: 
- user interface
- locale
- default language for documents
Comment 9 Benjamin Quest 2015-12-31 11:12:52 UTC
Information found under "Extras"/"Optionen"/"Spracheinstellungen":
UI set to:  Standard - Deutsch (Deutschland)
locale set to:  Standard - Deutsch (Deutschland)
Default lang. f. documents set to: Standard - Deutsch (Deutschland)
Comment 10 tommy27 2016-01-01 18:31:36 UTC
are you sure you don't have any other special character software in your Linux computer that may interfere with LibreOffice?

otherwise I have no more ideas...

I can't reproduce your bug under Windows 8.1 x64
using LibO 5.0.3.1 and recent 5.2.0.0 alpha

another Linux tester is needed to replicate your issue.
Comment 11 Buovjaga 2016-01-04 11:23:45 UTC
Not reproduced.

Changed lang of document to German (Germany). (c) was autocorrected.
Removed the autocorrection rule for (c) in German (Germany). (c) was not autocorrected anymore.

Ubuntu 15.10 64-bit 
Version: 5.0.3.2
Build ID: 1:5.0.3~rc2-0ubuntu1
Locale: en-US (en_US.UTF-8)
Comment 12 Benjamin Quest 2016-01-04 13:55:34 UTC
I can reproduce the following:
Changed the language of the entire text to English (GB) (for which I had removed (c) from the autocorrection list in the earlier attempts):

--> (c) is not autocorrected.

So switching from Standard Deutsch (Deutschland) for the entire text to English (GB), where (c) has been removed from the autocorrection list, makes LO behave just as expected.

However, changing the Language of the entire text back to German (where in all lists (c) has equally been removed):

--> (c) is again autocorrected to the copyright symbol.

Beluga: does the autocorrection removal of (c) work in your Default language?
Comment 13 Buovjaga 2016-01-04 14:02:11 UTC
(In reply to Benjamin Quest from comment #12)
> Beluga: does the autocorrection removal of (c) work in your Default language?

English doesn't seem to have the autocorrection rule.
Comment 14 Oliver Specht (CIB) 2016-03-11 14:41:56 UTC
The problem is that there is an autocorrection for German(Germany) in two versions - in the share directory there is acor_de.dat and in the user directory there is acor_de-DE.dat

If the search in the language/country combination file finds nothing, the autocorrection then looks in the pure-language version.

If you delete this acor_de-DE.dat in your user configuration and call the AutoCorrect options dialog to remove the (c) from the German list then the 
acor_de.dat file is loaded and a new acor_de-DE.dat is created. 

There is no acor_en.dat in the share directory.

The installation should probably create an acor_de-DE.dat instead of acor_de.dat.
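
The lookup order described in this comment can be sketched roughly as follows. This is only an illustration, not LibreOffice code: the helper name `candidateAcorFiles` is hypothetical, and it assumes nothing beyond the `acor_<tag>.dat` naming seen in this report. Other fallbacks the real code may perform are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not a LibreOffice API): for a language tag such as
// "de-DE", list the autocorrect file names in the order described above:
// first the language/country file, then the pure-language file.
std::vector<std::string> candidateAcorFiles(const std::string& tag)
{
    assert(!tag.empty());
    std::vector<std::string> names{ "acor_" + tag + ".dat" };
    const std::string::size_type dash = tag.find('-');
    if (dash != std::string::npos)
        names.push_back("acor_" + tag.substr(0, dash) + ".dat");
    return names;
}
```

For "de-DE" this yields acor_de-DE.dat followed by acor_de.dat, which is why an entry deleted only from the first file can still be found in the second.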
Comment 15 Harald Koester 2016-10-13 20:52:19 UTC
(In reply to Oliver Specht (CIB) from comment #14)

> If you delete this acor_de-DE.dat in your user configuration and call the
> AutoCorrect options dialog to remove the (c) from the German list then the 
> acor_de.dat file is loaded and a new acor_de-DE.dat is created. 

As far as I can see, there is no way to remove the "(c)" from the general German list. You can only delete entries from the lists specific to the individual countries.

The bug does not only occur with "(c)" but with every entry of the replacement lists. Hence summary changed.

Another case where deletion fails is wrong entries in the default lists. Bug 103156 has some examples for German (Switzerland).

The replacement table is used for all modules of LibreOffice (Writer, Calc,...), hence component changed to 'Linguistic'.

Bug already exists in version 3.3.0, hence inherited from OOo.

Used version: 5.2.2, Win7.
Comment 16 QA Administrators 2017-10-23 14:02:13 UTC Comment hidden (obsolete)
Comment 17 Benjamin Quest 2017-10-23 20:06:03 UTC
Bug is still reproducible in LO 5.4.1.2 (Manjaro stable default package). Once Manjaro stable updates to 5.4.2 I'll report again.
Comment 18 Telesto 2018-08-09 13:57:00 UTC
*** Bug 119177 has been marked as a duplicate of this bug. ***
Comment 19 Tyco72 2019-02-21 08:00:59 UTC
The bug is still open in LO 6.1.5.2 also in Windows7 (Feb. 2019)

It seems that LO always uses the acor_xx.dat files located in the path:

"C:\Program Files\LibreOffice\share\autocorr" (for Windows7)

instead of the new acor files which are created in the profile path:

"D:\Daten\Users\%username%\AppData\Roaming\LibreOffice\4\user\autocorr" 
These new autocorr files are created when you apply changes for the first time in the autocorr replacement table, for a specific language.

How is it possible that after 3 years such an annoying bug still has no assignee?
Comment 20 Telesto 2021-01-04 09:38:14 UTC
(In reply to Harald Koester from comment #15)
> Bug already exists in version 3.3.0, hence inherited from OOo.

Does it? Following the steps of bug 139396 it appears to work with 
Version: 4.3.7.2
Build ID: 8a35821d8636a03b8bf4e15b48f59794652c68ba

Version: 4.2.0.4 
Build ID: 05dceb5d363845f2cf968344d7adab8dcfb2ba71

and also fine in 4.0


Adding bibisectrequest based on that assumption
Comment 21 Julien Nabet 2021-01-09 19:45:18 UTC
On pc Debian x86-64 with master sources updated today, I could reproduce this.

I tried with the example of tdf#139377, where Dirk (whom I put in cc) would like to prevent LO from replacing "daß" with "dass".
After deleting the entry, LO still does the replacement even after having restarted LO.

If we take the example of German language from Germany, the initial replacement file (ie with a brand new LO profile) is "acor_de.dat" present in "share" directory.

Just for information, these dat files are in fact zip files with a specific structure, see:
julien@debianamd:/tmp/test$ ls
acor_de.dat
julien@debianamd:/tmp/test$ mv acor_de.dat acor_de.zip
julien@debianamd:/tmp/test$ unzip acor_de.zip 
Archive:  acor_de.zip
 extracting: mimetype                
  inflating: DocumentList.xml        
  inflating: META-INF/manifest.xml   
  inflating: SentenceExceptList.xml  
  inflating: WordExceptList.xml      
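
As a quick way to check the "these dat files are in fact zip files" observation without renaming anything, one can look at the first four bytes of the file. The sketch below is an illustration only; the function names are hypothetical. It tests for the ZIP local-file-header signature "PK\x03\x04", which is what a non-empty ZIP archive normally starts with.

```cpp
#include <fstream>
#include <string>

// True if the given bytes begin with the ZIP local-file-header signature.
bool startsWithZipSignature(const std::string& bytes)
{
    return bytes.size() >= 4 && bytes.compare(0, 4, "PK\x03\x04", 4) == 0;
}

// Reads the first four bytes of a file (e.g. an acor_*.dat file) and checks
// them. Returns false if the file cannot be opened or is too short.
bool fileStartsWithZipSignature(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    char head[4] = {};
    in.read(head, sizeof head);
    return in.gcount() == 4 && startsWithZipSignature(std::string(head, 4));
}
```

A call such as fileStartsWithZipSignature("acor_de.dat") would be expected to return true for the file unpacked above, although that was not verified here.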


When deleting an entry in the autocorrect dialog and clicking OK, LO generates
an "acor_de-DE.dat", this time in the "user" directory.
This "acor_de-DE.dat" contains all the replacements from "acor_de.dat" except the deleted entry.
Important thing to note: "acor_de.dat" in "share" doesn't change and still contains all replacements (including "daß"=>"dass").

Remark: the extra "-DE" in the file name is because I used the Germany locale. If I switch to the Austria locale, for example, LO will use the initial "acor_de.dat" from "share", so it won't use "acor_de-DE.dat" from the "user" directory.

After some debugging, I noticed that the type of replacement "daß"->"dass" was done in SvxAutoCorrect::SearchWordsInList
(see https://opengrok.libreoffice.org/xref/core/editeng/source/misc/svxacorr.cxx?r=40e98c87#1920).

1) it begins in the block:
   1932     // First search for eLang, then US-English -> English
   1933     // and last in LANGUAGE_UNDETERMINED
   1934     if (m_aLangTable.find(aLanguageTag) != m_aLangTable.end() || CreateLanguageFile(aLanguageTag, false))
   1935     {
   1936         //the language is available - so bring it on
   1937         std::unique_ptr<SvxAutoCorrectLanguageLists> const& pList = m_aLangTable.find(aLanguageTag)->second;
   1938         pRet = lcl_SearchWordsInList( pList.get(), rTxt, rStt, nEndPos );
   1939         if( pRet )
   1940         {
   1941             rLang = aLanguageTag;
   1942             return pRet;
   1943         }
   1944     }

gdb showed that pList.get() used "acor_de-DE.dat" (from "user"), and when searching for "daß" it does not enter the if block at line 1939, as expected.

2) But then LO continues into the next block:
   1946     // If it still could not be found here, then keep on searching
   1947     LanguageType eLang = aLanguageTag.getLanguageType();
   1948     // the primary language for example EN
   1949     aLanguageTag.reset(aLanguageTag.getLanguage());
   1950     LanguageType nTmpKey = aLanguageTag.getLanguageType(false);
   1951     if (nTmpKey != eLang && nTmpKey != LANGUAGE_UNDETERMINED &&
   1952                 (m_aLangTable.find(aLanguageTag) != m_aLangTable.end() ||
   1953                  CreateLanguageFile(aLanguageTag, false)))
   1954     {
   1955         //the language is available - so bring it on
   1956         std::unique_ptr<SvxAutoCorrectLanguageLists> const& pList = m_aLangTable.find(aLanguageTag)->second;
   1957         pRet = lcl_SearchWordsInList( pList.get(), rTxt, rStt, nEndPos );
   1958         if( pRet )
   1959         {
   1960             rLang = aLanguageTag;
   1961             return pRet;
   1962         }
   1963     }

and here pList.get() shows "acor_de.dat" from "share".
So when searching for "daß", it finds the replacement (since "acor_de.dat" isn't changed when deleting an entry).
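
The fall-through described in 1) and 2) can be reproduced with a small stand-alone model. This is a simplified sketch, not the LibreOffice implementation: plain std::map objects stand in for the loaded lists, string tags stand in for LanguageTag, and the names `LangTables`, `findIn` and `searchBeforeFix` are made up for the illustration.

```cpp
#include <map>
#include <string>

// Simplified stand-in for one loaded replacement list (wrong -> right).
using ReplacementTable = std::map<std::string, std::string>;
// Lists keyed by language tag: "de-DE" models the user file, "de" the share file.
using LangTables = std::map<std::string, ReplacementTable>;

// Looks up a word in the list for one tag; nullptr if the list or word is missing.
const std::string* findIn(const LangTables& tables, const std::string& tag,
                          const std::string& word)
{
    const auto table = tables.find(tag);
    if (table == tables.end())
        return nullptr;
    const auto entry = table->second.find(word);
    return entry == table->second.end() ? nullptr : &entry->second;
}

// Model of the behaviour before the patch: a miss in the full-tag list
// falls through to the primary-language list.
const std::string* searchBeforeFix(const LangTables& tables, const std::string& tag,
                                   const std::string& word)
{
    if (const std::string* hit = findIn(tables, tag, word))
        return hit;
    const std::string primary = tag.substr(0, tag.find('-'));
    return primary != tag ? findIn(tables, primary, word) : nullptr;
}
```

With a "de" list that still contains "daß" and a "de-DE" list from which it was deleted, searchBeforeFix(tables, "de-DE", "daß") still returns the "dass" replacement, which matches the behaviour reported in this bug.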

I'm gonna propose this straightforward patch:
diff --git a/editeng/source/misc/svxacorr.cxx b/editeng/source/misc/svxacorr.cxx
index ae6dceb33adf..0f048114462b 100644
--- a/editeng/source/misc/svxacorr.cxx
+++ b/editeng/source/misc/svxacorr.cxx
@@ -1941,6 +1941,8 @@ const SvxAutocorrWord* SvxAutoCorrect::SearchWordsInList(
             rLang = aLanguageTag;
             return pRet;
         }
+        else
+            return nullptr;
     }
 
     // If it still could not be found here, then keep on searching

If a corresponding acor file is found in "user", the search goes into 1) as before, but instead of continuing with 2) when no replacement is found, it returns nullptr.

I tested this patch with a brand new LO profile (so without "acor_de-DE.dat"), the replacement works since we don't enter the first if of 1) and so we go into 2).
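
The effect of the early return can be checked against both scenarios with a small stand-alone model. As before, this is a simplified sketch and not the LibreOffice code: std::map objects stand in for the loaded lists, and the names `LangTables` and `searchAfterFix` are made up for the illustration.

```cpp
#include <map>
#include <string>

// Simplified stand-in for one loaded replacement list (wrong -> right).
using ReplacementTable = std::map<std::string, std::string>;
// Lists keyed by language tag: "de-DE" models the user file, "de" the share file.
using LangTables = std::map<std::string, ReplacementTable>;

// Model of the patched behaviour: once a list for the full tag exists,
// its answer is final, even if that answer is "not found".
const std::string* searchAfterFix(const LangTables& tables, const std::string& tag,
                                  const std::string& word)
{
    const auto exact = tables.find(tag);
    if (exact != tables.end())
    {
        const auto entry = exact->second.find(word);
        return entry == exact->second.end() ? nullptr : &entry->second;
    }
    // No list for the full tag (brand new profile): use the primary language.
    const auto fallback = tables.find(tag.substr(0, tag.find('-')));
    if (fallback == tables.end())
        return nullptr;
    const auto entry = fallback->second.find(word);
    return entry == fallback->second.end() ? nullptr : &entry->second;
}
```

In this model, a profile with an emptied "de-DE" list no longer gets the "daß" replacement, while a profile without any "de-DE" list still falls back to "de", mirroring the two cases described in the comment.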
Comment 22 Julien Nabet 2021-01-09 19:53:40 UTC
Patch waiting for review here https://gerrit.libreoffice.org/c/core/+/109039 for master branch.
Comment 23 Commit Notification 2021-01-10 08:39:17 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ae56dc05b27f05ffcee99845d661a237e70a7a51

tdf#96787: AutoCorrect: after deleting a replacement entry, it's still used

It will be available in 7.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 24 Commit Notification 2021-01-10 12:38:09 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-7-1":

https://git.libreoffice.org/core/commit/8f2f805b07ddb3450b48e32dc7171625d81d8a84

tdf#96787: AutoCorrect: after deleting a replacement entry, it's still used

It will be available in 7.1.0.2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 25 Commit Notification 2021-01-10 12:39:24 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-7-0":

https://git.libreoffice.org/core/commit/6046b52ba01dc7dcde4140973c06bec19dadf2a3

tdf#96787: AutoCorrect: after deleting a replacement entry, it's still used

It will be available in 7.0.5.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 26 Julien Nabet 2021-01-10 13:00:00 UTC
Adolfo pushed cherry-pick commits for the 7.1 and 7.0 branches.
Let's put this one to FIXED then.
Comment 27 Telesto 2021-01-11 12:12:18 UTC
@Julien,
I'm asking before actually trying: does removal work instantly, i.e. without restarting LibreOffice before the change is applied? [Asking because auto-complete did require a restart (at least in the recent past)]
Comment 28 Julien Nabet 2021-01-11 12:56:02 UTC
(In reply to Telesto from comment #27)
> @Julien,
> I'm asking before actually trying: does removal work instantly, i.e. without
> restarting LibreOffice before the change is applied? [Asking because
> auto-complete did require a restart (at least in the recent past)]

Yes it worked without restarting LO (unless I missed something).