Bug 77508 - unwanted 8 --> 8° and 11 --> 11° autocorrection from the acor_und.dat file
Status: RESOLVED MOVED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version (earliest affected): 4.2.0.4 release
Hardware: Other Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-16 04:29 UTC by tommy27
Modified: 2015-02-11 12:45 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
autocorrect database (1.09 MB, application/zip)
2014-04-16 04:29 UTC, tommy27
autocorrect options screenshot (67.09 KB, image/png)
2014-04-16 19:58 UTC, tommy27
Crash on entry and autocorrect of numeral 8 (15.59 KB, image/png)
2014-04-17 13:19 UTC, V Stuart Foote

Description tommy27 2014-04-16 04:29:53 UTC
Created attachment 97442
autocorrect database

tested under Win7x64 and WinXPx32
......................................
STEPS TO REPRODUCE
1- download attached acor_und.dat file
2- place it in the autocorr subfolder of the user profile
3- open Writer and type 11
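
The first two steps can be scripted. This is only a minimal sketch: `install_autocorrect` is a hypothetical helper name, and the default Windows profile path used in the example (`%APPDATA%\LibreOffice\4\user`) is an assumption not stated in this report, so adjust it to your installation.

```python
import os
import shutil
from pathlib import Path


def install_autocorrect(dat_file, profile_dir):
    """Copy an autocorrect .dat file into <profile_dir>/autocorr.

    profile_dir is the LibreOffice user profile folder (the one that
    contains 'autocorr'); it is a parameter because its location varies.
    """
    target_dir = Path(profile_dir) / "autocorr"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(dat_file).name
    # Keep a backup of any existing file so the test can be undone.
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(dat_file, target)
    return target


if __name__ == "__main__":
    # Assumed default profile path on Windows; adjust for your install.
    appdata = os.environ.get("APPDATA")
    if appdata and Path("acor_und.dat").exists():
        profile = Path(appdata) / "LibreOffice" / "4" / "user"
        print(install_autocorrect("acor_und.dat", profile))
```

Close LibreOffice before copying, then start Writer and type 11 as in step 3.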

......................................
DESCRIPTION

after upgrading to the 4.2.x branch and migrating my user profile autocorrection databases, I've discovered weird and unwanted automatic text corrections involving the number 11.

if I type 11 it gets autocorrected to 11°
other unwanted autocorrections are:

111  --> 11°1
1111 --> 11°11
11)  --> 11°)
11a  --> 11°A
11-  --> 11°-
11/  --> 11°
11.  --> 11°.

moreover, once you get this autocorrection, if you go back with the cursor and hit the space key, another ° is automatically added...  I mean 11° --> 11°°

interestingly, if the "11" is preceded by a letter or by a digit other than "1", you will not get any unwanted autocorrection... I mean 211 and A11 are not corrected to 211° or A11°, while 111 and 11A become 11°1 and 11°A

this strange issue is related to some conflict inside the attached acor_und.dat file, which stores my autocorrection replacement list.

put it in your autocorr subfolder in the user profile to test the behaviour.

the strange thing is that the .dat file does not contain any of the autocorrection entries that LibO applies while typing.

If you inspect the replacement table with "Tools/Autocorrect Options" you see that no 11 --> 11° entry exists.

moreover the same acor_und.dat file used in 4.1.5 and earlier versions never triggered those weird autocorrections.

this autocorrect anomaly is a regression: it first appears in 4.2.0.4 and persists in 4.2.3.3 and recent 4.3 alpha releases.
Comment 1 tommy27 2014-04-16 16:25:48 UTC
just discovered that the same issue happens with number 8 as well...

8  --> 8°
88 --> 8°8
8)  --> 8°)
8A  --> 8°A
etc. etc.

interestingly, no issue with 011 and 08
whilst 110 and 80 get autocorrected to 11°0 and 8°0

no issue with other numbers. very weird bug

updated summary notes
Comment 2 tommy27 2014-04-16 19:58:28 UTC
Created attachment 97481
autocorrect options screenshot

took a screenshot showing my current autocorrect options configuration.
maybe it can help debugging.

as said before, the weird thing is that I get unwanted automatic corrections despite not having such autocorrect entries in the acor_und.dat file

I have no clue what triggers this bug.
Comment 3 tommy27 2014-04-17 11:44:20 UTC
issue confirmed by another Windows user in the Italian OpenOffice newsgroup
see discussion (in Italian) here: http://snipurl.com/28tj3ft

Vitriol, who is on the CC list, can confirm it.

set status to NEW
Comment 4 tommy27 2014-04-17 11:48:46 UTC
just to add that this autocorrect issue affects Writer, Calc and Impress as well.
Comment 5 V Stuart Foote 2014-04-17 13:19:49 UTC
Created attachment 97516
Crash on entry and autocorrect of numeral 8

Not getting exactly the same errors.

On Windows 7 sp1, 64-bit en-US
Version: 4.3.0.0.alpha0+
Build ID: 087a79db1272858f107656c5ca3c6efb45680986
TinderBox: Win-x86@39, Branch:master, Time: 2014-04-16_01:43:37

Have loaded the custom autocorrect table as provided and adjusted the Tools -> Autocorrect options -> Options "Use replacement table"--the M (while Modifying) box is unchecked, the T box (while Typing) is set.

In writer entry of initial character followed by a return is slow with autocorrect while Typing is set. With it off, the return is immediate.

Entry of the number 1 does not result in replacement with 1 and degree symbol, so can not confirm that.

However, the number 8 is causing issues!  Entering two or more 8's is causing a crash of this en-US TB-39 build, it pops up with the attached error.

Not seeing any issue in calc.
Comment 6 tommy27 2014-04-17 13:32:27 UTC
(In reply to comment #5)
> Created attachment 97516
> Crash on entry and autocorrect of numeral 8
> 
> ....
>
> In writer entry of initial character followed by a return is slow with
> autocorrect while Typing is set. With it off, the return is immediate.
> 

this may be due to the large number of autocorrect entries stored in that file, which slows down Writer.

it's an already known issue, see Bug 55570, which was partially fixed by Michael Meeks (the slowdown was even worse in previous LibO releases)

> Entry of the number 1 does not result in replacement with 1 and degree
> symbol, so can not confirm that.

the initial report is about number eleven "11"
I have no issue with number one "1" alone

> However, the number 8 is causing issues!  Entering two or more 8's is
> causing a crash of this en-US TB-39 build, it pops up with the attached
> error.

I see no crash with my 4.3alpha which is slightly older than the one you tried.

> Not seeing any issue in calc.

maybe you could try 4.2.3.3 where I see the exact bug in all LibO applications.
Comment 7 V Stuart Foote 2014-04-17 13:43:51 UTC
So I can confirm that with the custom autocorrect table, if in Writer I enter two "1"s in succession followed by a space, autocorrect replaces them with 11°, and this repeats with every double entry of a "1" followed by a space.

Entering a single "8" followed by a space results in an 8°, and if more than three "8"s are strung together the edit engine crashes (editenglo.dll). Also, entering two "8"s followed by return will crash.

For this TB-39 build of master, entry of text in a calc cell or in the formula bar is not affected.

Entry of "1"s and "8"s in Impress is affected the same way as in Writer, including the crash in editenglo.dll
Comment 8 V Stuart Foote 2014-04-17 13:54:22 UTC
On Windows 7 sp1, 64-bit en-US
Version: 4.2.3.3
Build ID: 6c3586f855673fa6a1576797f575b31ac6fa0ba3

With the autocorrect database installed to user profile, I am seeing the same corrections as in OP and 2nd comment.  Including presence in calc.

With this release, no crash with number 8.
Comment 9 V Stuart Foote 2014-04-17 13:59:37 UTC
(In reply to comment #7)
> 
> For this TB-39 build of master, entry of text in a calc cell or in the
> formula bar is not affected.

Version: 4.3.0.0.alpha0+
Build ID: 087a79db1272858f107656c5ca3c6efb45680986
TinderBox: Win-x86@39, Branch:master, Time: 2014-04-16_01:43:37

Actually, if I enter 11/ 11) or 8) 8/ in a calc cell or the formula bar it does crash the edit engine (editenglo.dll). So calc autocorrect is slightly altered but is also affected in current master.
Comment 10 tommy27 2014-04-17 14:03:34 UTC
@Stuart
thanks for your tests.

this shows that the situation is even worse in 4.3 master, where you see a crash, while 4.2.x makes unwanted corrections without crashing.

probably some changes made to the autocorrect engine in 4.2.x and 4.3.x are causing these issues, since the same .dat file works fine in 4.1.5 and earlier releases.

I am adding a Writer expert to the CC list, maybe he knows where to look in the code.
the issue affects 4.2.0 and following releases and was probably introduced in early 4.2.x development.
Comment 11 tommy27 2014-04-17 18:01:10 UTC
as an additional note, renaming the acor_und.dat file for any other language, e.g. to acor_it-IT.dat or acor_en-US.dat, doesn't change the behaviour of the bug.

I mean, if you rename it to Italian and you write 8 or 11 in an Italian-language document, the unwanted 8° and 11° autocorrections get triggered as well.

the key is probably inside that file.
AFAIK that .dat file is a zip containing .xml files and the autocorrect table is inside the DocumentList.xml file.

I suspect that in some way LibO 4.2.x reads that .xml differently from 4.1.x and "sees" autocorrect entries that do not exist...
maybe some encoding issue?
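
One way to check what the file really contains, independently of the Autocorrect Options dialog, is to read it directly. This is a minimal sketch that assumes the layout described above (a zip archive with a DocumentList.xml member) and further assumes that each entry is an element carrying `abbreviated-name` and `name` attributes; verify this against your own file. The function names are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET


def local(name):
    """Strip an '{namespace}' prefix from an ElementTree name."""
    return name.rsplit("}", 1)[-1]


def read_replacements(dat_file):
    """Return (abbreviation, replacement) pairs from DocumentList.xml."""
    with zipfile.ZipFile(dat_file) as archive:
        root = ET.fromstring(archive.read("DocumentList.xml"))
    pairs = []
    for element in root.iter():
        attrs = {local(k): v for k, v in element.attrib.items()}
        # Assumed attribute names; check them against your own file.
        if "abbreviated-name" in attrs and "name" in attrs:
            pairs.append((attrs["abbreviated-name"], attrs["name"]))
    return pairs


def find_entries(pairs, needles):
    """Keep pairs whose abbreviation or replacement contains any needle."""
    return [
        (abbr, repl)
        for abbr, repl in pairs
        if any(n in abbr or n in repl for n in needles)
    ]
```

For example, `find_entries(read_replacements("acor_und.dat"), ["8", "11", "°"])` would list every entry that mentions the affected numbers or the degree sign.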
Comment 12 tommy27 2014-04-17 20:49:42 UTC
GOTCHA!!!

the "key" is the asterisk "*", which LibO 4.2.x interprets as a wildcard instead of as a regular typed character, as 4.1.x did

I have created a clean follow-up bug report at Bug 77593
let's continue the discussion over there

regarding the 4.3.x crash reported by Stuart, let's keep it on hold...
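
To illustrate why an entry containing "*" can fire on input that never appears literally in the replacement table, here is a toy sketch. It is not LibreOffice's matching code: the entry `8*` --> `8°` is made up for illustration and is not taken from the attached file, and Python's `fnmatch` merely stands in for whatever wildcard rules 4.2.x actually applies.

```python
from fnmatch import fnmatchcase

# Hypothetical replacement table; not taken from the attached file.
TABLE = {"8*": "8°"}


def literal_lookup(word, table=TABLE):
    """Treat every key as plain text: '*' is just a character."""
    return table.get(word)


def wildcard_lookup(word, table=TABLE):
    """Treat '*' in a key as 'any run of characters' (possibly empty)."""
    for pattern, replacement in table.items():
        if fnmatchcase(word, pattern):
            return replacement
    return None
```

With this toy table, `literal_lookup("8")` finds nothing because only the text "8*" is a key, while `wildcard_lookup("8")` returns "8°" because the pattern also matches a bare 8.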
Comment 13 tommy27 2014-04-17 22:04:57 UTC
I've just figured out that the asterisk wildcard thing "was not a bug but a feature"  :-)   see Bug 68373

anyway I just wanted to add that I do not see the crash Stuart experienced when using a brand new master build

Version: 4.3.0.0.alpha0+
Build ID: 4211ce7c62677c65dfbbb3602be6c36fd0f98977
TinderBox: Win-x86@42, Branch:master, Time: 2014-04-17_07:49:30

@Stuart
do you still see crashes with master? 
did you try another tinderbox?