Bug 143490 - COVID-119 & Covid-119 should fail spell-check
Status: RESOLVED NOTOURBUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 7.1.4.2 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Dictionaries
Reported: 2021-07-21 22:44 UTC by Nick Levinson
Modified: 2021-08-01 05:53 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
COVID entries in GB dictionary (90.99 KB, image/png)
2021-07-24 04:27 UTC, Marco A.G.Pinto

Description Nick Levinson 2021-07-21 22:44:20 UTC
Description:
Adding Covid-19 or COVID-19 to the standard user dictionary makes no difference; they still get marked as partial misspellings.

Steps to Reproduce:
1. In a Writer document, write the following:
Covid-19
COVID-19

2. You may need to press the Enter key for the spell-check to work on the last entry.

3. Try to make what you wrote pass spell-checking.

Actual Results:
You can't. Writer treats Covid or COVID as its own string. Even if you add Covid-19 and COVID-19 to the user dictionary manually (Spelling dialog > Options > standard > Edit), Writer still considers Covid inside Covid-19 and COVID inside COVID-19 to be misspelled. Adding Covid and COVID to the dictionary is not a fix because then Writer wouldn't catch Covid-18 and COVID-9 as errors.

Expected Results:
Have Covid-19 and COVID-19 pass the spell-check.


Reproducible: Always


User Profile Reset: No



Additional Info:
My guess is that this is about an alpha string and a number string in either order and hyphen-separated. If that's the problem, the summary of this report probably should be edited.

Probably not a regression. I think I've had this problem since I first wrote about the virus. I don't remember pre-viral strings with the same characteristics.

Platform: Fedora 34 Linux, kept evergreen. A clean F34 installation was performed after the problem began, so Safe Mode need not be tested.

I don't think I have OpenGL.

From About LO:

Version: 7.1.4.2
Build ID: 10(Build:2)
CPU threads: 2; OS: Linux 5.12; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

That it's not a regression is supported by the same problem being found in an earlier LO Writer. From the earlier LO's About, on Ubuntu 20.04.2 LTS Linux, kept evergreen:
Version: 6.4.7.2
Build ID: 1:6.4.7-0ubuntu0.20.04.1
CPU threads: 2; OS: Linux 5.8; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded
Comment 1 Aron Budea 2021-07-21 23:20:11 UTC
The US English dictionary now contains COVID as an entry, which means COVID-19 will be accepted as well. This update arrived after LO 7.1, and will be included with the 7.2 release.

Otherwise, LO dictionaries are maintained separately, issues in the US English word list can be raised here:
https://github.com/en-wl/wordlist/issues
Comment 2 Nick Levinson 2021-07-24 02:17:57 UTC
That's half a solution. The string COVID-119 should fail but, per your explanation, won't. Even if a coronavirus called COVID-119 is in the world's lexicon, a rarely-used word should not be in the LO dictionary because the string in an average Writer document would more likely be a misspelling of something else and therefore should be treated as a misspelling. The same principle applies to COVID-18 and COVID-9, which I suppose exist but those strings are more likely to be errors. As to case, I can add Covid to an LO dictionary but LO won't let Covid-119 be shown as wrong.

This looks like a generic problem about an alpha string and a number string in either order and separated by a hyphen. If so, it is not just a matter of adding one of the strings to a dictionary.

Thus, I'm reopening.
Comment 3 Marco A.G.Pinto 2021-07-24 04:27:05 UTC
Created attachment 173817
COVID entries in GB dictionary

Hello!

The COVID words have been in the British dictionary since last year or so.

If COVID-119 doesn't trigger an error, it is because Hunspell, for hyphenated words, simply validates each part separately, so:
COVID is valid
then it has a hyphen
and 119 is valid.

So, both words are valid and Hunspell doesn't trigger an error.
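
The behaviour described above can be modelled roughly as follows. This is only a toy sketch in Python, not Hunspell's actual code: the WORDS set, part_ok and token_ok are hypothetical names, and the rule that any purely numeric part is accepted is an assumption taken from this comment.

```python
# Toy model of per-part checking of hyphenated tokens.
# Not Hunspell's real implementation; all names here are hypothetical.
WORDS = {"COVID", "Covid"}  # stand-in for the dictionary entries


def part_ok(part):
    # Assumption: a purely numeric part is always accepted.
    return part.isdigit() or part in WORDS


def token_ok(token):
    # Split on hyphens and require every part to be valid on its own.
    return all(part_ok(p) for p in token.split("-"))


for token in ("COVID-19", "COVID-119", "Covad-19"):
    print(token, token_ok(token))
```

Under these assumptions COVID-19 and COVID-119 are treated identically, because the number is never compared against anything.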

The way of fixing this would be to implement a rule in the grammar checker LanguageTool or Lightproof that would check for the number after it.
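
As a rough illustration of what such a rule would have to do, here is a sketch in Python. It is not LanguageTool or Lightproof rule syntax; the pattern, the function name suspicious_covid_numbers, and the assumption that 19 is the only acceptable number are all hypothetical.

```python
import re

# Hypothetical pattern: "covid" in any letter case, a hyphen, then digits.
PATTERN = re.compile(r"\b(covid)-(\d+)\b", re.IGNORECASE)


def suspicious_covid_numbers(text):
    # Return every match whose number is not 19 (assumed to be the only valid one).
    return [m.group(0) for m in PATTERN.finditer(text) if m.group(2) != "19"]


print(suspicious_covid_numbers("COVID-19, COVID-119 and Covid-18"))
```

Unlike the per-part dictionary check, a rule of this kind looks at the word and the number together, which is what catching COVID-119 requires.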
Comment 4 Nick Levinson 2021-07-26 14:42:12 UTC
Help says the grammar checker is an extension, so I'm closing this. Thanks.
Comment 5 Shantanu 2021-07-31 04:32:28 UTC
Or you can add an entry in auto-correct:

COVID-119 >> COVID-19
Comment 6 Aron Budea 2021-08-01 05:53:59 UTC
(In reply to Shantanu from comment #5)
> Or you can add an entry in auto-correct:
The limitation of autocorrect is that it can only handle specific replacements. You could have one for Covid-119, but what about Covid-199 or Covid-18, as examples of different typos?
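
To make the limitation concrete, here is a toy model of an exact-match replacement table in Python. It is not how LibreOffice implements autocorrect; the AUTOCORRECT table and the autocorrect function are hypothetical and hold only the single entry suggested in comment 5.

```python
# Toy model of an exact-match replacement table (hypothetical names).
AUTOCORRECT = {"COVID-119": "COVID-19"}  # the one entry from comment 5


def autocorrect(word):
    # Replace the word only if it matches a table entry exactly.
    return AUTOCORRECT.get(word, word)


for word in ("COVID-119", "Covid-199", "Covid-18"):
    print(word, "->", autocorrect(word))
```

Only the first string is changed; every other wrong number would need its own entry.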