Bug 114717 - SPELL: spellchecker does nonsense suggestions for de-DE
Summary: SPELL: spellchecker does nonsense suggestions for de-DE
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: 5.4.4.2 release (earliest affected)
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Dictionaries
Reported: 2017-12-27 19:09 UTC by Rainer Bielefeld Retired
Modified: 2023-04-19 03:24 UTC (History)
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
sample document (170.77 KB, application/vnd.oasis.opendocument.text)
2017-12-27 19:09 UTC, Rainer Bielefeld Retired
Some more test results (67.53 KB, application/vnd.oasis.opendocument.text)
2017-12-29 06:36 UTC, Rainer Bielefeld Retired
2 more nonsense suggestions (first 2 ones) (17.40 KB, image/png)
2018-01-15 12:52 UTC, Rainer Bielefeld Retired

Description Rainer Bielefeld Retired 2017-12-27 19:09:42 UTC
Created attachment 138694 [details]
sample document

Steps to reproduce with Version: 5.4.4.2 (x64)
Build ID: 2524958677847fb3bb44820e40380acbe820f960
CPU threads: 4; OS: Windows 6.1; UI render: default; 
Locale: de-DE (de_DE); Calc: group, my default user profile, Tango theme

1. Open the attached sample document 
2. Press <F7> to start the spellcheck
   » The check stops at "Brennerbetrieb" because the word is unknown, and suggests replacements
   Bug: The suggestion list contains nonsense words such as Brenner-betrieb, Brennerbtrieb,
        Brennerdbetrieb, Brennwerbetrieb, Brennerzbetrieb

a) These nonsense words are not in the dictionaries I tested.
b) I wonder where I can see my installed dict-de_AT-frami_2017-01-12?
c) Strange: I also see this in LibO 4.2 and 3.3, and even in OOo 4.2. 
c1) So might this be some OS "feature" or something like that? 
    That might mean "not a LibO bug"?
d) The bad thing is that all these nonsense words are possible typos that
   spellcheck will not flag.
e) Still REPRODUCIBLE with Version: 6.1.0.0.alpha0+ (x64)
   Build ID: c926a1e34672afaa5b7de0e3b08b1537e88fbb6f CPU threads: 4; 
   OS: Windows 6.1; UI render: default; 
   TinderBox: Win-x86_64@42, Branch:master, Time: 2017-12-24_01:10:03
   Locale: de-DE (de_DE); Calc: CL
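
Each of the five suggestions quoted in the steps above differs from "Brennerbetrieb" by exactly one inserted or deleted character. The following Python sketch only verifies that observation; the function name is made up for illustration, and nothing here is meant to describe how the spell checker actually produced the list.

```python
def is_one_edit_away(source, candidate):
    """True if candidate results from exactly one insertion, deletion,
    or substitution of a single character in source."""
    if source == candidate:
        return False
    ls, lc = len(source), len(candidate)
    if abs(ls - lc) > 1:
        return False
    # Skip the common prefix.
    i = 0
    while i < min(ls, lc) and source[i] == candidate[i]:
        i += 1
    if ls == lc:
        # Substitution: the rest after the differing character must match.
        return source[i + 1:] == candidate[i + 1:]
    if ls < lc:
        # Insertion into source.
        return source[i:] == candidate[i + 1:]
    # Deletion from source.
    return source[i + 1:] == candidate[i:]


reported = [
    "Brenner-betrieb",
    "Brennerbtrieb",
    "Brennerdbetrieb",
    "Brennwerbetrieb",
    "Brennerzbetrieb",
]
all_single_edits = all(is_one_edit_away("Brennerbetrieb", s) for s in reported)
```

In other words, every one of these strings is also what a single-keystroke typo of "Brennerbetrieb" would look like, which is the concern raised in item d).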

I will test on another PC tomorrow
Comment 1 Julien Nabet 2017-12-28 16:13:01 UTC
Andras: thought you might be interested in this one since it concerns dictionaries
Comment 2 Aron Budea 2017-12-28 21:15:59 UTC
The straightforward explanation is that the words can be generated from the dictionary (words + grammar rules). I don't know if that's the case, so I'd suggest first contacting the extension/dictionary owner about these errors.

The shipped dictionaries are in <LO install dir>\share\extensions.
The installed extensions seem to be in <LO install dir>\share\uno_packages\cache\uno_packages, if they are for all users, and in <user profile>\uno_packages\cache\uno_packages if they are for a single user.
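
A minimal Python sketch for searching such a directory tree for a given string is shown below. The function name and the choice of the .dic and .aff suffixes are assumptions for illustration; the root directory has to be supplied by the caller, since the paths above contain installation-specific placeholders. Note that a literal search can only find strings that are stored in the files; words that are generated from stems plus rules will not show up this way.

```python
from pathlib import Path


def find_word_in_dictionaries(root, word, suffixes=(".dic", ".aff")):
    """Return the sorted paths of files under root (with one of the given
    suffixes) whose text contains word, ignoring case."""
    needle = word.casefold()
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            # Dictionary encodings vary, so undecodable bytes are
            # replaced instead of raising an error.
            text = path.read_text(encoding="utf-8", errors="replace")
            if needle in text.casefold():
                hits.append(path)
    return sorted(hits)
```

Usage would look like `find_word_in_dictionaries(some_root, "Brennerdbetrieb")`, where `some_root` is whichever of the directories above applies to the installation being examined.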
Comment 3 Marco A.G.Pinto 2017-12-28 22:13:26 UTC
My suggestion as a dictionary maintainer is that you should try to select one by one the nonsense words from the suggestions.

If none of them is flagged as a typo, it is because they are in the speller.

The hyphenated one means that the speller has A and B. Ex: A-B

Hunspell accepts A-B if A and B exist.
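
The hyphen rule described above can be sketched as a toy check in Python. This is a simplified illustration under stated assumptions, not Hunspell's implementation: the function name and the two lexicon entries are hypothetical.

```python
def accepts(word, lexicon):
    """Toy acceptance check: a word passes if it is in the lexicon, or if
    it is hyphenated and every hyphen-separated part is in the lexicon."""
    if word in lexicon:
        return True
    parts = word.split("-")
    return len(parts) > 1 and all(part in lexicon for part in parts)


# Hypothetical lexicon entries, chosen only to mirror the example in this report.
lexicon = {"Brenner", "betrieb"}

hyphenated_ok = accepts("Brenner-betrieb", lexicon)  # both parts are known
joined_ok = accepts("Brennerbetrieb", lexicon)       # not listed as a whole word
```

Under this toy rule the hyphenated form is accepted while the joined form is not, which matches the pattern reported for "Brenner-betrieb".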
Comment 4 Rainer Bielefeld Retired 2017-12-29 06:36:27 UTC
Created attachment 138728 [details]
Some more test results

I did a more detailed search in the directories; for details please see the attached "dictionarysearch.odt". The nonsense word "Brennerdbetrieb" has not been found in any of the directories.
Comment 5 Rainer Bielefeld Retired 2017-12-29 08:31:06 UTC
Before all this philosophizing about how spellcheck might work, someone should try to reproduce the reported issue.

f) I did an additional test concerning reliability of Agent Ransack: 
   "brenner1ein1stellung" has been found in 3 _de hyphenation dictionaries
Comment 6 Aron Budea 2018-01-01 20:10:24 UTC
I get the same suggestions as you, and still advise that you contact the dictionary/extension maintainer about them.
Comment 7 Rainer Bielefeld Retired 2018-01-15 12:52:09 UTC
Created attachment 139109 [details]
2 more nonsense suggestions (first 2 ones)

(In reply to Aron Budea from comment #6)
If I ever decide to create an office suite with German spell check, I will think about your suggestion.
Comment 8 László Németh 2018-01-15 15:58:17 UTC
The word "betrieb" has no rule for compound word generation, but the other words (Erbe + trieb etc.) do.

The short-term solution is to extend the dictionary: add the word "Brennerbetrieb", or a "compound-friendly" "betrieb".

The planned long-term solution for the strange suggestions is smarter limiting and checking of generated compound words, based on real-world occurrence.

Unfortunately, in German, Hunspell (spell checker of LibreOffice) permits more strange suggestions, despite the recent optimizations here (https://github.com/hunspell/hunspell/commit/90cb55f8f1a21c7f62539baf8f3cf6f062080afd, https://github.com/hunspell/hunspell/commit/4e4106fc64bc26df10f8dc24e0e578abb70025c7).
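
The compound behavior described above can be illustrated with a toy model in Python. This is a sketch under stated assumptions and not Hunspell's algorithm: the lexicon, its boolean "may be used in compounds" flag, and the function name are all hypothetical, and capitalization is ignored.

```python
def is_accepted(word, lexicon):
    """lexicon maps a lowercase stem to True if the stem may take part in
    compounds. A word is accepted if it is listed as a whole, or if it can
    be split entirely into compound-friendly stems."""
    w = word.lower()
    if w in lexicon:
        return True
    n = len(w)
    # reachable[i] is True when w[:i] splits into compound-friendly stems.
    reachable = [True] + [False] * n
    for end in range(1, n + 1):
        for start in range(end):
            if reachable[start] and lexicon.get(w[start:end], False):
                reachable[end] = True
                break
    return reachable[n]


# Hypothetical entries: "betrieb" is known, but not flagged for compounding.
lexicon = {"brenner": True, "betrieb": False, "erbe": True, "trieb": True}

real_word = is_accepted("Brennerbetrieb", lexicon)   # rejected in this model
odd_compound = is_accepted("Brennertrieb", lexicon)  # accepted in this model

# Modelling the suggested fix: make "betrieb" compound-friendly.
fixed = dict(lexicon, betrieb=True)
real_word_fixed = is_accepted("Brennerbetrieb", fixed)
```

In this model a legitimate compound is rejected because one stem lacks the compound flag, while an odd compound of flagged stems passes; flagging "betrieb" makes the legitimate word acceptable.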

Thanks for your bug report!
Comment 9 Xisco Faulí 2018-04-17 11:22:31 UTC
Dear László Németh,
This bug has been in ASSIGNED status for more than 3 months without any
activity. Resetting it to NEW.
Please assign it back to yourself if you're still working on this.
Comment 10 QA Administrators 2019-04-18 03:03:17 UTC Comment hidden (obsolete)
Comment 11 QA Administrators 2021-04-18 03:48:18 UTC Comment hidden (obsolete)
Comment 12 QA Administrators 2023-04-19 03:24:09 UTC
Dear Rainer Bielefeld Retired,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install the oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug