Bug 73223 - [SPELLCHECKER] Wrong suggestion of compound terms when in lower letters
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: 4.3.0.0.alpha0+ Master (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: László Németh
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Spell-Checking
Reported: 2014-01-02 13:33 UTC by Olivier Hallot
Modified: 2022-09-05 09:53 UTC

See Also:
Crash report or crash signature:


Description Olivier Hallot 2014-01-02 13:33:38 UTC
Users who don't know about the hyphenation tool quite often hyphenate words manually. As a result, the spelling checker correctly flags the wrong hyphenation, treating the word as a false compound.

The problem is that the suggested corrections include several compounds that do not exist. This flaw only occurs when the terms are written in lowercase letters; with capital letters, the suggestions are correct. This presumably happens in other languages too. Someone (László?) will have to download VERO to check this flaw, as I can't check in other languages.

Examples:
 
en-contradas
subse-quente
convo-cado
nú-mero
ofi-cial
contra-ditório
as-sinatura
ce-lebração
an-tecipados
pe-nalidades
di-reito


EN-CONTRADAS
SUBSE-QUENTE
CONVO-CADO
NÚ-MERO
OFI-CIAL
CONTRA-DITÓRIO
AS-SINATURA
CE-LEBRAÇÃO
AN-TECIPADOS
PE-NALIDADES
DI-REITO

To download VERO as an extension (pt-BR spell checker), please go to the VERO extension page in the pt-BR portal:

http://tinyurl.com/6msbv3j
Comment 1 A (Andy) 2014-11-29 10:00:41 UTC
Reproducible with LO 4.3.4.1, English and German, but if I write the words in capitals I get no correction suggestions.


Steps done:

1. Open WRITER and ensure that the page format is A4 with all margins = 2.00 cm (FORMAT -> PAGE -> tab PAGE) and the selected font is "Liberation Serif" with font size = 12

2. Write an English and a German sample text; I used the following:
This is a large house with a very nice garden around. In the garden there are trees which are beautiful.
This is a large house with a very nice garden around. In the garden there are trees which are BEAUTIFUL.
Das ist ein Haus mit einem sehr schönen Garten. In dem Garten finden sich Bäume, die wunderschön sind.
Das ist ein Haus mit einem sehr schönen Garten. In dem Garten finden sich Bäume, die WUNDERSCHÖN sind.

3. Select the first two paragraphs and go to TOOLS -> LANGUAGE -> FOR SELECTION -> English (UK) or (USA)

4. Select the last two paragraphs and go to TOOLS -> LANGUAGE -> FOR SELECTION -> German (GERMANY)

5. Insert a minus (hyphen-minus) to hyphenate the words manually: "beautiful" -> "beauti-ful", "BEAUTIFUL" -> "BEAU-TIFUL", "wunderschön" -> "wun-derschön", "WUNDERSCHÖN" -> "WUN-DERSCHÖN"

Interim Results:
LO marks "beauti-ful" and "wun-derschön" as wrong (they are underlined)

6. Go to TOOLS -> SPELLING AND GRAMMAR


Result:
LO suggests the following non-existing "corrections":
beauti-ful: beaut-ful, beauts-ful, beauty-ful, beau-ful, beatitude-ful, beating-ful
BEAU-TIFUL: no correction
wun-derschön: wund-derschön, wen-derschön, nun-derschön, tun-derschön, Sun-derschön, Run-derschön, Tun-derschön
WUN-DERSCHÖN: no correction

After choosing one of these suggestions, LO marks it as wrong too. -> Bug

LO marks "wun-derschön" as wrong, but not "wunder-schön". Of course, both are correct hyphenations in this case. I suppose that when a minus is added to hyphenate a word manually, LO doesn't recognize it as a hyphenation but treats the result as a compound word and checks each part on its own.

In German: "derschön" does not exist as a word and is therefore also not possible as a compound word. -> Bug
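
The behaviour guessed at above can be sketched in a few lines of Python. This only illustrates the hypothesis and is not LibreOffice's or Hunspell's actual code: the word list TOY_WORDS, the helper names, and the use of difflib.get_close_matches as a stand-in for the real suggestion engine are all assumptions made for the example.

```python
import difflib

# Hypothetical toy word list; a real dictionary is far larger.
TOY_WORDS = {"wunder", "schön", "wunderschön", "wund", "nun", "tun"}


def check_hyphenated(word, words=TOY_WORDS):
    """Treat every hyphen as a compound boundary and check each part alone."""
    return all(part.lower() in words for part in word.split("-"))


def suggest_per_part(word, words=TOY_WORDS):
    """Replace each unknown part with near matches, keeping the other parts."""
    parts = word.split("-")
    suggestions = []
    for i, part in enumerate(parts):
        if part.lower() in words:
            continue  # this part is fine on its own
        # Assumed stand-in for the real suggestion engine.
        matches = difflib.get_close_matches(
            part.lower(), sorted(words), n=3, cutoff=0.5
        )
        for candidate in matches:
            suggestions.append("-".join(parts[:i] + [candidate] + parts[i + 1:]))
    return suggestions
```

With this toy list, check_hyphenated("wunder-schön") passes while check_hyphenated("wun-derschön") fails. Because both "wun" and "derschön" are unknown, every entry returned by suggest_per_part("wun-derschön") repairs only one part and is therefore still flagged, which matches the symptom described in this comment.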


-> Enhancement Request:

I would like to suggest that manual hyphenation with a minus should also be possible. LO should then check whether the word with the minus is at a line break; if so, LO should check it as one word (and offer correction suggestions for the whole word). If the word with the minus is not at a line break, LO should check it as a compound and check each part separately.
But I am not sure; maybe such an enhancement also has drawbacks?
Alternatively, maybe a shortcut and/or a toolbar icon could be added to insert a manual hyphenation.
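
A minimal sketch of the rule proposed in this request, in Python. The function name, the at_line_break flag (assumed to be supplied by the layout engine), and the plain set used as a dictionary are hypothetical and only serve the illustration; nothing here reflects how LibreOffice is actually structured.

```python
def check_with_linebreak_rule(word, at_line_break, words):
    """Sketch of the proposed rule for a word containing a manual minus.

    word          -- the token as typed, e.g. "beauti-ful"
    at_line_break -- assumed flag: True if the minus sits at the end of a line
    words         -- hypothetical dictionary, a set of lowercase words
    """
    if "-" not in word:
        return word.lower() in words
    if at_line_break:
        # Proposed: a minus at a line break is a manual hyphenation,
        # so the word is checked as one unit with the minus removed.
        return word.replace("-", "").lower() in words
    # Otherwise the minus is a compound separator: check each part separately.
    return all(part.lower() in words for part in word.split("-"))
```

With words = {"beautiful", "well", "known"}, "beauti-ful" is accepted at a line break and rejected elsewhere. One possible drawback becomes visible as well: a genuine compound such as "well-known" that happens to break at its hyphen would be checked as "wellknown" and rejected, so the rule would probably need a fallback to the per-part check.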
Comment 2 QA Administrators 2015-12-20 16:13:03 UTC Comment hidden (obsolete)
Comment 3 A (Andy) 2015-12-26 22:39:55 UTC
Reproducible with LO 5.1.0.1, Win 8.1

For BEAU-TIFUL and WUN-DERSCHÖN, the correct spelling is now offered as one of the suggested alternatives.
Comment 4 QA Administrators 2017-01-03 19:57:58 UTC Comment hidden (obsolete)
Comment 5 Thomas Lendo 2018-10-10 13:14:06 UTC
Still reproducible.

Version: 6.0.6.2 (x64)
Build-ID: 0c292870b25a325b5ed35f6b45599d2ea4458e77
CPU-Threads: 8; BS: Windows 10.0; UI-Render: Standard; 
Gebietsschema: de-AT (de_AT); Calc: group
Comment 6 QA Administrators 2019-10-11 02:35:19 UTC Comment hidden (obsolete)
Comment 7 Telesto 2020-09-04 22:05:44 UTC
Tested wunderschön; working pretty ok
Version: 7.0.0.2
Build ID: c01aa64b6c3d89ebe5fe69c28c7adb24eb85249c
CPU threads: 4; OS: Mac OS X 10.12.6; UI render: default; VCL: osx
Locale: nl-NL (nl_NL.UTF-8); UI: en-US
Calc: threaded
Comment 8 QA Administrators 2022-09-05 03:37:56 UTC Comment hidden (obsolete)
Comment 9 Olivier Hallot 2022-09-05 09:53:02 UTC
The issue now occurs only when the hyphen follows an accented letter.

Of the words above, only nú-mero generates a list of nonexistent words, but it also offers número as a good option.
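
To see which of the original lowercase examples fall under this condition, a short Python check can look at the letter just before the hyphen. The helper names are hypothetical, and "accented" is assumed here to mean that the character decomposes into a base letter plus a combining mark under Unicode NFD.

```python
import unicodedata

# The lowercase examples from the description of this bug.
EXAMPLES = [
    "en-contradas", "subse-quente", "convo-cado", "nú-mero", "ofi-cial",
    "contra-ditório", "as-sinatura", "ce-lebração", "an-tecipados",
    "pe-nalidades", "di-reito",
]


def has_accent(ch):
    """True if ch carries a combining mark after NFD decomposition."""
    return any(unicodedata.combining(c) for c in unicodedata.normalize("NFD", ch))


def accent_before_hyphen(word):
    """True if the character immediately before the first hyphen is accented."""
    pos = word.find("-")
    return pos > 0 and has_accent(word[pos - 1])


affected = [w for w in EXAMPLES if accent_before_hyphen(w)]
```

For the list above, affected contains only "nú-mero", which agrees with the observation in this comment.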

For now, I consider this issue WORKSFORME.

Version: 7.3.5.2 / LibreOffice Community
Build ID: 184fe81b8c8c30d8b5082578aee2fed2ea847c01
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: kf5 (cairo+xcb)
Locale: pt-BR (pt_BR.UTF-8); UI: pt-BR
Calc: threaded