Bug 164062 - Spellcheck: Words ending in hyphen are not recognized
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Jonathan Clark
URL:
Whiteboard: target:25.2.0 target:24.8.5
Keywords:
Depends on:
Blocks: Spell-Checking
Reported: 2024-11-26 21:14 UTC by Lars Jødal
Modified: 2024-12-02 17:25 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Sample Writer file with words ending in hyphen. (15.64 KB, application/vnd.oasis.opendocument.text)
2024-11-26 21:16 UTC, Lars Jødal

Description Lars Jødal 2024-11-26 21:14:59 UTC
Description:
At least in Danish, there are some phrases where a word will only be valid if ending in a hyphen. As an example, the English phrase "child and youth hospital" will in Danish be "børne- og ungehospital", where "børne-" is valid, but "børne" without a hyphen is not valid. The Danish dictionary takes this into account (checked with Hunspell), but LO does not recognize this construction.

Guess: LO likely removes a trailing hyphen before passing the word to Hunspell. At least, that would be consistent with the observed behaviour, as Hunspell would only see "børne", even when "børne-" was written.
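
The guess above can be illustrated with a minimal sketch. The word list and both function names are hypothetical stand-ins, not LibreOffice or Hunspell code; the sketch only shows why stripping a trailing hyphen before the lookup would reject "børne-".

```python
# Toy stand-in for the Danish dictionary (assumption for illustration):
# "børne-" is valid only together with its trailing hyphen.
TOY_DICTIONARY = {"børne-", "og", "ungehospital"}


def check_as_written(word: str) -> bool:
    """Look the word up exactly as it was typed."""
    return word in TOY_DICTIONARY


def check_after_stripping(word: str) -> bool:
    """Hypothesized behaviour: drop trailing hyphens before the lookup."""
    return word.rstrip("-") in TOY_DICTIONARY
```

With this toy list, `check_as_written("børne-")` accepts the word, while `check_after_stripping("børne-")` looks up "børne" and rejects it, which would match the reported symptom.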

Steps to Reproduce:
1. Open the example file, which includes a description in English and examples in Danish.
2. If the Danish dictionary is not installed, install it from the LO installation or from the extensions repository.
3. See which words are reported as misspelled.

Actual Results:
Words like "børne-" and "arbejds-" are reported as misspelled, even though they are correct. The control words without a hyphen, "børne" and "arbejds", are indeed wrong. (If no words are reported as wrong, you do not have the Danish dictionary installed.)

Expected Results:
The words "børne-" and "arbejds-" should be considered correct.


Reproducible: Always


User Profile Reset: No

Additional Info:
This is old behaviour, present at least as early as LO 3.6, and probably inherited from OOo. The use of hyphens in this way comes from compound words with "fugeelements" (linking elements). Similar examples can likely be found for German.
Comment 1 Lars Jødal 2024-11-26 21:16:00 UTC
Created attachment 197812
Sample Writer file with words ending in hyphen.
Comment 2 Jeppe Bundsgaard 2024-11-27 14:17:00 UTC
I have opened the document, and I can confirm the error.
Hunspell considers børne- correct:
> $ hunspell -d Downloads/da_DK 
> Hunspell 1.7.2
> børne-
> *
Comment 3 Commit Notification 2024-11-29 21:09:34 UTC
Jonathan Clark committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/f23c1baa4646957ad8a7060376638935a5e87889

tdf#164062 i18npool: Added da_DK as a prepostdash language

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
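
The commit subject suggests that word-boundary detection keeps leading and trailing dashes as part of the word for a configured set of languages, and that da_DK was added to that set. A rough sketch of that idea follows. The set name, the function name, and the tokenizing logic are assumptions for illustration and do not reflect the actual i18npool code; only the da_DK entry is taken from the commit subject.

```python
import re

# Hypothetical set of locales whose words may begin or end with a dash.
# Only "da_DK" comes from the commit subject; the rest is illustrative.
PREPOSTDASH_LOCALES = {"da_DK"}


def split_words(text: str, locale: str) -> list[str]:
    """Split text into words, keeping edge dashes only for listed locales."""
    # Runs of word characters and hyphens; \w is Unicode-aware for str.
    tokens = re.findall(r"[\w-]+", text)
    if locale in PREPOSTDASH_LOCALES:
        words = tokens
    else:
        # Other locales: treat leading and trailing dashes as punctuation.
        words = [token.strip("-") for token in tokens]
    # Drop tokens that consisted of dashes only.
    return [word for word in words if word.strip("-")]
```

Under these assumptions, `split_words("børne- og ungehospital", "da_DK")` hands "børne-" to the spell checker intact, whereas a locale outside the set would produce "børne".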
Comment 4 Commit Notification 2024-12-02 10:28:41 UTC
Jonathan Clark committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/core/commit/e070cba008f593adb6a51b52855563a6d6a49761

tdf#164062 i18npool: Added da_DK as a prepostdash language

It will be available in 24.8.5.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 5 Lars Jødal 2024-12-02 17:25:57 UTC
I can confirm that the problem has been resolved with the current Master version.

Version: 25.2.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 44ccd392be12dad23e216fb3eb2c2e5b275eee75
CPU threads: 4; OS: Windows 10 X86_64 (10.0 build 19045); UI render: Skia/Raster; VCL: win
Locale: da-DK (da_DK); UI: en-US
Calc: threaded

The fix works fine, even though this can only partly be seen with my test file (attachment 197812): with the current dictionary and the fix, "børne-" works well, while "arbejds-" is still underlined. This, however, relates to the dictionary: I had forgotten that the trailing hyphen was not implemented for all 'fugeelements' (because implementing it for some cases did not change anything at that time).

So I have updated the Danish dictionary, so far only at the development site, www.stavekontrolden.dk. The plan is to release an updated version at the official extensions site and make it part of the LO distribution. With the fix and the updated dictionary, the test file works as expected for both cases (and for other cases not in the test file).

Thanks for fixing this problem.