Bug 106077 - Treat hyphenation character U+002D same as U+2010
Summary: Treat hyphenation character U+002D same as U+2010
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: 5.2.5.1 release (earliest affected)
Hardware: All Windows (All)
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Hyphenation
Reported: 2017-02-18 15:34 UTC by Alfred Spalt
Modified: 2018-10-24 20:42 UTC (History)

See Also:
Crash report or crash signature:
Regression By:


Attachments
different hyphenation-characters in spell-checking (28.64 KB, image/jpeg)
2017-02-18 15:34 UTC, Alfred Spalt
different hyphenation-characters in spell-checking - improved (43.34 KB, image/jpeg)
2017-02-19 12:53 UTC, Alfred Spalt

Description Alfred Spalt 2017-02-18 15:34:43 UTC
Created attachment 131319
different hyphenation-characters in spell-checking

Prerequisites:
* One word out of the standard dictionary, e.g. "downhill"
* Another word that was added to the dictionary, e.g. "wahnsinn"

Current Behavior:
When I combine words of the two categories described above (e.g. downhill-wahnsinn) using the linguistically correct hyphenation character U+2010, spelling is shown as correct.

However, when I use the ASCII hyphenation character U+002D, an error is shown.

Expected Behavior:
It would be great if LibreOffice spell checking treated the hyphenation character U+002D the same as U+2010.
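
The requested behavior can be illustrated with a minimal Python sketch. This is not LibreOffice's spell-checking code; the function names and the two in-memory word sets (standing in for the standard and the user dictionary) are assumptions made only for illustration.

```python
# Hypothetical sketch of the requested behavior; not LibreOffice code.
ASCII_HYPHEN = "\u002D"    # HYPHEN-MINUS
UNICODE_HYPHEN = "\u2010"  # HYPHEN


def split_compound(word):
    """Split a word at U+002D or U+2010, treating both the same way."""
    normalized = word.replace(UNICODE_HYPHEN, ASCII_HYPHEN)
    return normalized.split(ASCII_HYPHEN)


def is_spelled_correctly(word, standard_words, user_words):
    """Accept a compound if every part is in either dictionary."""
    known = standard_words | user_words
    parts = split_compound(word)
    # An empty part (leading, trailing or doubled hyphen) is rejected.
    return all(part and part.lower() in known for part in parts)


if __name__ == "__main__":
    standard = {"downhill"}  # stand-in for the standard dictionary
    user = {"wahnsinn"}      # stand-in for the user dictionary
    for text in ("downhill\u002Dwahnsinn", "downhill\u2010wahnsinn"):
        print(repr(text), is_spelled_correctly(text, standard, user))
```

With this approach, the result no longer depends on which of the two hyphen characters was typed or on which dictionary a part comes from.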
Comment 1 Alfred Spalt 2017-02-19 12:53:28 UTC
Created attachment 131339
different hyphenation-characters in spell-checking - improved
Comment 2 Xisco Faulí 2018-06-12 10:49:32 UTC
Thank you for reporting the bug.
Could you please try to reproduce it with the latest version of LibreOffice
from https://www.libreoffice.org/download/libreoffice-fresh/ ?
I have set the bug's status to 'NEEDINFO'. Please change it back to
'UNCONFIRMED' if the bug is still present in the latest version.
Comment 3 Alfred Spalt 2018-06-20 14:54:22 UTC
The "bug" is still present in version 6.0.4.2.
However, it is not a bug at all, just a change request. As I stated before, it is a feature that would be nice to have.
Comment 4 Alfred Spalt 2018-06-20 14:56:13 UTC
I just noticed that character U+2010 is not present in some fonts. I reproduced the "bug" with U+2012, which is present, e.g., in the Liberation font family.
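
For reference, the three code points mentioned so far are distinct Unicode characters. A small Python sketch using the standard `unicodedata` module prints their official names; the helper `describe` is a name introduced here for illustration only.

```python
import unicodedata


def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in ("\u002D", "\u2010", "\u2012"):
        print(describe(ch))
    # U+002D HYPHEN-MINUS
    # U+2010 HYPHEN
    # U+2012 FIGURE DASH
```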
Comment 5 Buovjaga 2018-06-24 19:13:24 UTC

*** This bug has been marked as a duplicate of bug 85731 ***
Comment 6 Khaled Hosny 2018-06-26 17:29:50 UTC
This is a different issue from bug 85731, which is about the character inserted at a line break during hyphenation (i.e. output), not about which characters are recognized as hyphens during input.
Comment 7 Alfred Spalt 2018-06-26 20:03:25 UTC
From a user's perspective, the situation is even more tricky:

Hyphen characters U+002D and U+2010 ARE both treated as word separators if both words are in the standard dictionary.
U+002D is no longer treated as a word separator as soon as one of the words is taken from a user's dictionary. See attachment 131339.

So to me it looks like this is not an issue of the hyphenation library but rather one of how LO treats words from different dictionaries.
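
To make the combinations described in this comment easier to reproduce, the following Python sketch generates one test string per case, ready to paste into a document. It only builds strings and does not call LibreOffice. The second standard word "race" is an assumption added here; "wahnsinn" is assumed to have been added to the user dictionary beforehand.

```python
# Hypothetical helper for building reproduction strings; not LibreOffice code.
HYPHENS = {"U+002D": "\u002D", "U+2010": "\u2010"}

# Word pairs by dictionary origin. "race" is an assumed standard word.
PAIRS = {
    "standard + standard": ("downhill", "race"),
    "standard + user": ("downhill", "wahnsinn"),
}


def build_test_cases():
    """Return (pair label, hyphen label, test string) for every combination."""
    cases = []
    for pair_label, (left, right) in PAIRS.items():
        for hyphen_label, hyphen in HYPHENS.items():
            cases.append((pair_label, hyphen_label, left + hyphen + right))
    return cases


if __name__ == "__main__":
    for pair_label, hyphen_label, text in build_test_cases():
        print(f"{pair_label}\t{hyphen_label}\t{text}")
```

According to the report, only the "standard + user" string with U+002D is flagged as misspelled; the other three are accepted.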