Bug 116072 - Improve spell checking by better word breaking for Hungarian
Summary: Improve spell checking by better word breaking for Hungarian
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Localization
Version: 5.4.0.0.alpha0+ (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: László Németh
URL:
Whiteboard: target:6.1.0 target:6.0.3
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-27 21:37 UTC by László Németh
Modified: 2018-03-15 16:06 UTC
0 users

See Also:
Crash report or crash signature:


Attachments
False alarms in the special Hungarian word forms (8.24 KB, image/png)
2018-02-27 21:47 UTC, László Németh
test file (15.92 KB, application/vnd.oasis.opendocument.text)
2018-03-14 15:05 UTC, László Németh

Description László Németh 2018-02-27 21:37:34 UTC
The following cases result in false alarms during Hungarian spell checking:

példá(k)
csinál(hat)nak
(1)-ben
[1]-ben
Mi van?-nal
Helló!-val
„Élet és Irodalom”-ban

Suggested solution: treat (, ), ], ”, ?, and ! as MidLetter characters during word breaking. The word forms above are then passed to Hunspell unbroken, where the Hunspell features IGNORE and BREAK can ignore the parentheses and recognize the stems and the hyphen+affix parts.
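
To illustrate the idea, here is a minimal Python sketch of why the MidLetter class matters. It is not LibreOffice's break iterator or Hunspell's API: `split_words`, `accepted`, the character sets and the tiny lexicon are all hypothetical, and the rule used is the simple "a MidLetter character stays inside a word only between two letters" convention known from Unicode word segmentation. The real Hungarian rule file covers more cases (for example a trailing parenthesis or the hyphen before an affix) that this toy does not.

```python
# Hypothetical character sets for illustration only; these are NOT the
# actual contents of LibreOffice's Hungarian word-break rules.
DEFAULT_MID = set("'’")
EXTENDED_MID = DEFAULT_MID | set("()]”?!")

# Toy stand-in for a Hunspell IGNORE line: characters dropped before lookup.
IGNORE = "()"


def split_words(text, mid_letters):
    """Split text into words. A character from mid_letters is kept inside
    a word only when it sits directly between two letters."""
    words, current = [], ""
    for i, ch in enumerate(text):
        if ch.isalpha():
            current += ch
        elif (ch in mid_letters and current
              and i + 1 < len(text) and text[i + 1].isalpha()):
            current += ch
        else:
            if current:
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


def accepted(word, lexicon):
    """Toy spell check: strip the IGNORE characters, then look the word up."""
    stripped = "".join(c for c in word if c not in IGNORE)
    return stripped in lexicon


# Made-up lexicon: "nak" is a suffix, not a standalone word.
lexicon = {"csinál", "hat", "csinálhatnak"}

before = split_words("csinál(hat)nak", DEFAULT_MID)
after = split_words("csinál(hat)nak", EXTENDED_MID)

print(before, [accepted(w, lexicon) for w in before])
# ['csinál', 'hat', 'nak'] [True, True, False]   <- false alarm on "nak"
print(after, [accepted(w, lexicon) for w in after])
# ['csinál(hat)nak'] [True]
```

With the default set the checker receives the fragment "nak" on its own and flags it. With the extended set it receives the whole form and can resolve it after dropping the parentheses.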
Comment 1 László Németh 2018-02-27 21:47:54 UTC
Created attachment 140197
False alarms in the special Hungarian word forms
Comment 2 Commit Notification 2018-02-28 07:44:01 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=3cc58a5904604f0f2e9977a9508bab02518122ad

tdf#116072 Extend MidLetter in Hungarian word breaking

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 4 László Németh 2018-02-28 08:29:55 UTC
(Note: test with „Mi van?-nak”, not „Mi van?-nal”. The -nal/-nel and -tyal/-tyel variants of the affix -val/-vel will be added in the next major release of the dictionary.)
Comment 5 Commit Notification 2018-03-01 11:44:10 UTC
László Németh committed a patch related to this issue.
It has been pushed to "libreoffice-6-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c85aa97d4e20091db71899f713b07e3c57b3b7ad&h=libreoffice-6-0

tdf#116072 Extend MidLetter in Hungarian word breaking

It will be available in 6.0.3.

Comment 6 László Németh 2018-03-14 15:05:46 UTC
Created attachment 140632
test file

The fix removes all false alarms on affixes, for example on the word parts "ban" and "ben" in the forms "a)-ban" and "b)-ben" (also when "a" and "b" come from automatic references, as in this test file).
Comment 7 Commit Notification 2018-03-15 16:06:00 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6ef7c2729fb85959dfd76f028166f7631886399c

tdf#116072 Add PrefixLetter ")" in Hungarian word breaking

It will be available in 6.1.0.
