Bug 56346 - LibreOffice/Preferences/Language Settings/Voikko/Finnish writing aids Vocabulary recognition does not correlate with the presence of a recognized spell checker
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: 3.6.3.2 release (earliest affected)
Hardware: x86-64 (AMD64) Mac OS X (All)
Importance: medium normal
Assignee: Andras Timar
URL:
Whiteboard: target:3.7.0
Keywords:
Depends on:
Blocks:
 
Reported: 2012-10-24 08:13 UTC by rueter.jack
Modified: 2012-11-07 14:04 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
screen shot of (kpv) Komi-language text operating as Latin (244.30 KB, image/tiff)
2012-10-24 08:13 UTC, rueter.jack
Screen shot of (kpv) Komi-language text operating as Latin (136.80 KB, image/png)
2012-10-25 13:23 UTC, Roman Eisele
attachment-18096-0.html (4.78 KB, text/html)
2012-10-25 15:44 UTC, rueter.jack
attachment-18096-1.dat (1 bytes, multipart/alternative)
2012-10-25 15:44 UTC, rueter.jack
myv.zhfst (1.63 MB, application/octet-stream)
2012-10-25 15:44 UTC, rueter.jack

Description rueter.jack 2012-10-24 08:13:19 UTC
Created attachment 68984 [details]
screen shot of (kpv) Komi-language text operating as Latin

I am working with several minority languages, producing speller.zhfst binary files for voikko/2/mor-(LANGUAGE-CODE) directories.

On my own MacBook Pro (Mac OS X Snow Leopard), I can copy the required "speller.zhfst" and "voikko-fi_FI.pro" files to ~/.voikko/2/mor-la/.
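
As a rough illustration of the directory layout described above, the following C++17 sketch builds the per-language directory path and checks that both files are present. The function names are hypothetical, and this does not reproduce how libvoikko or LibreOffice actually locate dictionaries; it only mirrors the ~/.voikko/2/mor-(LANGUAGE-CODE) convention and the two file names from this report.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: builds <home>/.voikko/2/mor-<languageCode>,
// following the layout described in this report.
fs::path voikkoDictionaryDir(const fs::path& home, const std::string& languageCode) {
    return home / ".voikko" / "2" / ("mor-" + languageCode);
}

// Hypothetical helper: true only if both files named in the report
// exist as regular files in the given directory.
bool hasSpellerFiles(const fs::path& dir) {
    std::error_code ec;  // use the non-throwing overloads
    return fs::is_regular_file(dir / "speller.zhfst", ec)
        && fs::is_regular_file(dir / "voikko-fi_FI.pro", ec);
}
```

With this, calling hasSpellerFiles(voikkoDictionaryDir(home, "la")) would check the Latin setup, and the same call with "myv" or "kpv" would check the Erzya or Komi directories.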

LibreOffice then recognizes "Latin" as having a spell checker: a green _ABC_ mark appears before the language name.

So I open an Erzya-language document (myv),
select the entire text,
go to Tools/Language/For all Text/more...
and select Latin, which has the two files I described above.

The same can be done with a Komi-language document (kpv).

NB!
If, however, I create the directories
~/.voikko/2/mor-myv/ for Erzya
~/.voikko/2/mor-kpv/ for Komi (Zyrian)
and copy the very same files into them,
I get a different result.

LibreOffice/Preferences/Language Settings/Voikko/Finnish writing aids Vocabulary recognizes the presence of the "voikko-fi_FI.pro" files.

But Tools/Language/For all Text/more... does not show any evidence of a spell checker for either language.

I am hoping to release open-source beta spell checkers for Erzya, Komi-Zyrian and possibly Meadow Mari this year. More Uralic minority languages are in the works for next year. Please help me resolve this problem.

Yours,
Jack Rueter
Comment 1 Roman Eisele 2012-10-25 13:23:31 UTC
Created attachment 69069 [details]
Screen shot of (kpv) Komi-language text operating as Latin


Thank you very much for your bug report!

However, I do not know how to help here, sorry. I have adapted some files and converted the TIFF screenshot to a PNG image (which works better in some browsers) to help others to find and, hopefully, process this bug report.
Comment 2 Roman Eisele 2012-10-25 13:25:30 UTC
@ Andras:

Can you help here? Or can you please point Jack Rueter to someone else who can help with these problems?

Thank you!
Comment 3 Andras Timar 2012-10-25 14:41:31 UTC
I think instead of "abusing" Latin for spellchecking these minority languages, we should add them to LibreOffice, so users can create documents in these languages. It would solve the spell checker recognition as well. I need the list of language names and ISO language codes.
Comment 4 rueter.jack 2012-10-25 15:44:45 UTC
Created attachment 69073 [details]
attachment-18096-0.html

Hi!

Here is the mapping.

struct Bcp47ToOOoMapping {
	const char * bcpTag;
	const char * oooLanguage;
	const char * oooRegion;
};


Khanty = "kca" "kca" "RU"
Komi-Zyrian = "kpv" "kpv" "RU"
Livonian = "liv" "liv" "LV"
Moksha = "mdf" "mdf" "RU"
Meadow Mari = "mhr" "mhr" "RU"
Hill Mari = "mrj" "mrj" "RU"
Erzya = "myv" "myv" "RU"
Nganasan = "nio" "nio" "RU"
Olonets = "olo" "olo" "RU"
Veps = "vep" "vep" "RU"
Võro = "vro" "vro" "EE"
Nenets = "yrk" "yrk" "RU"
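
For illustration only, the list above could be written out as a table of the struct shown earlier. This is a minimal sketch, not the code from any actual patch: the array name uralicMappings and the helper findMapping are hypothetical, and only the tag, language and region values are taken from this comment.

```cpp
#include <cstring>

struct Bcp47ToOOoMapping {
    const char* bcpTag;
    const char* oooLanguage;
    const char* oooRegion;
};

// Values copied from the list in this comment; the array name is hypothetical.
static const Bcp47ToOOoMapping uralicMappings[] = {
    {"kca", "kca", "RU"},  // Khanty
    {"kpv", "kpv", "RU"},  // Komi-Zyrian
    {"liv", "liv", "LV"},  // Livonian
    {"mdf", "mdf", "RU"},  // Moksha
    {"mhr", "mhr", "RU"},  // Meadow Mari
    {"mrj", "mrj", "RU"},  // Hill Mari
    {"myv", "myv", "RU"},  // Erzya
    {"nio", "nio", "RU"},  // Nganasan
    {"olo", "olo", "RU"},  // Olonets
    {"vep", "vep", "RU"},  // Veps
    {"vro", "vro", "EE"},  // Võro
    {"yrk", "yrk", "RU"},  // Nenets
};

// Hypothetical helper: linear search by BCP 47 tag.
// Returns nullptr when the tag is not in the table.
const Bcp47ToOOoMapping* findMapping(const char* bcpTag) {
    for (const Bcp47ToOOoMapping& m : uralicMappings) {
        if (std::strcmp(m.bcpTag, bcpTag) == 0) {
            return &m;
        }
    }
    return nullptr;
}
```

A linear search is enough here because the table holds only twelve entries.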


The speller.zhfst files are constructed with open-source tools on the
Giellatekno infrastructure, and the Voikko data are available from their
server:
https://victorio.uit.no/langtech/trunk/langs/myv/tools/spellcheckers/hfst/voikko-fi_FI.pro

The speller.zhfst file for Erzya will soon be available for download at
divvun.no/static_files/myv.zhfst

The Erzya myv.zhfst accompanies this message. It has beta status.

Comment 5 rueter.jack 2012-10-25 15:44:45 UTC
Created attachment 69074 [details]
attachment-18096-1.dat
Comment 6 rueter.jack 2012-10-25 15:44:45 UTC
Created attachment 69075 [details]
myv.zhfst
Comment 7 Not Assigned 2012-11-06 20:23:12 UTC
Andras Timar committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=112d9e66d4c81168e955178c5c35480cb6303bb2

fdo#56346 add a few more Uralic languages to languages dropdown



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 8 Andras Timar 2012-11-07 14:04:34 UTC
Komi-Zyrian, Meadow Mari and Erzya were already there; I added the rest.