35590 – Adding a new dictionary for the Maltese Language

Status:	RESOLVED INVALID

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Linguistic
Version: (earliest affected)	unspecified
Hardware:	All All

Importance:	medium enhancement
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2011-03-23 05:00 UTC by Clifton
Modified:	2012-08-31 10:05 UTC
CC List:	2 users

See Also:
Crash report or crash signature:

Attachments
Dictionary Files (17.28 KB, application/zip) 2011-03-23 05:00 UTC, Clifton

Description Clifton 2011-03-23 05:00:28 UTC

Created attachment 44750
Dictionary Files

I am suggesting adding a Maltese dictionary (the language of the island of Malta, Europe). I am attaching the .txt file and the .dic file with what I have started; there are 2348 words already. It would be great, since LibreOffice would then be the only office suite with such a spell checker. The files are UTF-8 encoded. Is there an easy way to update them? Some words end with an apostrophe ('); will that affect the detection?

Comment 1 Caolán McNamara 2011-03-24 06:20:25 UTC

FWIW in Fedora we've been shipping "hunspell-mt" since 2008 to provide Maltese spell-checking by running "wordlist2hunspell" over the Maltese word list of http://linux.org.mt/projects/spellcheck/

Have you spoken to Ramon about his wordlist and formally converting it to hunspell format? The above list apparently has 500,000+ words in it.

It's not strictly speaking necessary to have the dicts in LibreOffice itself, as we support dictionary extensions; there should be a lot of examples around to follow.

As a small aside, the first line of your mt_MT.dic is supposed to be the count of lines in the .dic, so "2348 mt_MT.txt" should instead be "2348" (or rather, you should remove the blank line under it and recalculate the number of lines).

Re the ' in Maltese, LibreOffice will likely split words around the ' and send each bit to the spell-checker separately. I think I spoke to Ramon about this once.
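A minimal sketch of the header fix described above, assuming the file's existing first line is a header such as "2348 mt_MT.txt" rather than a word. The function name normalize_dic is hypothetical, and the sample entries are only the two words quoted elsewhere in this bug. It drops the old header and any blank lines, then writes the recalculated entry count as the new first line.

```python
def normalize_dic(text):
    """Return .dic content whose first line is the number of entries."""
    lines = text.splitlines()
    # Assumption: the existing first line is a header (e.g. "2348 mt_MT.txt"),
    # not a dictionary word, so it is always discarded.
    entries = [line.strip() for line in lines[1:] if line.strip()]
    return "\n".join([str(len(entries))] + entries) + "\n"


# Illustrative input: a malformed header followed by a blank line.
sample = "2348 mt_MT.txt\n\njista'\nbil-\n"
print(normalize_dic(sample), end="")
```

To apply this to a real file, read and write it as UTF-8, since that is the encoding the reporter mentions.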

Comment 2 Caolán McNamara 2011-03-24 06:24:54 UTC

Ah yes, here's what I have in my mail from 2009, which doesn't mean I'm right, but just a data point about handling the ' in words

...
        > The Maltese language has words that end in a dash or an apostrophe,
        > like [ jista' ] or [ bil- ]. I added those characters to hunspell's
        > affix file as WORDCHARS and the command-line version of hunspell works
        > fine. However OOo apparently does not use that setting, so if I have
        > text containing the words listed above, OOo will spell-check them as
        > jista and bil - without the apostrophe and dash.

...
        My understanding is that we use the icu word boundary iterator to split
        up a sentence into words that we then give to the spell-checker.
        
        The default rules are described at
        http://userguide.icu-project.org/boundaryanalysis#TOC-Word-Boundary
        
        so..., I think your problem is that these rules would allow e.g.
        FOO'BAR and FOO-BAR but will split FOO' as FOO + '
        
        I'm not altogether sure if the correct solution is to talk to the icu
        people (http://site.icu-project.org/) about getting Maltese rules for
        word boundary improved/fixed in icu. Or if the correct solution is to
        make custom icu-rules and stick them into OOo to over-ride those
        defaults for your language.
...
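To illustrate the splitting behaviour described in the mail, here is a rough sketch using regular expressions. It is not ICU and does not reproduce ICU's actual rule set; it only mimics the behaviour as described above (an ' or - is kept when it has letters on both sides, and dropped at the end of a word). The names split_words and split_words_maltese are hypothetical, and the second pattern is just one assumed shape a language-specific override could take.

```python
import re

# Letters, optionally joined by a single ' or - with letters on both sides.
# This approximates the default behaviour described in the mail.
DEFAULT_WORD = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")

# Hypothetical override: additionally keep one trailing ' or - on a word.
MALTESE_WORD = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*['-]?")


def split_words(text):
    """Split text into words, dropping a trailing ' or -."""
    return DEFAULT_WORD.findall(text)


def split_words_maltese(text):
    """Split text into words, keeping a trailing ' or -."""
    return MALTESE_WORD.findall(text)


text = "jista' bil- FOO'BAR FOO-BAR"
print(split_words(text))          # trailing ' and - are lost
print(split_words_maltese(text))  # trailing ' and - are kept
```

With the first pattern the spell-checker would receive "jista" and "bil", which matches the symptom reported in the quoted mail; with the second it would receive "jista'" and "bil-".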

Comment 3 Caolán McNamara 2011-03-24 06:28:09 UTC

It seems Ramon's latest work was http://linux.org.mt/downloads/spellcheck/speller-11.zip, which has now been converted to hunspell format, but unfortunately it carries no licence notice.

Comment 4 Björn Michaelsen 2011-12-23 11:48:37 UTC

[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:
http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1

more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html

Comment 5 Florian Reisinger 2012-08-14 14:01:20 UTC

Dear bug submitter!

Because there are many NEEDINFO bugs that have had no answer within the last six months, we are closing all of them.

To keep this message short, more information is available at https://wiki.documentfoundation.org/QA/NeedinfoClosure#Statement

Thanks for understanding, and we hope you will update your bug so that everything is prepared for developers to fix your problem.

Yours!

Florian
