Bug 57458 - Improve Greek spelling/grammar checking
Summary: Improve Greek spelling/grammar checking
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version (earliest affected): unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Spell-Checking
Reported: 2012-11-23 13:18 UTC by Michael
Modified: 2017-09-02 22:43 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Description Michael 2012-11-23 13:18:01 UTC
All word processors I know (including MS Word) fail to take into account certain important Greek grammar rules, to the frustration of native Greek writers, particularly professional proofreaders, who have to check these rules manually or ignore them. As a result, certain mistakes now commonly crop up in publications. This is an opportunity for LibreOffice to gain a unique advantage. For example:

    the letter "n" ("ν") at the end of articles is dropped when the following word starts with certain letters (which a writer has to memorise!).
    "ό,τι" is actually a word, and it is different from "ότι".
    in certain circumstances, longer words get two stresses (e.g. "η παράγραφός μου" is correct).
    capitalised words never get a stress (e.g. "ΌΠΩΣ" is wrong).

If anyone is interested, I can provide more details on the actual rules.

Thanks.
Comment 1 Roman Eisele 2012-11-29 11:03:04 UTC
My knowledge of Modern Greek is very limited (I am more familiar with Classical Greek ;-), but even such limited knowledge is sufficient to confirm that this is a valid and very reasonable enhancement request.

Some of the points mentioned by Michael are really simple and basic, and it is difficult to understand why word processors do not even know about ό,τι or the rule that capitalized words never get any accents in Greek (there is no difference between Classical and Modern in such cases) ...

So I hope that at least some of these improvements will not be too difficult to implement.
Comment 2 Roman Eisele 2012-11-29 11:09:43 UTC
@ Andras Timar, László Németh:

Hi Andras and László,
this is a valuable enhancement request. A general question: what is the right way to proceed with such general requests about spelling/grammar checkers? Is there anything we can do to fulfill the request at least partially, or should/can we delegate it, etc.? Are parts of this request potential EasyHacks? Or should this go to Lightproof/Hunspell, etc.?
Comment 3 László Németh 2012-11-30 09:33:31 UTC
Michael, Roman, I will add these rules to LibreOffice with your help, using Lightproof. I need a precise description of the rules and their exceptions (irregular forms etc.), and we could test the result on Greek Wikipedia or other texts.

Checking this capitalization problem is a simple task with Lightproof (maybe except the capitalization by character formatting), so I will attach an example soon. Thanks for your help in advance. László
Comment 4 Roman Eisele 2012-12-01 09:59:29 UTC
@ László:
Wow - thank you very much for your answer and for your willingness to implement these rules!


@ Michael:
This is a great opportunity to get the first word processor which fulfills (at least partially) your needs ;-) So, can you please take on the task of providing a list of the most important rules for László?

I am sorry but I can not help much here, because (as said above) I am familiar with Classical Greek, but have only a limited (passive/receptive) knowledge of modern Greek; so the danger is that I would provide slightly wrong or just unnecessary rules. You know, some rules are the same, others are not ...

Maybe you can get some help from the Greek LibreOffice user group; e.g., write to the Greek mailing lists (see http://wiki.documentfoundation.org/Local_Mailing_Lists#Greek) and ask for collaborators. Maybe even the ΕΛ/ΛΑΚ (http://ellak.gr/) could be interested in helping here? Or the municipality of Πυλαίας-Χορτιάτη (http://www.pilea-hortiatis.gr/), which moved most of its PCs to LibreOffice this year and should be interested in getting good spelling/grammar checking, too ...

But these are just some ideas; you will probably know much better than me how to get additional help :-)

So, thank you both in advance for getting an advanced Greek spelling/grammar Checker ;-)
Comment 5 Michael 2012-12-01 15:00:19 UTC
Thanks for all the help. I am happy to help out, but I have a very tight deadline, so I will get back to this in a few days. For now, I can say that, as you know, Greek grammar can be a little complex, and although a native Greek speaker like myself may be aware of the rules or simply apply them instinctively, it is not always easy to explain them in a mechanistic way. Also, it is often not clear whether an issue should be dealt with as a grammar or a spelling issue. The double-stress issue I mentioned earlier is difficult to explain as a grammar rule, but probably easier to detect as a spelling mistake (more on this later).

I really think a Greek language professional (teacher, proofreader etc) should be involved. I will try to think about this one when I come up for air. 

I am a native Greek speaker and I take language skills seriously because, as an academic, I consider language my tool, even if my English lets me down at times. However, I am based in the UK and I am not involved in teaching the Greek language or anything to do with linguistics, so my usefulness might be limited. I do have a Greek grammar book though :)

Also, I should emphasise again that these improvements would be a unique advantage. It is really shocking that there has been this gap in the market for so long, considering e.g. the resources Microsoft throws at its Office suite. The peculiar result is that the rules have changed informally, i.e. they are often ignored even by professionals! But those who do care (often including opinion leaders) appreciate good grammar when they see it.

I am sure I can find more rules when I have the time, but for now let me explain the rules I mentioned. 

1. "ό,τι", "Ό,τι", "Ο,ΤΙ" are correct versions of the same word. This word actually exists and is very common. It translates as 'what' as in 'What(ever) we said...'. It is often confused (even by many Greeks and all word processors) with "ότι", "Ότι", "ΟΤΙ", which means "that" as in "they said that...".

2. As I said earlier, (fully) capitalised words never get a stress (e.g. "ΌΠΩΣ" is wrong). But if only the first letter is capitalised, they do get a stress. E.g. "Όπως" and "Αφού" get a stress. Note that in "Όπως", the initial capitalised letter gets the stress (there is a stress on "Ο" in case you missed it).

3. Certain words ending in "ν" (specifically: τον (as an article), την, έναν, αυτήν, δεν, μην) lose that "ν" when the following word begins with γ, β, δ, χ, φ, θ, μ, ν, λ, ρ, σ, ζ. They don't lose it when the following word begins with anything else (a vowel, κ, π, τ, μπ, ντ, γκ, τσ, τζ, ξ, ψ).

The tricky point: "τον" (as an article) loses it as above, but "τον" as a personal pronoun doesn't. That's the only exception mentioned in my grammar book.

All these letters must be memorised by the writer! Most people remember only the obvious ones. Books and newspapers are now full of mistakes because word processors don't know the rule.

4. Nouns with a stress on the third syllable from the end get a second stress on the last syllable when (and only when) followed by certain words. In Greek, the equivalent of a possessive pronoun follows the noun. Because the noun and this word are linked together when pronounced, the stress shifts from the trailing word to the last syllable of the noun because it sounds better. E.g. "η παράγραφος" and "η παράγραφός μου" -- both are correct. The trick here is that "μου" in this example may not be a possessive pronoun, so "η παράγραφος μου" (pronounced as two separate words this time) means something different and can also be correct. I can't think of a way to distinguish the two cases that doesn't involve understanding the actual meaning. The same applies to the exception in rule 3 above. Unless an engine is developed that understands meaning or more complex patterns, I suppose one way is to speculate. So, a spellchecker could accept both "η παράγραφός μου" and "η παράγραφος μου" as potentially correct (or flag them as potentially wrong, as in "user beware").
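
To make the four rules above more concrete, here is a rough Python sketch of how each could be checked mechanically. It is only an illustration under stated assumptions, not Lightproof rule syntax and not an actual LibreOffice implementation. All function and variable names are made up for this example. The list of trailing words in rule 4 (μου, σου, του, ...) is an assumption, since only "μου" is mentioned above, and syllables are approximated by runs of vowels, which will miscount some words. The ambiguous cases ("τον" as a pronoun, "μου" not being possessive) are reported as uncertain rather than as errors.

```python
import re
import unicodedata

TONOS = "\u0301"      # combining acute accent: NFD splits "ό" into "ο" + TONOS
DIAERESIS = "\u0308"  # combining diaeresis, allowed inside vowel runs

# Rule 1: treat the three accepted spellings of "ό,τι" as single tokens.
TOKEN = re.compile(unicodedata.normalize("NFC", r"ό,τι|Ό,τι|Ο,ΤΙ|\w+"))

def tokens(text):
    return TOKEN.findall(unicodedata.normalize("NFC", text))

def has_stress(word):
    return TONOS in unicodedata.normalize("NFD", word)

# Rule 2: a fully capitalised word (two or more characters) carries no stress.
def stressed_all_caps(text):
    return [t for t in tokens(text)
            if len(t) > 1 and t.isupper() and has_stress(t)]

# Rule 3: word and letter lists copied from the rule as described above.
N_WORDS = {"τον", "την", "έναν", "αυτήν", "δεν", "μην"}
DROP_BEFORE = set("γβδχφθμνλρσζ")
KEEP_DIGRAPHS = ("μπ", "ντ", "γκ")  # checked first: they start with μ, ν, γ

def check_final_n(word, next_word):
    """Return "ok", "drop", or "maybe-drop" ("τον" could be a pronoun)."""
    w = unicodedata.normalize("NFC", word).lower()
    nxt = unicodedata.normalize("NFC", next_word).lower()
    keeps = nxt.startswith(KEEP_DIGRAPHS) or nxt[:1] not in DROP_BEFORE
    if w not in N_WORDS or keeps:
        return "ok"
    return "maybe-drop" if w == "τον" else "drop"

# Rule 4: ENCLITICS is an assumed list; only "μου" appears in the description.
ENCLITICS = {"μου", "σου", "του", "της", "μας", "σας", "τους"}
VOWEL_RUN = re.compile("[αεηιουω" + TONOS + DIAERESIS + "]+")

def stress_pattern(word):
    """One flag per vowel run (a rough stand-in for syllables): stressed or not."""
    nfd = unicodedata.normalize("NFD", word.lower())
    return [TONOS in run for run in VOWEL_RUN.findall(nfd)]

def classify_double_stress(noun, next_word):
    """Both outcomes can be correct, so this labels the case instead of judging it."""
    flags = stress_pattern(noun)
    nxt = unicodedata.normalize("NFC", next_word).lower()
    if nxt not in ENCLITICS or len(flags) < 3 or not flags[-3]:
        return "not-applicable"
    return "double-stress" if flags[-1] else "single-stress"
```

A checker built along these lines would only report "drop" as a likely mistake, and would treat "maybe-drop", "single-stress" and "double-stress" as hints for the writer, since telling the cases apart requires understanding the meaning.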

That's it for now. I hope all this makes sense. Let me know if you need any help. I think Roman's contribution would be important because he can probably see patterns I can't. Being a native Greek speaker blinds me; I can't easily distance myself and see the language as a non-Greek does.

Michael
Comment 6 Urmas 2014-02-20 15:24:14 UTC
A small note: removing accents in ΌΠΩΣ is the font's duty. Capitalized words should be stored in their normative spelling, with accents.