Bug 57528 - Writer detects Hebrew as Hindi.
Status: RESOLVED DUPLICATE of bug 39935
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.5.4 release (earliest affected)
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: RTL-Hebrew
Reported: 2012-11-25 19:52 UTC by Faker Title
Modified: 2017-10-19 23:10 UTC (History)
5 users

See Also:
Crash report or crash signature:


Attachments

Description Faker Title 2012-11-25 19:52:17 UTC
Open Writer and type some Hebrew. Writer automatically detects the language as Hindi (this can be seen both at the bottom of the screen and in Tools->Language->For any) and subsequently does not do any spellchecking.
Comment 1 Faker Title 2012-11-25 22:26:35 UTC
Installed LibreOffice Beta 3.6.4.1 to see if this persists.
It does.
Comment 2 Faker Title 2012-12-08 13:43:28 UTC
The non-beta release 3.6.4.3 has the same symptom.
Comment 3 Urmas 2012-12-08 14:49:14 UTC
Linux has no concept of 'languages'.
Comment 4 Lionel Elie Mamane 2013-01-08 07:12:06 UTC
LibreOffice has a concept of language, and indeed when typing Hebrew in a blank document that has "English" or "French" as language, LibreOffice switches to Hindi. Reopening/confirming (with 3.5.4.2 Debian package).

This can be changed using "Tools->Language->For all text", but it is still a bug that the autosniffing gets it so spectacularly wrong on a simple case (it is not like Hindi and Hebrew share the same alphabet...)
Comment 5 Urmas 2013-01-08 08:14:27 UTC
If you are going to type in Hebrew, change the ME language to Hebrew. That's what options are for. As I said earlier, Linux has no mechanism to tell the language of typed text to applications.
Comment 6 Lionel Elie Mamane 2013-01-08 08:50:21 UTC
(In reply to comment #5)
> If you are going to type in Hebrew, change the ME language to Hebrew. That's
> what options are for. As I said earlier, Linux has no mechanism to tell the
> language of typed text to applications.

LibreOffice makes a guess, and that guess is wrong. I start with a document in ENGLISH, then type Hebrew into it. If LibreOffice treated that as "English", OK, I would understand, it just keeps the language of the document and I need to change that, exactly like when I type French into a document that is declared as English.

However, in this situation, LibreOffice does not stay on "English". It AUTOMATICALLY AND SILENTLY switches to Hindi. So *clearly* it is making some guess / automatic sniffing / ..., probably based on the Unicode range of the characters typed. The Hebrew range should be sniffed as Hebrew, the Arabic range as Arabic, etc.

I don't understand why you repeat "Linux has no mechanism to tell the language of typed text to applications". Why would you expect it to, and why do you excuse LibreOffice's wrong autosniffing on that? And how the hell would GTK-QT-XLIB/X11/GNU/Linux magically know the language anyway?

I understand LibreOffice cannot easily sniff the difference between French and English (they mostly share the same alphabet, so one would have to use possibly brittle methods such as letter statistics or heuristics such as "the language that gives the least spelling errors on a spell check"), but Hindi and Hebrew... Different Unicode range.

Either DO NOT AUTO-SNIFF THE LANGUAGE AT ALL, or at least get the *trivial*/easy cases right.
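
For illustration, the range-based sniffing argued for above could look roughly like the Python sketch below. This is not LibreOffice code: the function name `sniff_script` and the table `SCRIPT_RANGES` are hypothetical, and only three Unicode blocks are listed to keep the example short.

```python
# Hypothetical illustration only; not taken from LibreOffice.
# Block boundaries follow the Unicode Standard's Hebrew, Arabic
# and Devanagari blocks.
SCRIPT_RANGES = {
    "Hebrew": (0x0590, 0x05FF),
    "Arabic": (0x0600, 0x06FF),
    "Devanagari": (0x0900, 0x097F),
}


def sniff_script(text):
    """Return the listed script with the most characters in text.

    Returns None when no character falls into any listed block,
    for example for plain Latin text.
    """
    counts = {}
    for ch in text:
        code_point = ord(ch)
        for script, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[script] = counts.get(script, 0) + 1
                break
    if not counts:
        return None
    return max(counts, key=counts.get)
```

With this table, Hebrew and Devanagari (the script used for Hindi) can never be confused, because their blocks do not overlap.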
Comment 7 Urmas 2013-01-08 21:53:05 UTC
When you enter CTL characters, they will be marked with the selected CTL language. You have Hindi selected as a default CTL language for your documents. Select Hebrew instead and stop reopening this bug.
Comment 8 Lionel Elie Mamane 2013-01-20 09:09:19 UTC
(In reply to comment #7)
> When you enter CTL characters, they will be marked with the selected CTL
> language.

The point is that the user didn't select any CTL language, neither locally
in the document, nor globally as a "default for all documents". LibreOffice
selected one all by itself.

Now, I understand that Hindi is some kind of hardcoded global default,
but that is a poor service for users of *other* CTL languages.

Why not have the hardcoded default as "none" or "unknown" or "automatic",
and then (when the default language is that) autosniff from the Unicode range,
in the cases where it is so simple as having one language <-> one range
(because only that language - or maybe primarily that language - uses that
alphabet)? I understand this probably cannot be done for all languages
(don't e.g. Korean and Chinese share a significant portion of their
Unicode ranges?); if as an extra gravy we can "guess" from a more refined
test (statistical letter/character distribution? whichever language gives
the least spelling errors?), then that's even better, but let's first
solve the simple cases, OK?
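
A rough Python sketch of this proposal, under stated assumptions: the names `resolve_ctl_language`, `script_of` and `UNAMBIGUOUS` are hypothetical, the "automatic" setting is the one suggested above and not an existing LibreOffice option, and treating the Hebrew and Thai scripts as belonging to a single language is a simplification. The script is read from the first word of the character's Unicode name, which is a convenience for the example and not a robust method.

```python
import unicodedata

# Assumption: scripts treated as used (primarily) by one language.
# Devanagari and CJK are deliberately absent because several
# languages share them.
UNAMBIGUOUS = {
    "HEBREW": "he",
    "THAI": "th",
}


def script_of(ch):
    """Return the first word of the character's Unicode name, or None."""
    name = unicodedata.name(ch, "")
    return name.split(" ", 1)[0] if name else None


def resolve_ctl_language(text, configured="automatic", fallback=None):
    """Pick a language for typed text.

    An explicit user setting always wins. With "automatic", guess
    only when exactly one unambiguous script occurs in the text;
    otherwise return the fallback and do not guess.
    """
    if configured != "automatic":
        return configured
    found = set()
    for ch in text:
        script = script_of(ch)
        if script in UNAMBIGUOUS:
            found.add(UNAMBIGUOUS[script])
    if len(found) == 1:
        return found.pop()
    return fallback
```

The design choice here is that the sniffer stays silent in ambiguous cases, so a wrong hardcoded default is never applied on the user's behalf.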
Comment 9 Urmas 2013-01-20 10:41:57 UTC

*** This bug has been marked as a duplicate of bug 39935 ***