Bug 91766 - Automatic language detection for spell checking
Summary: Automatic language detection for spell checking
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: 4.4.4.1 rc (earliest affected)
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 132294
Depends on:
Blocks: Spell-Checking Language-Detection
Reported: 2015-05-31 04:23 UTC by Aleve Sicofante
Modified: 2021-01-15 17:09 UTC (History)
CC List: 1 user

See Also:
Crash report or crash signature:


Attachments
Word can't determine the language of different paragraphs. (17.89 KB, application/vnd.oasis.opendocument.text)
2015-05-31 16:47 UTC, Aleve Sicofante
We have had this feature since forever. (247.31 KB, image/png)
2015-05-31 23:15 UTC, Adolfo Jayme
Sample multilingual document (25.04 KB, application/vnd.oasis.opendocument.text)
2021-01-15 17:09 UTC, Adalbert Hanßen

Description Aleve Sicofante 2015-05-31 04:23:02 UTC
Automatic language detection for spell checking is an essential tool in office environments that do international business. MS Office has been doing it for some 20 years, and Google Translate has been auto-detecting languages forever as well. The technology is well known and reliable. I'd like to encourage the design team to make this a priority for upcoming versions.

It should work in all components of LibreOffice, but of course Word is the major beneficiary.

Maybe this should be an enhancement request for Hunspell?
Comment 1 Adolfo Jayme 2015-05-31 09:56:35 UTC
Ever looked at the status bar?

http://www.freedesktop.org/wiki/Software/libexttextcat/
Comment 2 Aleve Sicofante 2015-05-31 16:47:11 UTC
Created attachment 116199 [details]
Word can't determine the language of different paragraphs.
Comment 3 Aleve Sicofante 2015-05-31 16:51:27 UTC
Sorry, I meant Writer can't determine the language of different paragraphs.
Comment 4 Aleve Sicofante 2015-05-31 16:52:13 UTC
(In reply to Aleve Sicofante from comment #2)
> Created attachment 116199 [details]
> Word can't determine the language of different paragraphs.

The attachment is an ODT document. Sorry for any confusion.
Comment 5 Adolfo Jayme 2015-05-31 23:08:20 UTC Comment hidden (obsolete)
Comment 6 Adolfo Jayme 2015-05-31 23:15:33 UTC
Created attachment 116210 [details]
We have had this feature since forever.

> Writer can't determine the language of different paragraphs.

That assertion is simply incorrect.
Comment 7 Aleve Sicofante 2015-06-01 09:19:23 UTC
In the attached document, the first paragraph is in Spanish. The spell checking acts properly and nothing gets red-underlined.

The second paragraph is written in English. Writer doesn't seem to know that, and keeps trying to correct the paragraph as if it were written in Spanish, hence the red underlining of the whole paragraph.

How is my assertion incorrect?
Comment 8 Adolfo Jayme 2015-06-06 18:14:49 UTC
Do not be confused.

What you want is to create a new feature in which Writer automatically changes the spell-checking language for each paragraph, which would be costly in long documents.

But to state that Writer “can’t determine the language of different paragraphs” is a lie, as I’ve demonstrated in the screenshot I’ve attached.
Comment 9 Aleve Sicofante 2015-06-07 00:53:58 UTC
"What you want is to create a new feature in which Writer automatically changes the spell-checking language for each paragraph, which would be costly in long documents"

Whether it's costly or not is open to debate (Google Translate needs only a handful of words to detect a language, sometimes as few as two), but the feature has been in MS Office (including Word and Outlook) for almost two decades now, if not longer, and it's VERY useful for international businesses.

I don't know exactly what your problem is, and I don't understand your attitude either. Are you always so angry?

Yes, I propose exactly what you finally understood. I think it was clear from the beginning, but maybe I wasn't clear enough. What has no excuse, though, is your completely unnecessary aggressive tone.
Comment 10 chomisyann 2016-05-06 07:48:43 UTC
Yes please, please add this feature.
I work every day in English and French, and sometimes in Spanish.
That's the main reason I am still using a copy of Word on my PC.
In Word you just need to copy-paste any text and it corrects it, whatever the language of the text.
It works so nicely.

I think this feature is more important than a new database filter or some other geeky feature. It really makes life easier for 90% of users (or at least the 25% or so who work in several languages).


+1
Comment 11 Tyco72 2018-04-09 08:23:18 UTC
I have always wondered why this basic feature is still not implemented in LO, and it is not the only one.
That the developers' work focuses mainly on geeky features instead of on all the little bugs and improvements useful to 90-99% of users is the main limitation of open software. But they should consider that 90% of the $ donations to LO come from that 90-99% of common users.
Comment 12 Xisco Faulí 2019-11-29 13:27:14 UTC
Changing priority back to 'medium' since the number of duplicates is lower than 5
Comment 13 Heiko Tietze 2020-09-14 13:11:47 UTC
*** Bug 132294 has been marked as a duplicate of this bug. ***
Comment 14 Adalbert Hanßen 2021-01-15 17:06:50 UTC
(In reply to Xisco Faulí from comment #12)
> Changing priority back to 'medium' since the number of duplicates is lower
> than 5

I was just about to file a new proposal, but while entering it I came across these duplicates. It is probably better to add my comment here rather than create another duplicate. Some of my ideas are already in the discussion above, but there are new ones that address the "costly" argument. So here we go:

If you want to use the spell checker in a multilingual document, you must assign the correct language to each part of the document. Without this step, larger parts of the document are checked against the spelling rules of another language, flagged as wrong, and therefore underlined with red wavy lines.

Editing multilingual texts would become easier if you could tell the spell checker to check the spelling in all languages for which a language pack is installed. I suggest an additional "Automatic" option for this, which could be set at Tools>Language>... and Format>Characters>Language.

For a text passage to which this choice applies, LO Writer should check the text sentence by sentence (for example, starting after the last period, colon, question mark, exclamation mark, or quotation mark) against every language whose language pack is installed. It should then automatically assign the language that produces the fewest errors, i.e., the minimum number of characters that would be underlined in red in that language.
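
The selection rule described above can be sketched roughly as follows. This is only a minimal illustration in Python, not LibreOffice code: the tiny word sets in DICTIONARIES are made-up stand-ins for installed language packs, the function names are hypothetical, and a real implementation would query the actual spell checker instead of a set lookup.

```python
import re

# Hypothetical toy dictionaries standing in for installed language packs.
DICTIONARIES = {
    "en": {"the", "cat", "sleeps", "on", "sofa", "is", "a"},
    "es": {"el", "gato", "duerme", "en", "sofá", "es", "un"},
}

# A word is a run of letters (Unicode-aware, so accented letters count).
WORD_RE = re.compile(r"[^\W\d_]+")
# Split after a period, colon, question mark, or exclamation mark.
SENTENCE_END_RE = re.compile(r"(?<=[.:?!])\s+")


def misspelled_chars(sentence, words):
    """Count the characters that would be underlined when checking against `words`."""
    return sum(
        len(word)
        for word in WORD_RE.findall(sentence)
        if word.lower() not in words
    )


def assign_languages(text, dictionaries=DICTIONARIES):
    """Return (sentence, language) pairs, picking the language with the fewest errors.

    Ties go to whichever language comes first in `dictionaries`.
    """
    result = []
    for sentence in SENTENCE_END_RE.split(text.strip()):
        if not sentence:
            continue
        best = min(
            dictionaries,
            key=lambda lang: misspelled_chars(sentence, dictionaries[lang]),
        )
        result.append((sentence, best))
    return result


if __name__ == "__main__":
    sample = "El gato duerme en el sofá. The cat sleeps on the sofa."
    for sentence, lang in assign_languages(sample):
        print(lang, "|", sentence)
```

Because each sentence is only re-checked when it changes, the cost grows with the number of installed languages rather than with document length, which is one way to look at the "costly" objection.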

If introducing this option would conflict with the ODT file format definition (if the format provides no way to express automatic language selection), one could instead set the language on the fly during editing, fixing it once a sentence is completed (i.e., when one of the punctuation marks mentioned above is reached).
Side question: is language actually a property of a character, i.e., an attribute like font size, bold/italic/underline, color, etc.?

Suggestion on the side: the spelling correction could offer, as additional choices, another installed language and also "no language check".
Comment 15 Adalbert Hanßen 2021-01-15 17:09:07 UTC
Created attachment 168921 [details]
Sample multilingual document