Bug 143507 - Grammar Checking (Portuguese) settings page should be "localized" to English
Summary: Grammar Checking (Portuguese) settings page should be "localized" to English
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Localization
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Julien Nabet
URL:
Whiteboard: target:7.3.0
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-23 07:00 UTC by Mike Kaganski
Modified: 2021-07-27 13:46 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Screenshot of the settings page, using en-US UI. (33.58 KB, image/png)
2021-07-23 07:00 UTC, Mike Kaganski
pt-BR to en-US Lightproof UI strings (2.14 KB, text/plain)
2021-07-26 17:16 UTC, Olivier Hallot

Description Mike Kaganski 2021-07-23 07:00:31 UTC
Created attachment 173791
Screenshot of the settings page, using en-US UI.

After installing the Portuguese spell checking components (e.g. using the MSI for Windows) and using the en-US UI, a "Grammar Checking (Portuguese)" settings page appears under Options->Language Settings. Its settings are shown in Portuguese, not in English. The settings should be in English when the user interface language is en-US.

Tested with Version: 7.2.0.1 (x64) / LibreOffice Community
Build ID: 32efc3b7f3a71cfa6a7fa3f6c208333df48656cc
CPU threads: 12; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: threaded
Comment 1 Julien Nabet 2021-07-24 07:49:59 UTC
In dictionaries/pt_BR/dialog, there are these files:
- OptionsDialog.xcs
- OptionsDialog.xcu
- pt_BR_pt_BR.default
- pt_BR_pt_BR.properties
- pt_BR.xdl
+ directory "registry"

I suppose we need to add pt_BR_en_US.properties with translations like in dictionaries/ru_RU/dialog

Olivier: thought you might be interested in this one.
Comment 2 Julien Nabet 2021-07-24 08:09:49 UTC
In addition we need this of course:
diff --git a/Dictionary_pt-BR.mk b/Dictionary_pt-BR.mk
index 1264578..d9d0ad2 100644
--- a/Dictionary_pt-BR.mk
+++ b/Dictionary_pt-BR.mk
@@ -27,6 +27,7 @@ $(eval $(call gb_Dictionary_add_files,dict-pt-BR,dialog,\
     dictionaries/pt_BR/dialog/pt_BR.xdl \
     dictionaries/pt_BR/dialog/pt_BR_pt_BR.default \
     dictionaries/pt_BR/dialog/pt_BR_pt_BR.properties \
+    dictionaries/pt_BR/dialog/pt_BR_en_US.properties \
 ))
 
 $(eval $(call gb_Dictionary_add_files,dict-pt-BR,pythonpath,\

Now I'm trying to translate the pt_BR file, at least the non help items but I noticed this:
numsep=Gerundismos
(related with hlp_numsep=Emprego inapropriado do ger\u00FAndio: estarei trabalhando, vou estar fazendo.).

For English checking, numsep means "Thousand separation of large numbers", but here it is used for gerund checking.
Does it work? I mean, are "numsep" and the other items just keywords that could be used for any grammar rule check?
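
To make the role of these keys concrete: each entry in the .properties file maps a keyword (and its hlp_ counterpart) to a UI string. Below is a minimal Python sketch, assuming the simple key=value layout of the two lines quoted above, that checks whether an en_US file covers every key of the pt_BR one. The helper names and the English sample string are hypothetical; only the two pt_BR lines come from the file.

```python
import re

def parse_properties(text):
    """Parse simple key=value lines, decoding \\uXXXX escapes in the values."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Turn escapes such as \u00FA into the real character.
        value = re.sub(r"\\u([0-9A-Fa-f]{4})",
                       lambda m: chr(int(m.group(1), 16)), value)
        entries[key.strip()] = value
    return entries

def missing_keys(source, translation):
    """Return the keys present in `source` but absent from `translation`."""
    return sorted(set(source) - set(translation))

# The two pt_BR lines quoted above.
PT_BR = (
    "numsep=Gerundismos\n"
    "hlp_numsep=Emprego inapropriado do ger\\u00FAndio: "
    "estarei trabalhando, vou estar fazendo.\n"
)
# Hypothetical, incomplete English file (the wording is an assumption).
EN_US = "numsep=Gerundisms\n"

print(missing_keys(parse_properties(PT_BR), parse_properties(EN_US)))
```

With this sample data the check reports hlp_numsep as untranslated. The key itself stays "numsep" in both files, which is why the label can say "Gerundismos" regardless of what the keyword originally meant.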
Comment 3 Julien Nabet 2021-07-24 08:19:23 UTC
I'm no expert, but I typed this in Writer (I took the example from https://www.todomundopod.com/en/portuguese-gerund/):
Eu estou fazendo algo
Estou a fazer algo

Then I selected the whole text and asked LO to treat it as Portuguese (Portugal), then as Portuguese (Brazil): no gerund error was spotted by LO.

If I put "Estou a fazer algo algo" (so "algo" twice), there is no grammar error with Portuguese (Portugal), but a grammar error is spotted with Portuguese (Brazil), as expected.
Comment 4 Julien Nabet 2021-07-24 09:42:23 UTC
Ok I understand better.

The keywords are used in:
pt_BR/pythonpath/lightproof_pt_BR.py

I found an example to test gerund:
1799  [u'(?u)(?<![-\\w\u2013.,\xad])Vou estar (?P<Grd_1>[a-z\xE7]+[ai])[n][d][o](?![-\\w\u2013\xad])', u'= m.group("Grd_1").capitalize() + \'rei \'', u'Gerundismo. Voc\xea quis dizer:', u'not m.group(1) in excGquando and option(LOCALE,"numsep")'],

So I put:
Vou estar fazendo algo
Vou estar fazer algo


Only the first one is flagged as a grammar problem.
Now, I suppose "Estou" is a conjugated form of "estar", so the problem should have been detected there too.
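
The quoted rule can be tried outside LO. Here is a minimal Python sketch that uses only the pattern and the suggestion expression from the line above; the helper name suggest is hypothetical, and the rule's condition (the excGquando exclusion list and the option(LOCALE,"numsep") switch) is deliberately left out, so this is not how Lightproof itself evaluates the rule.

```python
import re

# Pattern copied from the rule quoted above.
GERUND_RULE = re.compile(
    '(?u)(?<![-\\w\u2013.,\xad])Vou estar '
    '(?P<Grd_1>[a-z\xE7]+[ai])[n][d][o](?![-\\w\u2013\xad])'
)

def suggest(text):
    """Return the rule's replacement suggestion, or None if the rule does not match."""
    m = GERUND_RULE.search(text)
    if m is None:
        return None
    # Suggestion expression copied from the rule: future tense of the verb.
    return m.group("Grd_1").capitalize() + 'rei '

for sample in ("Vou estar trabalhando hoje",
               "Vou estar fazer algo",
               "Eu estou fazendo algo",
               "Vou estar fazendo algo"):
    print(repr(sample), "->", repr(suggest(sample)))
```

Only "Vou estar trabalhando hoje" matches this particular rule. The pattern is anchored on the literal words "Vou estar", so "Eu estou ..." cannot match it, and it only accepts gerunds in -ando/-indo, so "fazendo" is not covered by it either. The error seen for "Vou estar fazendo algo" presumably comes from a sibling rule for -endo verbs, which is an assumption since that rule is not quoted here.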

I tried to disable "Gerundismos", but the change is not applied, and this appears on the console:
warn:cui.options:19023:19023:cui/source/options/treeopt.cxx:2057: ExtensionsTabPage::DispatchAction(): exception of XDialogEventHandler::callHandlerMethod() com.sun.star.beans.UnknownPropertyException message: mmalmau /home/julien/lo/libreoffice/configmgr/source/access.cxx:709
warn:cui.options:19023:19023:cui/source/options/treeopt.cxx:2057: ExtensionsTabPage::DispatchAction(): exception of XDialogEventHandler::callHandlerMethod() com.sun.star.beans.UnknownPropertyException message: mmalmau /home/julien/lo/libreoffice/configmgr/source/access.cxx:724
=>another bug
Comment 5 Julien Nabet 2021-07-24 10:05:09 UTC
(In reply to Julien Nabet from comment #4)
> ...
> I tried to disable the "Gerundismos" but it's not applied with this on
> console:
> warn:cui.options:19023:19023:cui/source/options/treeopt.cxx:2057:
> ExtensionsTabPage::DispatchAction(): exception of
> XDialogEventHandler::callHandlerMethod()
> com.sun.star.beans.UnknownPropertyException message: mmalmau
> /home/julien/lo/libreoffice/configmgr/source/access.cxx:709
> warn:cui.options:19023:19023:cui/source/options/treeopt.cxx:2057:
> ExtensionsTabPage::DispatchAction(): exception of
> XDialogEventHandler::callHandlerMethod()
> com.sun.star.beans.UnknownPropertyException message: mmalmau
> /home/julien/lo/libreoffice/configmgr/source/access.cxx:724
> =>another bug

For this one, I found the problem.
The keyword "mmalmau" ("bad or bad employment" according to Google Translate) is used everywhere, but the variable is declared as "malmau".
After looking at the git history of this file, I'm going to send an email to Olivier + Raimundo Moura + Fridrich Strba.
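
This kind of mismatch can be caught mechanically. Below is a minimal Python sketch, assuming that rule conditions reference options in the form option(LOCALE,"keyword") as in the rule quoted in comment 4. The helper name, the sample conditions and the declared list are hypothetical stand-ins, not the real contents of the pt_BR files.

```python
import re

def undeclared_options(rule_conditions, declared):
    """Return option keywords used in rule conditions but missing from `declared`."""
    used = set()
    for cond in rule_conditions:
        used.update(re.findall(r'option\(LOCALE,\s*"([^"]+)"\)', cond))
    return sorted(used - set(declared))

# Hypothetical sample data: one condition copied from comment 4,
# one made up to show the misspelled keyword.
CONDITIONS = [
    'not m.group(1) in excGquando and option(LOCALE,"numsep")',
    'option(LOCALE,"mmalmau")',
]
DECLARED = ["numsep", "malmau"]

print(undeclared_options(CONDITIONS, DECLARED))
```

A keyword reported here has no matching declared property, which is consistent with the UnknownPropertyException seen on the console.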
Comment 6 Olivier Hallot 2021-07-24 13:21:19 UTC
Summary: Many features have no equivalent in other languages' grammar checkers, and for some of them translating into other languages makes no sense to me. The only use I foresee is someone with a non-PT UI writing/checking a pt-BR document, and the tool requires good pt-BR knowledge anyway.

(It will also create noise among translators, who will likely not understand the strings to translate)

Long story:
<intro>
The grammar checking developed by Raimundo is one of the LibreOffice jewels that the pt-BR edition has. He implemented several grammar+linguistic style checks which added a lot of value to the grammar checker tool.

Among them are "pleonasms" ("climbing up", "create new", etc.), much appreciated actually. Another is "gerundism", a poor PT language style borrowed from English telemarketing handbooks: "I am going to be doing...", "I am going to be sending your contract", or such ugly things...

Portugal, Brazil and other pt-* countries agreed in 1990 on unifying the spelling of the Portuguese language (Acordo Ortográfico da Língua Portuguesa 1990). However, it sparked resistance in Portugal, and the agreement is poorly implemented outside Brazil. Brazilians adopted the Agreement very quickly and massively, which made Raimundo's contribution one of the good things about LibreOffice for Brazilians.
</intro>

I have no details of Raimundo's implementation; my best guess is that he took code from another language and adapted it. The mix of "numsep" with "gerundism" looks like one such case. Refactoring the code for better readability is beyond my skills/availability, but it could be an easy hack for pt-* Python coders.

I also noticed that the Python version used in LO is creating warnings on the console (bug#139523).

For the record, "malmau" refers to "mal" or "mau" (en: evil/bad or fr: mal/mauvais), a typical mistake in pt and a much less frequent one in other languages.

Plus... PT and ES have two verbs for "to be": "ser" and "estar".

My opinion: won't fix.
Comment 7 Julien Nabet 2021-07-25 08:46:20 UTC
(In reply to Olivier Hallot from comment #6)
> Summary: Many features have no equivalent in the other languages grammar
> checkers and some makes no sense to me to translate for other languages. The
> only usage I foresee is someone who use non-PT ui writing/checking a pt-BR
> document, and the tool requires good pt-BR knowledge anyways.
> 
> (It will also create noise among translators, who will likely not understand
> the strings to translate)
> 
> ...
Thank you Olivier for this detailed feedback.
First, just to be sure I'm not misunderstood: my goal wasn't to denigrate the Brazilian Portuguese grammar checking. Grammar checking can be quite complicated to deal with (at least that is the case for French), so I know these tools required time and expertise to work right.

About keywords unrelated to the grammar rule they control, like "numsep": I think I can fix these so that they use an appropriate keyword. First I must find out how to commit to a git submodule, since this lives in the "dictionaries" part.
I've also sent an email to László Németh because I don't know if there's an upstream to fix (it seems not, considering https://cgit.freedesktop.org/libreoffice/lightproof/log/).

About "ser" and "estar", yes I knew them for Spanish (I never forgot the difference between "ser loco" and "estar loco" from a teacher, long time ago! :-)).

About the translation itself, and to come back to the bug report: do you mean the Russian and Hungarian spelling parts should stay in Russian and Hungarian respectively, whatever the UI language?
Comment 8 Olivier Hallot 2021-07-26 17:15:44 UTC
(In reply to Julien Nabet from comment #7)

> About translation itself and to come back at the bugtracker, do you mean
> Russian and Hungarian spelling part should stay respectively in Russian and
> Hungarian whatever the language UI used?

Attached a properties file in any case. To test, just add it to 

{instdir}/share/extensions/dict-pt-BR/dialog/

and use en-US UI.
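
For anyone scripting this test step, here is a minimal Python sketch, assuming the attachment has been saved locally as pt_BR_en_US.properties (the name proposed in comment 1). The function name is hypothetical, and {instdir} remains a placeholder for the actual installation directory.

```python
import shutil
from pathlib import Path

def install_properties(properties_file, instdir):
    """Copy a .properties file into the pt-BR dictionary's dialog directory."""
    target_dir = Path(instdir) / "share" / "extensions" / "dict-pt-BR" / "dialog"
    if not target_dir.is_dir():
        # Refuse to create the directory: its absence suggests a wrong instdir.
        raise FileNotFoundError(f"dialog directory not found: {target_dir}")
    # shutil.copy returns the path of the newly created file.
    return Path(shutil.copy(properties_file, target_dir))
```

Calling install_properties("pt_BR_en_US.properties", instdir) with the real installation directory and then starting LO with the en-US UI should reproduce the manual test.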
Comment 9 Olivier Hallot 2021-07-26 17:16:49 UTC
Created attachment 173865
pt-BR to en-US Lightproof UI strings
Comment 10 Julien Nabet 2021-07-26 18:45:45 UTC
(In reply to Olivier Hallot from comment #8)
> (In reply to Julien Nabet from comment #7)
> 
> > About translation itself and to come back at the bugtracker, do you mean
> > Russian and Hungarian spelling part should stay respectively in Russian and
> > Hungarian whatever the language UI used?
> 
> Attached a properties file in any case. To test, just add it to 
> 
> {instdir}/share/extensions/dict-pt-BR/dialog/
> 
> and use en-US UI.

It works great! Do you want to submit it on gerrit (together with the change from https://bugs.documentfoundation.org/show_bug.cgi?id=143507#c2), or would you prefer that I do it? (Either is fine with me.)
Comment 12 Commit Notification 2021-07-27 13:46:16 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/dictionaries/commit/66dd2540ba4e4a6442de40edf6b27895f94d9958

tdf#143507: localize Grammar Checking (Portuguese) settings page