Bug 139523 - [GRAMMAR CHECKER] LightProof makes Python complain on FutureWarnings for pt-BR
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: unspecified (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Julien Nabet
URL:
Whiteboard: target:7.3.0
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-09 22:15 UTC by Olivier Hallot
Modified: 2021-09-26 19:08 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Description Olivier Hallot 2021-01-09 22:15:33 UTC
I get some FutureWarnings from Python when using LightProof for pt-BR in a master build. The messages are:


/home/tdf/git/core/instdir/share/extensions/dict-pt-BR/pythonpath/lightproof_impl_pt_BR.py:513: FutureWarning: Possible nested set at position 13

  vrenuncia = re.compile("(?i)\\b([Aa]|[[\xc0\xe0]|[Aa]nuncia|[Aa]nunciava|[Aa]nunciam|[Aa]nunciou|[Cc]om|[Cc]omo|[Cc]oncedeu|[Cc]uja|[Dd]a|[Dd]e|[Dd]esaprova|[Dd]oce|[Dd]upla|[Ee]|[\xc9\xe9]|[Ee]m|[Ee]ssa|[Ee]st\xfapida|[Ee]ventual|[Ff]azer|[Ff]requente|[Gg]enerosa|[Hh]ouve|[Ii]mediata|[Ii]mpliquem|[Ii]mporta|[Ii]mportar\xe1|[Mm]inha|[Nn]a|[Nn]ega|[Nn]ossa|[Nn]uma|[Oo]u|[Pp]ela|[Pp]equena|[Pp]osterior|[Pp]resumir|[Pp]romove|[Pp]romover|[Pp]ura|[Rr]epresentaram|[Ss]ignifique|[Ss]ua|[Tt]ua|[\xda\xfa]ltima|[\xda\xfa]nica|[Uu]ma) renuncia\\b")


/home/tdf/git/core/instdir/share/extensions/dict-pt-BR/pythonpath/lightproof_impl_pt_BR.py:516: FutureWarning: Possible nested set at position 13

  vdenuncia = re.compile("(?i)\\b([Aa]|[[\xc0\xe0]|[Aa]lguma|[Aa]p\xf3s|[Aa]presenta|[Aa]presentam|[Aa]presentar|[Aa]presentaram|[Aa]presente|[Aa]presentem|[Aa]presentou|[Aa]pura|[Aa]purando|[Aa]purar|[Aa]quela|[Aa]purou|[Aa]ssunto|[Cc]lara|[Cc]om|[Cc]omo|[Cc]onfirma|[Cc]onfirmam|[Cc]onforme|[Cc]ovarde|[Cc]uja|[Dd]a|[Dd]ar|[Dd]as|[Dd]e|[Dd]esmente|[Dd]esmentem|[Dd]esmentiu|[Dd]essa|[Dd]uma|[Ee]|[\xc9\xe9]|[Ee]m|[Ee]ncaminha|[Ee]ncaminham|[Ee]ncaminhou|[Ee]ngolir|[Ee]spec\xedfica|[Ee]ssa|[Ee]sta|[Ee]xista|[Ee]xistiu|[Ee]xistindo|[Ff]alsa|[Ff]ormaliza|[Ff]ormalizando|[Ff]ormalizaram|[Ff]ormalizou|[Ff]ormulada|[Gg]rande|[Gg]rave|[Hh]\xe1|[Hh]avia|[Hh]ouve|[Hh]ouver|[Ii]nexplicada|[Ii]ng\xeanua|[Ii]nvestiga|[Ii]nvestigam|[Ii]nvestigar|[Ii]nvestigava|[Jj]ulga|[Jj]ulgam|[Jj]ulgou|[Ll]evar|[Mm]as|[Mm]ediante|[Mm]eia|[Mm]uita|[Nn]a|[Nn]\xe3o|[Nn]enhuma|[Nn]ova|[Nn]uma|[Oo]ferece|[Oo]ferecer|[Oo]fereceu|[Oo]u|[Oo]utra|[Pp]ela|[Pp]or|[Pp]oss\xedvel|[Pp]reciosa|[Pp]resente|[Pp]rimeira|[Qq]ualquer|[Rr]ecebe|[Rr]eceberam|[Rr]eceberem|[Rr]ecebeu|[Ss]egunda|[Ss]egundo|[Ss]imples|[Ss]obre|[Ss]ua|[Tt]em|[Tt]err\xedvel|[Tt]oda|[Tt]remenda|[Uu]ma|[Vv]elada) denuncia\\b")

/home/tdf/git/core/instdir/share/extensions/dict-pt-BR/pythonpath/lightproof_impl_pt_BR.py:217: FutureWarning: Possible nested set at position 20
  i[0] = re.compile(i[0])
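
The third warning points at a generic `re.compile(i[0])` call, so the message alone does not show which rule is affected. One way to narrow it down, as a minimal sketch, is to turn FutureWarning into an exception and compile each pattern in turn. The helper name `find_ambiguous_patterns` and the `sample_rules` list are hypothetical stand-ins, not taken from `lightproof_impl_pt_BR.py`.

```python
import re
import warnings


def find_ambiguous_patterns(patterns):
    """Return (index, pattern, message) for every pattern that raises a FutureWarning."""
    offenders = []
    for index, pattern in enumerate(patterns):
        re.purge()  # clear re's cache so each pattern is really parsed again
        with warnings.catch_warnings():
            # Escalate FutureWarning to an exception, only inside this block.
            warnings.simplefilter("error", FutureWarning)
            try:
                re.compile(pattern)
            except FutureWarning as warning:
                offenders.append((index, pattern, str(warning)))
    return offenders


# Hypothetical stand-ins for the rule patterns compiled in the loop at line 217.
sample_rules = [
    "\\b[Uu]ma renuncia\\b",
    "\\b([Aa]|[[\xc0\xe0]) renuncia\\b",  # stray "[" opens the set with a literal "["
    "[Dd]essa",
]

for index, pattern, message in find_ambiguous_patterns(sample_rules):
    print(index, message)
```

Running the same helper over the real rule list would report the index of each pattern that needs attention.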

Is there a way to fix these warnings?
Comment 1 elmau 2021-07-29 15:42:02 UTC
The reason is:

Support for nested sets and set operations in regular expressions as in Unicode Technical Standard #18 might be added in the future. This would change the syntax. To facilitate this future change a FutureWarning will be raised in ambiguous cases for the time being. That include sets starting with a literal '[' or containing literal character sequences '--', '&&', '~~', and '||'. To avoid a warning, escape them with a backslash. (Contributed by Serhiy Storchaka in bpo-30349.)

https://docs.python.org/dev/whatsnew/3.7.html
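
The quoted rule can be checked directly. The sketch below compiles a few of the ambiguous sets named in the release note and their backslash-escaped counterparts, and collects any FutureWarning raised. The helper name `future_warnings` and the toy patterns are illustrative only. The `--` case is left out because it also interacts with range syntax.

```python
import re
import warnings


def future_warnings(pattern):
    """Compile pattern and return the FutureWarning messages it triggers."""
    re.purge()  # avoid a cached compile hiding the warning
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        re.compile(pattern)
    return [str(w.message) for w in caught if issubclass(w.category, FutureWarning)]


# Sets that start with a literal "[" or contain "&&", "~~", "||".
ambiguous = [r"[[a]", r"[a&&b]", r"[a~~b]", r"[a||b]"]
# The same sets with the ambiguous characters escaped.
escaped = [r"[\[a]", r"[a\&\&b]", r"[a\~\~b]", r"[a\|\|b]"]

for plain, safe in zip(ambiguous, escaped):
    print(plain, future_warnings(plain), "|", safe, future_warnings(safe))
```

Note that the escaping applies to characters inside the square brackets of a set. A `|` outside a set is the alternation operator and does not trigger the warning.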

so... 

vrenuncia = re.compile("(?i)\\b([Aa]\|\[[\xc0\xe0]|[Aa]nuncia|[Aa]nunciava|[Aa]nunciam|[Aa]nunciou|[Cc]om|[Cc]omo|[Cc]oncedeu|[Cc]uja|[Dd]a|[Dd]e|[Dd]esaprova|[Dd]oce|[Dd]upla|[Ee]|[\xc9\xe9]|[Ee]m|[Ee]ssa|[Ee]st\xfapida|[Ee]ventual|[Ff]azer|[Ff]requente|[Gg]enerosa|[Hh]ouve|[Ii]mediata|[Ii]mpliquem|[Ii]mporta|[Ii]mportar\xe1|[Mm]inha|[Nn]a|[Nn]ega|[Nn]ossa|[Nn]uma|[Oo]u|[Pp]ela|[Pp]equena|[Pp]osterior|[Pp]resumir|[Pp]romove|[Pp]romover|[Pp]ura|[Rr]epresentaram|[Ss]ignifique|[Ss]ua|[Tt]ua|[\xda\xfa]ltima|[\xda\xfa]nica|[Uu]ma) renuncia\\b")
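
A caution on the pattern above: `\|` outside a set turns the alternation bar into a literal `|`, and `\[` outside a set requires a literal `[`, so the first two alternatives merge into one that expects the text `a|[à`. The sketch below compares shortened, illustrative versions of the pattern (most alternatives dropped). It assumes two alternative rewrites that are not taken from the thread or from the committed patch: escaping the `[` inside the set, which keeps the original behaviour, and dropping the extra `[`, on the assumption that it is a typo.

```python
import re
import warnings


def compile_and_collect(pattern):
    """Return (compiled pattern, FutureWarning messages raised while compiling)."""
    re.purge()  # make sure the pattern is parsed again, not served from cache
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compiled = re.compile(pattern)
    messages = [str(w.message) for w in caught if issubclass(w.category, FutureWarning)]
    return compiled, messages


# Shortened stand-ins for the vrenuncia pattern; only three alternatives kept.
candidates = {
    "original": "(?i)\\b([Aa]|[[\xc0\xe0]|[Uu]ma) renuncia\\b",
    # Same regex text as the Comment 1 string, written with doubled backslashes.
    "comment_1": "(?i)\\b([Aa]\\|\\[[\xc0\xe0]|[Uu]ma) renuncia\\b",
    # Assumption: escape the "[" inside the set, keeping it as a set member.
    "bracket_escaped": "(?i)\\b([Aa]|[\\[\xc0\xe0]|[Uu]ma) renuncia\\b",
    # Assumption: the extra "[" is a typo and can simply be dropped.
    "bracket_removed": "(?i)\\b([Aa]|[\xc0\xe0]|[Uu]ma) renuncia\\b",
}

samples = ["a renuncia", "\xe0 renuncia", "uma renuncia"]

results = {}
for name, pattern in candidates.items():
    compiled, messages = compile_and_collect(pattern)
    matched = [s for s in samples if compiled.search(s)]
    results[name] = (messages, matched)
    print(name, messages, matched)
```

In this sketch only the original variant warns, and only the Comment 1 variant stops matching "a renuncia" and "à renuncia". Which rewrite fits the rule's intent is a decision for the dictionary maintainers.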
Comment 2 Julien Nabet 2021-07-30 07:29:18 UTC
I gave it a try with https://gerrit.libreoffice.org/c/dictionaries/+/119693
Comment 3 Commit Notification 2021-07-30 18:38:20 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/dictionaries/commit/8cd20580b995fc6d93fbb1cdabd329a26c6ec85c

tdf#139523: LightProof makes Python complain on FutureWarnings for pt-BR
Comment 4 Julien Nabet 2021-09-26 19:08:53 UTC
It should be ok now.
If I'm wrong or forgot something, don't hesitate to reopen this tracker.