Description:
I'm using LibreOffice 6.4.3.2 (the latest Ubuntu stable version) in the Spanish/Spain (es-ES) locale and I'm seeing a very strange bug when changing the language. I'll try to be as clear as possible in the description.

To change the language you have to open the Character dialog on the Font tab (hopefully I'm guessing the names right), where there's a combobox to select the language. It used to be that one could type the first few letters of the desired language and it would autocomplete to the first matching language in the list. For the most part it still works, but there is a catch: if you type "in" as the first two letters, it gets changed to "id" for some reason. Since English is written "Inglés" in Spanish, this is rather annoying. The change to "id" happens regardless of case. By experimenting I have found another weird change: "iw" gets changed to "he".

In practical terms, this means that you can't pick English as a language in the Spanish locale by using the keyboard to search the list. It works OK when using the mouse, though.

This is a fairly recent regression (probably introduced by the update from Ubuntu 18.04 LTS to 20.04). I have reproduced the bug after restarting in safe mode, as well as with LC_ALL=C to reset LO to the default English locale. The same substitutions happen.

Steps to Reproduce:
1. Select some text
2. Right click > Character > Font tab and focus the language selector
3. Type "IN" or "IW"

Actual Results:
The "IN"/"IW" input gets changed to "id"/"he" respectively

Expected Results:
The "IN"/"IW" input remains and auto-selects the first matching language

Reproducible: Always

User Profile Reset: Yes

Additional Info:
Version: 6.4.3.2
Build ID: 1:6.4.3-0ubuntu0.20.04.1
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3;
Locale: es-ES (es_ES.UTF-8); UI-Language: es-ES
Calc: threaded
Xisco, could you check this? I don't have the Spanish UI.
Dieter, The bug was also triggered with LC_ALL=C, apparently it is not locale-dependent. Just select some text > right click > Character > Character... then click on the Language combobox entry area. Delete the contents and try to write "IN" or "IW". That triggers the bug for me.
(In reply to andvaranaut@gmail.com from comment #2)
> Dieter,
>
> The bug was also triggered with LC_ALL=C, apparently it is not
> locale-dependent.
>
> Just select some text > right click > Character > Character... then click on
> the Language combobox entry area. Delete the contents and try to write "IN"
> or "IW". That triggers the bug for me.

Thanks for the clarification. I tested with the wrong dialog.

My results:
in = Indonesia (expected)
iw = he (not expected)

Expected result:
Perhaps the same behaviour as in Tools => Options => Language Settings => Languages: typing iw gives the result Walloon (because there is no language with iw at the beginning).
(In reply to andvaranaut@gmail.com from comment #0)
> This is a fairly recent regression (probably on the update from Ubuntu 18.04
> LTS to 20.04).

I can (mostly) reproduce with 6.2.8 on Windows:

Version: 6.2.8.2 (x64)
Build ID: f82ddfca21ebc1e222a662a32b25c0c9d20169ee
CPU threads: 2; OS: Windows 10.0; UI render: default; VCL: win;
Locale: zh-CN (zh_CN); UI-Language: en-US
Calc: threaded

I will test earlier versions later.

> Actual Results:
> The "IN"/"IW" input gets changed to "id"/"he" respectively
>
> Expected Results:
> The "IN"/"IW" input remains and auto-selects the first matching language

Like Dieter, when I type the two letters "in" it autocompletes to "Indonesia"; however, if I then press the Backspace key, it deletes the "donesia" part and the text changes to "id". Since Indonesia's ISO code (called BCP47 or something) is "id", I suspect there is some mix-up between the names and the ISO codes.

For "iw" I reproduce the reported behavior of changing to "he".
Also reproducible with 5.2.7 (the oldest version I have):

Version: 5.2.7.2 (x64)
Build ID: 2b7f1e640c46ceb28adf43ee075a6e8b8439ed10
CPU Threads: 2; OS Version: Windows 6.19; UI Render: default;
Locale: zh-CN (zh_CN); Calc: group

As 5.2.7 should be older than Ubuntu 18.04, I doubt this is a regression as the reporter claimed.

It's also not limited to the Format Characters dialog; the Font tab of the Paragraph Style dialog has the same problem.
Ming Hua,

While in my experience the behavior is a regression (I noticed it because something I used to be able to do stopped working), the underlying cause may well not be one. If your hunch about the confusion between codes and names is correct, then there is some chance that the locale does play into it. I have rechecked, however, and I'm definitely seeing the IN->ID change with both the es_ES and the C locale (set by invoking LC_ALL=C lowriter) in my current version (6.4.6.2 from Ubuntu 20.04), even though "Indonesian" is an option in the dropdown.
Still present in

Version: 7.5.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 52c75986adc2b370eb55ce918ab1db0a95831c83
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (de_DE); UI: en-GB
Calc: CL threaded

Steps
1. Format -> Character -> Font tab
2. In the language field type "iw"

Actual result
Change to "he"

Expected result
Perhaps the same behaviour as in Tools => Options => Language Settings => Languages: typing iw gives the result Walloon (because there is no language with iw at the beginning).
It is done on purpose in https://github.com/LibreOffice/core/blame/master/svx/source/dialog/langbox.cxx#L515
(In reply to Andreas Heinisch from comment #9)
> It is done on purpose in
> https://github.com/LibreOffice/core/blame/master/svx/source/dialog/langbox.
> cxx#L515

I see. Many thanks for figuring this out.

I can see the idea behind the substitution (allowing users to choose a language by entering its BCP47 abbreviation while always using the preferred code); however, I would say that the user experience is dismal. Besides not being easily (or at all) discoverable, changing user input in surprising ways is kind of a big no-no, particularly while text is being entered.

I would suggest not making the substitution right away, but only once the user has finished entering text, meaning either on blur or on save. Given that the functionality is pretty obscure, I think that doing it on save would be the best option.

That would mean removing lines 518-523 and moving them to the dialog saving code (I guess somewhere in SvxLanguageBox::SaveEditedAsEntry, perhaps right after the initial m_eEditedAndValid check), along with a call to LanguageTag::isValidBcp47 like the one in line 516. The current validity checks should stay as they are, to give the user visual feedback on whether what is entered in the combobox is valid, but the substitution would not happen until saving.
I'm against moving the language tag canonicalization/substitution to Save, because changing the input without any visible feedback would make it even more obscure. It is also possible for the user to continue entering a more complex language tag string, which might end up wrong if canonicalization did not kick in early.

Also note that this bug talks about two different things. One is typing "in", which should match "Inglés" in the Spanish UI but for some reason did not, and was changed to "id" instead; I have no idea offhand why, because BCP47 language tag recognition only kicks in when no matching language list entry was found. Once language tag recognition is hit, though, the old ISO 639-1 code "in" is correctly changed to "id" for Indonesian; there is no "in" code anymore.

The other is the "iw" -> "he" Hebrew substitution, which is expected because ISO 639-1 changed the code and there is no "iw" code anymore.
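The substitution discussed in this comment can be pictured as a lookup of deprecated two-letter codes. The following is only a minimal Python sketch, not the LibreOffice implementation (which goes through LanguageTag::isValidBcp47 in C++). The names `DEPRECATED_CODES` and `canonicalize` are hypothetical, and the table holds only the two pairs mentioned in this report.

```python
# Hypothetical model of deprecated-code canonicalization; not LibreOffice code.
# Only the two pairs reported in this bug are listed.
DEPRECATED_CODES = {
    "in": "id",  # Indonesian
    "iw": "he",  # Hebrew
}


def canonicalize(tag: str) -> str:
    """Return the preferred code for a deprecated one, ignoring case.

    Anything that is not a known deprecated code is returned unchanged.
    """
    return DEPRECATED_CODES.get(tag.lower(), tag)
```

With this model, `canonicalize("IN")` gives `"id"` regardless of case, which matches the behaviour described in the original report, while an unrelated code such as `"de"` passes through untouched.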
> Expected result > Perhaps like behaviour in Tools => Options => Language Settings => > Languages: typing iw gives result Walloon (because there is no language with > iw at the beginning We may stick to this behaviour?
(In reply to Eike Rathke from comment #11)
> Also note that this bug talked about two different things, one is typing
> "in" that should match "Inglés" in Spanish UI and for some reason did not
> and was changed to "id" instead; no adhoc idea why because BCP47 language
> tag recognition only kicks in when no matching language list entry was
> found. Once language tag recognition was hit though then the old ISO 639-1
> code "in" is correctly changed to "id" for Indonesian, there is no "in" code.

I can see how what you describe would solve all issues - however, the behavior you describe does not match what is actually happening.

I'm not familiar with the inner workings of the text widget, but I would assume that the problem is either that the BCP47 tag recognition is triggered before checking for a matching language in the listbox, or that the text added by the listbox matching (e.g. if I type "I" it completes to "Ilocano" with "locano" selected) is somehow not taken into account at some point in the process.

I'm assuming there's no way that LanguageTag::isValidBcp47 could convert "Inglés (Australia)" (the first matching entry for "in") to "id", right? (I would expect the function to extract everything up to the first hyphen, if any, and then try to match it against the BCP47 codes.) In that case, the only reasonable explanation is that aStr does not contain the completed language.
(In reply to Eike Rathke from comment #11)

Ok, I have played some more with the widget and have a theory, although I would need somebody more versed in the LO codebase to weigh in.

> "in" that should match "Inglés" in Spanish UI and for some reason did not
> and was changed to "id" instead; no adhoc idea why because BCP47 language

I started by checking the behavior of the language selector with the English locale (LC_ALL=C) and it's the same - typing 'in' gets immediately changed to 'id' even though it should match "Indonesian" by the same logic. But if you manage to enter "Ind" (e.g. by copying and pasting it), it does get autocompleted to "Indonesian". This means that the Spanish locale is not part of the problem.

I then tried to follow the overall flow of the code, but sadly I'm nowhere near proficient enough in C++ to understand where the combobox logic is defined (in particular, I would think that rControl.find_text(aStr) has to return -1 for the BCP47 substitution to happen, but I can't find the definition of find_text anywhere).

However, that gave me an idea: what might be happening is that the text modification that is part of the autofilling triggers two different edit events, one with the full text and one without. So when you enter "in" it does get autocompleted to "Indonesian", "Inglés" or whatever, but internally that triggers another change event with just "in" as the contents, which then gets changed to "id".

You can reproduce a very similar behavior by doing the following:
1) Type _in in the combobox (does not exist)
2) Select the underscore
3) Type anything, even another underscore
4) Whatever you typed gets moved to the end of the string and 'in' again gets changed to 'id'. So if you typed another underscore you would see 'id_'.

To me, that implies that the substitution is not atomic - there is a moment where the selected text is deleted before being replaced with whatever you type next, and that triggers a change event where the corresponding logic sees 'in' and changes it to 'id'. Maybe the autofilling works in a similar way? If that is a (relatively) recent change/regression in the widget behavior, it would explain why I had the impression that the change was recent, even though the underlying code seems to have had no significant changes in 7+ years.

(PS: You can use anything else instead of an underscore. I used the underscore because a surprising number of letters form a valid three-letter BCP47 code when followed by 'in', and I didn't want the code to be valid just in case, but you can put anything you want and the same happens.)
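The "not atomic" hypothesis above can be illustrated with a tiny model of typing over a selection. This Python sketch is an assumption-laden toy, not the widget code. `change_handler` and `type_over_selection` are hypothetical names, and the rule that the cursor jumps to the end of the text after the handler rewrites it is an assumption made to match the reported 'id_' result.

```python
# Toy model of a non-atomic "replace selection" edit; not LibreOffice or GTK code.
DEPRECATED = {"in": "id", "iw": "he"}  # pairs reported in this bug


def change_handler(text: str) -> str:
    """Stand-in for the change handler: rewrite deprecated codes, else keep text."""
    return DEPRECATED.get(text.lower(), text)


def type_over_selection(text: str, sel_start: int, sel_end: int, typed: str) -> str:
    """Replace text[sel_start:sel_end] with `typed` in two steps.

    A change event fires after the deletion and again after the insertion.
    """
    # Step 1: the selection is deleted and a change event fires.
    after_delete = text[:sel_start] + text[sel_end:]
    handled = change_handler(after_delete)
    # Assumption: if the handler rewrote the text, the cursor ends up at the end;
    # otherwise it stays where the selection started.
    cursor = len(handled) if handled != after_delete else sel_start
    # Step 2: the typed text is inserted at the cursor and a second event fires.
    after_insert = handled[:cursor] + typed + handled[cursor:]
    return change_handler(after_insert)
```

In this model, `type_over_selection("_in", 0, 1, "_")` passes through the intermediate text "in", which the handler rewrites to "id" before the new underscore is inserted, giving "id_" as in step 4 of the reproduction.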
SvxLanguageBox::ChangeHdl() is called for the first character typed (here "i"), and rControl.find_text(aStr) returns -1 because (anonymous namespace)::GtkInstanceComboBox::find_text(), via find_text_including_mru(), searches for an exact match, not for a match at the start of the string. The following LanguageTag::isValidBcp47() call then finds that "i" is not valid, and the function returns. Next, (anonymous namespace)::GtkInstanceComboBox::auto_complete(), via idleAutoComplete(), selects the first matching entry, "Icelandic", and through the signal SvxLanguageBox::ChangeHdl() is called again, this time with GtkInstanceComboBox::find_text() finding the exact match.

The same first call of SvxLanguageBox::ChangeHdl() happens if "in" was typed, but in that round (no exact match) LanguageTag::isValidBcp47() does the canonicalization. There we would additionally need to look for a partial match at the start of the string if no full match was found, and select that.
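The call order described above can be modelled outside LibreOffice to make the difference between an exact match and a start-of-string match visible. This is a minimal Python sketch under stated assumptions. The entry list, the case-insensitive comparison, and the helper names (`find_exact`, `find_prefix`, `canonical_bcp47`, `on_change_current`, `on_change_with_prefix`) are all hypothetical, and `canonical_bcp47` is a very rough stand-in for LanguageTag::isValidBcp47() that knows only a handful of codes.

```python
from typing import Optional

# Hypothetical list entries and codes; the real list is much longer.
ENTRIES = ["Icelandic", "Ilocano", "Indonesian"]
DEPRECATED = {"in": "id", "iw": "he"}  # pairs reported in this bug
KNOWN_CODES = {"id", "he", "is"}       # small illustrative subset


def find_exact(text: str) -> int:
    """Index of the entry equal to `text` (ignoring case), or -1."""
    for i, entry in enumerate(ENTRIES):
        if entry.lower() == text.lower():
            return i
    return -1


def find_prefix(text: str) -> int:
    """Index of the first entry starting with `text` (ignoring case), or -1."""
    if not text:
        return -1
    for i, entry in enumerate(ENTRIES):
        if entry.lower().startswith(text.lower()):
            return i
    return -1


def canonical_bcp47(text: str) -> Optional[str]:
    """Rough stand-in for tag validation: canonical code, or None if invalid."""
    lowered = text.lower()
    if lowered in DEPRECATED:
        return DEPRECATED[lowered]
    return lowered if lowered in KNOWN_CODES else None


def on_change_current(text: str) -> str:
    """Order described in the comment: exact match, then canonicalization."""
    if find_exact(text) != -1:
        return text
    tag = canonical_bcp47(text)
    return tag if tag is not None else text


def on_change_with_prefix(text: str) -> str:
    """Suggested order: exact match, then prefix match, then canonicalization."""
    if find_exact(text) != -1:
        return text
    pos = find_prefix(text)
    if pos != -1:
        return ENTRIES[pos]
    tag = canonical_bcp47(text)
    return tag if tag is not None else text
```

In this model `on_change_current("in")` yields "id", while `on_change_with_prefix("in")` yields "Indonesian". Typing "iw" still becomes "he" in both variants, because no entry starts with those letters.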
This turned out to be more complicated than expected. My current stab at it works partially, but not fully. I pushed it to Gerrit so as not to lose it; maybe someone has an idea (the quirks are described in the commit message). If interested, see https://gerrit.libreoffice.org/c/core/+/180966
Many thanks for tackling the bug, and sorry it's taking longer than anticipated.

Since you're asking for suggestions... I had one, but I refrained from mentioning it earlier because I don't really know what I'm talking about :)

In a nutshell: would it be possible to call (anonymous namespace)::GtkInstanceComboBox::auto_complete() (in other words, force an autocompletion) before attempting to perform the substitution? The cleanest way I can see it working is forcing an autocompletion either before the rControl.find_text(aStr); call in line 474, or right after it, and then fetching the value of rControl.get_active_text() again.

Calling get_active_text() both before and after would allow you to check whether the new value differs from the previous value of aStr, so you know whether an autocompletion has taken place, in case you need that to avoid infinite loops and such. At first glance it probably isn't necessary - I think the worst that could happen if you don't account for this is that the validation logic would run twice with the autocompleted text. But if all autocompletions trigger a change event, you could just return when the new and previous values of get_active_text() differ - the new change event with the fully autocompleted text will do its thing.

This depends on a number of things which I'm not sure apply, most notably auto_complete being synchronous and callable in that situation. But maybe it can help you work around the issue?

The main issue I can see with the idea as explained is that it would not be possible to enter 'in' by itself - it would always get autocompleted if the autocompletion is unconditional. But since idleAutoComplete() can apparently distinguish whether to trigger an autocompletion or not (perhaps skipping it when the user is deleting text, or something similar), maybe the same logic can be used here.
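The suggestion above (force an autocompletion, compare the text before and after, and only substitute if nothing changed) can be sketched as follows. This is a hedged Python model, not LibreOffice code. `ComboModel` is a hypothetical stand-in for the combobox, and it assumes a synchronous `auto_complete()` that can be called from the change handler, which is exactly the open question raised in the comment.

```python
from typing import Dict, List


class ComboModel:
    """Hypothetical combobox model with synchronous prefix autocompletion."""

    def __init__(self, entries: List[str]) -> None:
        self.entries = entries
        self.text = ""

    def get_active_text(self) -> str:
        return self.text

    def auto_complete(self) -> None:
        """Replace the text with the first entry that starts with it, if any."""
        typed = self.text.lower()
        if not typed:
            return
        for entry in self.entries:
            if entry.lower().startswith(typed):
                self.text = entry
                return


def change_handler(combo: ComboModel, deprecated: Dict[str, str]) -> str:
    """Autocomplete first; canonicalize only if autocompletion changed nothing."""
    before = combo.get_active_text()
    combo.auto_complete()
    after = combo.get_active_text()
    if after != before:
        # In the real widget a follow-up change event would handle the new text.
        return "autocompleted"
    replacement = deprecated.get(before.lower())
    if replacement is not None:
        combo.text = replacement
        return "canonicalized"
    return "unchanged"
```

In this model, typing "in" into a list containing "Indonesian" is autocompleted rather than rewritten, and "iw" is still canonicalized to "he". It also shows the drawback mentioned in the comment: with unconditional autocompletion there is no way to keep a bare "in" in the field.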