Created attachment 122772 [details] Demonstration Writer document with macro The cInt function treats string "1,234" as being 1. This is true on 3 different operating systems on 2 different PCs, all of which are defined as being in the EN GB locale using '.' as the decimal point in the language settings (I have double-checked this). I have tested this on: ubuntu 14.04 64 bit, Version: 5.1.1.1 Build ID: 1:5.1.1~rc1-0ubuntu1~trusty1 CPU Threads: 8; OS Version: Linux 4.2; UI Render: default; Locale: en-GB (en_GB.UTF-8) (from ubuntu pre-release PPA) Windows 7 (on same PC) Version 5.0.4.2 (from Libreoffice website) xubuntu 15.04 (a 64 bit laptop), Version 5.0.5.2 Build ID: 1.5.0.5-rc2-0ubuntu2 Locale en:GB (en GB.UTF-8) (from ubuntu release PPA) I don't how far back this issue goes. It's easy to work-around, just use format to remove the ','s, but could fool people for a long time if in the middle of a complex algorithm. I attach a Writer document demonstrating this. Press the button to see what "1,234" turns into. The macro is in 'Module1', 'Main', and just says: Sub Main Dim sText as String Dim iIntText as Integer sText = "1,234" iIntText = cInt (sText) MsgBox "1,234 becomes" & iIntText End Sub I originally found this using Basic in Base. I thought I'd demonstrate it in Writer just to show it was nothing to do with Base. Or am I being really stupid?
Repro. Win 7 Pro 64-bit, Version: 5.1.0.3 (x64) Build ID: 5e3e00a007d9b3b6efb6797a8b8e57b51ab1f737 CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; Locale: fi-FI (fi_FI) LibreOffice 3.5.0rc3 Build ID: 7e68ba2-a744ebf-1f241b7-c506db1-7d53735
I have tried to verify that my understanding of the correct behaviour of cInt is correct. I've not found any absolutely definitive source, although several websites recommend using it as opposed to 'Val' because it tolerates regional variations better.
(In reply to tim from comment #2) > I have tried to verify that my understanding of the correct behaviour of > cInt is correct. I've not found any absolutely definitive source, although > several websites recommend using it as opposed to 'Val' because it tolerates > regional variations better. I just found this: http://www.csidata.com/custserv/onlinehelp/vbsdocs/vbs79.htm Which claims to be from Microsoft for Visual Basic. It says: "Use the CInt function to provide internationally aware conversions from any other data type to an Integer subtype. For example, different decimal separators are properly recognized depending on the locale setting of your system, as are different thousand separators."
** Please read this message in its entirety before responding ** To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present. If you have time, please do the following: Test to see if the bug is still present on a currently supported version of LibreOffice (5.2.5 or 5.3.0 https://www.libreoffice.org/download/ If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System Please DO NOT Update the version field Reply via email (please reply directly on the bug tracker) Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case) If you want to do more to help you can test to see if your issue is a REGRESSION. To do so: 1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) http://downloadarchive.documentfoundation.org/libreoffice/old/ 2. Test your bug 3. Leave a comment with your results. 4a. If the bug was present with 3.3 - set version to "inherited from OOo"; 4b. If the bug was not present in 3.3 - add "regression" to keyword Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team MassPing-UntouchedBug-20170306
The bug is still present on 64 bit ubuntu 16.10, librefoffice: Version: 5.3.1.1 Build ID: 1:5.3.1~rc1-0ubuntu1~yakkety0 CPU Threads: 8; OS Version: Linux 4.8; UI Render: default; VCL: x11; Layout Engine: new; Locale: en-GB (en_GB.UTF-8); Calc: group All Language Settings are set to GB.
** Please read this message in its entirety before responding ** To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present. If you have time, please do the following: Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/ If the bug is present, please leave a comment that includes the information from Help - About LibreOffice. If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice. Please DO NOT Update the version field Reply via email (please reply directly on the bug tracker) Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case) If you want to do more to help you can test to see if your issue is a REGRESSION. To do so: 1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/ 2. Test your bug 3. Leave a comment with your results. 4a. If the bug was present with 3.3 - set version to 'inherited from OOo'; 4b. If the bug was not present in 3.3 - add 'regression' to keyword Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team MassPing-UntouchedBug
Retested under Version: 6.0.2.1 Build ID: 1:6.0.2~rc1-0ubuntu0.17.10.1~lo1 CPU threads: 8; OS: Linux 4.13; UI render: GL; VCL: gtk2; Locale: en-GB (en_GB.UTF-8); Calc: group The behaviour is unchanged. 1,234 is still transformed to be 1 (rather than 1234), regardless of the fact that I am in the UK, with UK settings.
The bug is still present on: Version: 6.1.5.2 Build ID: 1:6.1.5~rc2-0ubuntu0.18.10.1~lo3 CPU threads: 8; OS: Linux 4.18; UI render: default; VCL: x11; Locale: en-GB (en_GB.UTF-8); Calc: group threaded
Andreas Heinisch committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/4fdc90c51e6a1bbb83c1f1826ad5b90dc1ff0ad6 tdf#97983 - Added localization for numeric types It will be available in 6.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Andreas: thanks for working on this. However, I still get "1,234 becomes1" when I press the button in the example document. Arch Linux 64-bit Version: 6.5.0.0.alpha0+ Build ID: 7e403195e574be5174815a51cf5c42f06f76a87a CPU threads: 8; OS: Linux 5.3; UI render: default; VCL: kf5; Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US Calc: threaded Built on 1 December 2019
Yes I see, I missed some of the functions without a standard parameter, I will send a follow up patch tomorrow. Thank you for the test!
Hm, just a stupid question: with your locale (Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US) this example should return 1 and for an an English locale the example will return 1234. So I think the patch is correct.
(In reply to Andreas Heinisch from comment #13) > Hm, just a stupid question: with your locale (Locale: fi-FI (fi_FI.UTF-8); > UI-Language: en-US) this example should return 1 and for an an English > locale the example will return 1234. So I think the patch is correct. Argh, sorry, you are right. I tested by running with LC_ALL=C and indeed I get "1,234 becomes1234" with en-US locale.
Created attachment 156252 [details] test cInt Please add an entry to the release notes, cause this change can affect existing basic code.
Thank you for the hint! Done: https://wiki.documentfoundation.org/ReleaseNotes/6.5#Base
If this bug was about the cInt() function specifically, why does the commit from comment #10 seem to be much more generic? Please correct me if I am wrong and it indeed is only about changing what cInt() does.
The patch considered all the functions which retrieve their value from ImpScan [1]. It changed all of the functions retrieving any numerical value (cLng, cByte, etc.). So the patch is more generic and not only for cInt. https://opengrok.libreoffice.org/xref/core/basic/source/sbx/sbxint.cxx?r=aef7feb3#707
But the patch also changed what it means, in a decimal-comma and grouping-period locale, to simply assign a string that contains a floating-point number to a variable that has been declared as Integer, even with no function involved. I think that is wrong. Previously, Dim i1 as Integer i1 = "1.2" Made i1 have the value 1. Now it has the value 12. That is IMHO a far too intrusive change and not good.
And if assigning to a variable declared Single, previously it got the value 1.2, and now then 12. This is particularly surprising and counter-intuitive in my humble opinion. It should be only the CInt etc functions that take the locale into consideration.
Created attachment 176442 [details] Another test document Look at the Basic code in this and compare how it works in LibreOffice 6 and 7.
Should we open a new bug report in order to change only the cInt, cLong etc? Or should we reopen this one?
Just for the record: Historic reference: OOo 1.0.3 used decimal dot irrespective to the locale, so it was not like "*originally* in StarOffice was so that plain assignment without any function used locale-specific conversion, then LO changed that to always use decimal-period, and then later that bug was filed and it was changed back" - or at least that then would be even earlier. Documentation reference: StarOffice 8 Programmierhandbuch für BASIC 8 [1] implicitly tells in its "Implicit and Explicit Type Conversions" section, that implicit conversions match explicit CFoo, as opposed to Val function. It is especially obvious checking the next "Checking the Content of Variables" section, where it's clear that IsNumeric and friends complement implicit conversions, like in the code (from the section): If IsNumeric(UserInput) Then ValidInput = UserInput Else ValidInput = 0 MsgBox "Error message." End If Note how this is user input-oriented; and how this would work with input like ".123", where on decimal-comma-using locales, IsNumeric(".123") is False. All in all, having this change in the released versions for some time already, it's very difficult to have a "fix" for the situation - with current implementation being *correct*, and older being de-facto standard for quite some time ... unfortunately. [1] See https://wiki.documentfoundation.org/Macros
The worst thing with the patch was that it also changed the unit test that verified the expected behaviour of implicit conversion from string to number. What is the point in having unit tests that verify some expected semantics if a patch can change those semantics (and break an unknown number of existing documents) as long as it also changes the unit test?
(In reply to Tor Lillqvist from comment #24) > The worst thing with the patch was that it also changed the unit test that > verified the expected behaviour of implicit conversion from string to > number. You are unfair to this commit. The unit test was added by myself in https://git.libreoffice.org/core/+/b24b0b9856b42eb6598c053703a11f86b6a6346c, and was intended to test a different thing. Changing it after inspecting its roots is just fine (but the change itself is indeed breaking...)
As the originator of this report (over 5 years ago) I'm feeling a little guilty for causing so much trouble. Sorry! ;-)
it is also fun that we did have unit tests already since 2013 that were supposed to verify the documented locale-specific behaviour of implicit conversion from strings to numbers (but which in fact did not work as documented). But the test were written in such a way that they actually tested was that implicit conversion and explicit conversion with CDbl() give the same result. Which they did also before this patch... (See basic/qa/basic_coverage/da-DK/cdbl.bas and basic/qa/basic_coverage/da-DK/cdbl-2.bas.) If I understand correctly...
I think discussing changes and even changing the underlying unit tests should be acceptable since a patch may change the current behaviour. This patch was added with the best of knowledge and belief and atm there were no regression reports in bugzilla. I don't know which behaviour should be targeted now, but I am open to any changes which we may decide.
(In reply to Andreas Heinisch from comment #28) In any case, converting "1.23" to 123 is wrong on *any* locale - simply because there's no locale where the group separator can split *two* rightmost decimals (in Indian system, only after separating three rightmost digits, the rest is split into groups of two). So on a decimal-comma thousand-dot locale, CInt("1.23") should bring 1, not 123; and CInt("1.234") should result in 1234. But indeed, it's a separate issue.
(In reply to Mike Kaganski from comment #29) OTOH, VBA seems to simply strip all locale's group separators, wherever they might appear (and even if there are two or more thousand separators in succession, like - for Russian locale using space - CInt"1 2 34 , 6" will give 1235). Please ignore this and the previous comments, they are just noise.
But for every comparison to VBA there will be people who say "why should we care what M$ does, LO Basic is not and has never been supposed to be 100% compatible with VBA". Which is true, of course.
Suggested revert and re-work to affect *only* the CInt, CLng, CByte, CSng, and CDbl functions in https://gerrit.libreoffice.org/c/core/+/125756
Please consider also this: Dim d As Double Dim s As String d = 1.23 s = d d = s MsgBox d Note that prior to the change the result was unexpectedly 1.
It is totally irrelevant whether how BASIC behaved before this change was "unexpected" or not. (To me, most of BASIC is "unexpected" and silly.) What is relevant is that it behaved in a certain way, for tens of years, and users wrote code accordingly. Whether the behaviour matched some old German language documentation that very few had read or not is irrelevant. Now after this change, the same code works differently.
(In reply to Tor Lillqvist from comment #34) I agree wrt breakage :-) What I write here is for the completeness of the whole picture. We have multiple problems in our Basic; and IMO trying to reject any improvements on the basis that it would be a breakage is not constructive per se; we need to allow some breakage potential *if there's a reasonable workaround*, like easy enough fix using compatible functions, if we ever want to allow some progress. We had a problem "similar" to this one (in its spirit at least; IMO at least ;)) - see https://mikekaganski.wordpress.com/2019/11/09/when-legacy-justifies-errors/. We were able to introduce a change, still maintaining backward compatibility. I doubt here we have some similar possibility, though. Tor: of course, it's up to you to do the change. Please don't feel blocked by my opinion.
Tor Lillqvist committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/47aabde053a1472dc32770da29d68c8de5bd7919 Make the tdf#97983 changes to BASIC optional It will be available in 7.4.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Having raised this 5 years ago, a change was made 2 to 3 years later, and then it went quiet until this year. I'm not into all the details, but it seems strange that what I imagined was standard practice in knowing which locale was in use, and therefore how , and . should be treated, has created so many problems. Is it really a BASIC problem? No matter. I'm happy to test something when a formal release comes out. But if it is 'optional' how do I specify what behaviour I expect?
(In reply to tim from comment #37) > Is it really a BASIC problem? Locales are hard. Really hard :) And having an inconsistent implementation of a widely-used function is really problematic. So it is ... well, it is our Basic problem, and compatibility problem, and general difficult topic problem :-) > But if it is 'optional' how do I specify what behaviour I expect? The change is created in such a way, so that the standard behavior is as it was before the change; and there is an expert configuration to enable old (pre-7.0) behavior for people who *really know what they do*. So for all standard uses, this commit does not change behavior.
(In reply to Mike Kaganski from comment #38) > (In reply to tim from comment #37) > > Is it really a BASIC problem? > > Locales are hard. Really hard :) And having an inconsistent implementation > of a widely-used function is really problematic. So it is ... well, it is > our Basic problem, and compatibility problem, and general difficult topic > problem :-) > > > But if it is 'optional' how do I specify what behaviour I expect? > > The change is created in such a way, so that the standard behavior is as it > was before the change; and there is an expert configuration to enable old > (pre-7.0) behavior for people who *really know what they do*. So for all > standard uses, this commit does not change behavior. Thanks. Please forgive me being dim, but which 'standard' behaviour? The behaviour before this latest change, or the behaviour before the change that was made in or around 2019 in response to my original query?
(In reply to tim from comment #39) > which 'standard' behaviour? The behaviour > before this latest change, or the behaviour before the change that was made > in or around 2019 in response to my original query? I used word "standard" to mean "the behavior that user sees after installing, without any special configuration". So my phrase was to explain what the "standard" behavior will be (it will be same as before the patch mentioned in commit 36). (Sorry for use of unclear language :)) To make it unambiguous: 1. Before commit 4fdc90c51e6a1bbb83c1f1826ad5b90dc1ff0ad6 (comment 10, landed in LO 7.0), cInt, cDbl, cFlt, and implicit conversion from string to number did not consider locale. 2. The mentioned commit changed all those to use current locale (and thus all conversions using those functions / implicit conversions became locale-aware, so might break). 3. Commit 47aabde053a1472dc32770da29d68c8de5bd7919 (comment 36) introduced an expert configuration, which is *not* enabled, but which, *when enabled*, makes *implicit conversion* of string into number locale-unaware again - so unless you enable it, the behavior will be as it was after #2; and if you enable it, the behavior will be as before #2 - *only for implicit conversion* (CDbl/CFlt will not be affected by the setting!). Hope it makes sense.
(In reply to Mike Kaganski from comment #40) > (In reply to tim from comment #39) > > which 'standard' behaviour? The behaviour > > before this latest change, or the behaviour before the change that was made > > in or around 2019 in response to my original query? > > I used word "standard" to mean "the behavior that user sees after > installing, without any special configuration". So my phrase was to explain > what the "standard" behavior will be (it will be same as before the patch > mentioned in commit 36). (Sorry for use of unclear language :)) > > To make it unambiguous: > > 1. Before commit 4fdc90c51e6a1bbb83c1f1826ad5b90dc1ff0ad6 (comment 10, > landed in LO 7.0), cInt, cDbl, cFlt, and implicit conversion from string to > number did not consider locale. > > 2. The mentioned commit changed all those to use current locale (and thus > all conversions using those functions / implicit conversions became > locale-aware, so might break). > > 3. Commit 47aabde053a1472dc32770da29d68c8de5bd7919 (comment 36) introduced > an expert configuration, which is *not* enabled, but which, *when enabled*, > makes *implicit conversion* of string into number locale-unaware again - so > unless you enable it, the behavior will be as it was after #2; and if you > enable it, the behavior will be as before #2 - *only for implicit > conversion* (CDbl/CFlt will not be affected by the setting!). > > Hope it makes sense. Thanks very much - all clear!