Bug 142460 - Type of result of comparison of literals in Basic must be Boolean, not Integer
Summary: Type of result of comparison of literals in Basic must be Boolean, not Integer
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: BASIC (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: medium normal
Assignee: Baltasar
URL:
Whiteboard: target:7.3.0 target:7.2.0.0.beta2
Keywords: difficultyBeginner, easyHack, skillCpp, topicDebug
Depends on:
Blocks:
 
Reported: 2021-05-24 11:33 UTC by Mike Kaganski
Modified: 2021-09-07 08:04 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Kaganski 2021-05-24 11:33:14 UTC
This basic code:

  msgbox TypeName(1<2)
  msgbox TypeName("A"<"Z")
  A = "A"
  Z = "Z"
  msgbox TypeName(A < Z)

returns "Integer" in the first two case, and "Boolean" in the last one. The result type for comparison, when using literals, must be Boolean, as when comparing variables.

Code pointer: same as in tdf#142180 - see SbiExprNode::FoldConstantsBinaryNode in basic/source/comp/exprnode.cxx.

The fix must include a unit test.
Please CC me to your gerrit change, so that I could help and review.
Comment 1 Xisco Faulí 2021-06-09 07:33:16 UTC
Moving to NEW
Comment 2 Baltasar 2021-06-16 13:21:22 UTC
Could I take over this matter?
Comment 3 Baltasar 2021-06-16 13:21:51 UTC
Well, I've assigned this to myself. Hope there is no problem.
Comment 4 Mike Kaganski 2021-06-16 15:46:09 UTC
(In reply to Baltasar from comment #2)
> Could I take over this matter?

Yes, of course!

(In reply to Baltasar from comment #3)
> Well, I've assigned this to myself. Hope there is no problem.

Thank you for interest in this! Looking forward to see your contribution in gerrit :-)
Comment 5 Baltasar 2021-06-18 08:23:59 UTC
I'm following the code, and I think I'm starting to understand it. However, there is a matter that keeps nagging me all the time: eNodeType. It seems, from its definition, that the appropriate value is SbxNUMVAL. However, this does not fit very well with type SbxBOOL (or I think so). IS there a better fit?
Comment 6 Mike Kaganski 2021-06-18 08:27:48 UTC
(In reply to Baltasar from comment #5)

Please describe, if you have found the type of eNodeType, and the possible values of it. Also have you read the comments to the possible values (enumeration constants), and what they mean.
Comment 7 Baltasar 2021-06-18 09:48:08 UTC
Well, ok.

exprh.hh #71:

enum SbiNodeType {
    SbxNUMVAL,                      // nVal = value
    SbxSTRVAL,                      // aStrVal = value, before #i59791/#i45570: nStringId = value
    SbxVARVAL,                      // aVar = value
    SbxTYPEOF,                      // TypeOf ObjExpr Is Type
    SbxNODE,                        // Node
    SbxNEW,                         // new <type> expression
    SbxDUMMY
};

It seems that SbxNUMVAL is the chosen option, because there are actually only a few relevant value types: STR and NUM.

I suppose that the fact that the expression is currently typed as integer and not as boolean has to do with the type bool being a construction over integer (i.e., 0 == false, !0 == true).
Comment 8 Mike Kaganski 2021-06-18 09:53:45 UTC
(In reply to Baltasar from comment #7)
> enum SbiNodeType {
>     SbxNUMVAL,                      // nVal = value
>     SbxSTRVAL,                      // aStrVal = value, before
> #i59791/#i45570: nStringId = value
>     SbxVARVAL,                      // aVar = value
>     SbxTYPEOF,                      // TypeOf ObjExpr Is Type
>     SbxNODE,                        // Node
>     SbxNEW,                         // new <type> expression
>     SbxDUMMY
> };

Please note, that the semantics of *this* enum, as shown in the comments, is just to allow using correct variables: when eNodeType is SbxNUMVAL, we know to use nVal to read/write value; when it is SbxSTRVAL, we use aStrVal, etc.

> It seems that SbxNUMVAL is the chosen option, because there are actually
> only a few relevant value types: STR and NUM.

Right.

> I suppose that the fact that the expression is currently typed as integer
> and not as boolean has to do with the type bool being a construction over
> integer (i.e., 0 == false, !0 == true).

And this is not correct. Indeed, we use numeric constants for true and false; but that doesn't prevent us from declaring the Basic type to be boolean - and it is controlled by *another* variable, not eNodeType.
Comment 9 Baltasar 2021-06-22 10:08:37 UTC
Hi,

I've finished the task. Now the program

===

Sub Main
  msgbox TypeName(1<2)
  msgbox 1<2
  
  msgbox TypeName(2<1)
  msgbox 2<1
  
  msgbox TypeName(1.0<2.0)
  msgbox 1.0<2.0
  
  msgbox TypeName(2.0<1.0)
  msgbox 2.0<1.0  
  
  msgbox TypeName("A"<"Z")
  msgbox "A"<"Z"
  
  msgbox TypeName("Z"<"A")
  msgbox "Z"<"A"
  
  A = "A"
  Z = "Z"
  msgbox TypeName(A < Z)
  msgbox A<Z
  
  msgbox TypeName(Z < A)
  msgbox Z<A  
End Sub
===

Works correctly, showing "Boolean" and alternate "True"/"False".

I had to "touch" quote a lot more of source code than I expected, and would like to know whether this is acceptable.

First of all, I changed the lines at the code pointer given, changing "SbxINTEGER" for "SbxBOOL", and so on. This was straightforward and I think it was expected. For instance:

===
eType = SbxBOOL;
            eNodeType = SbxNUMVAL;
            int eRes = rr.compareTo( rl );
            switch( eTok )
            {
            case EQ:
                nVal = ( eRes == 0 ) ? SbxTRUE : SbxFALSE;
                break;
===

The problem I was referring to with eNodeType being NUMVAL, is that the whole Basic is expecting a string, or a double (which will be converted to integer when needed), and little else. No place for a completely separated type like a boolean, at least not without a lot of code changed.

After exprnode.cxx changed... nothing actually changed after executing the code, the type was Integer, and the results were 1 (not Boolean/True).

The first step was symbtbl.cxx, this code was crashing with a false assert:

===
short SbiStringPool::Add( double n, SbxDataType t )
{
    char buf[40]{};
    switch( t )
    {
        // tdf#131296 - store numeric value including its type character
        // See GetSuffixType in basic/source/comp/scanner.cxx for type characters
        case SbxINTEGER: snprintf( buf, sizeof(buf), "%d%%", static_cast<short>(n) ); break;
        case SbxLONG:
            snprintf( buf, sizeof(buf), "%" SAL_PRIdINT32 "&", static_cast<sal_Int32>(n) ); break;
        case SbxSINGLE:  snprintf( buf, sizeof(buf), "%.6g!", static_cast<float>(n) ); break;
        case SbxDOUBLE:  snprintf( buf, sizeof(buf), "%.16g", n ); break; // default processing in SbiRuntime::StepLOADNC - no type character
        case SbxCURRENCY: snprintf(buf, sizeof(buf), "%.16g@", n); break;
        default: assert(false); break; // should not happen
    }
    return Add( OUString::createFromAscii( buf ) );
}
===

The possibility of a boolean type was not taken into account here; again, there is an Add() for strings and an Add() for numeric values. At first, I tried to write "true" or "false", according to the numeric value. But the problem is that the whole process does not stop here. After building the AST in the parser, a set of opcodes is generated, and then these are executed. Probably the matter would've been simpler if the AST was directly executed. So what did I do? I invented another suffix, 'b' (which could be changed), and added another case to the switch:

case SbxBOOL: snprintf( buf, sizeof(buf), "%db", static_cast<short>(n) ); break;

When the opcodes are generated, and in a previous phase to the suffix conclusion, I had another problem here:

void SbiExprNode::Gen( SbiCodeGen& rGen, RecursiveMode eRecMode )
{
    sal_uInt16 nStringId;

    if( IsConstant() )
    {
        switch( GetType() )
        {
        case SbxEMPTY:
            rGen.Gen( SbiOpcode::EMPTY_ );
            break;
        case SbxSTRING:
            nStringId = rGen.GetParser()->aGblStrings.Add( aStrVal );
            rGen.Gen( SbiOpcode::SCONST_, nStringId );
            break;
        default:
            // tdf#131296 - generate SbiOpcode::NUMBER_ instead of SbiOpcode::CONST_
            // for SbxINTEGER and SbxLONG including their numeric value and its data type,
            // which will be restored in SbiRuntime::StepLOADNC.
            nStringId = rGen.GetParser()->aGblStrings.Add( nVal, eType );
            rGen.Gen( SbiOpcode::NUMBER_, nStringId );
            break;
        }
    }

Again, generate a string value or a numeric value, which is really limiting. Finally I decided to change nothing here (after the suffix conclusion), and go instead to SbiRuntime::StepLoadNC.

Basically, again a string or a numeric value are considered, and inside the numeric values, giving the termination, various types are inferred. Here I took the idea of adding (yet another) suffix. This is the final form:

===
void SbiRuntime::StepLOADNC( sal_uInt32 nOp1 )
{
    // #57844 use localized function
    OUString aStr = pImg->GetString( static_cast<short>( nOp1 ) );
    // also allow , !!!
    sal_Int32 iComma = aStr.indexOf(',');
    if( iComma >= 0 )
    {
        aStr = aStr.replaceAt(iComma, 1, ".");
    }

    sal_Int32 nParseEnd = 0;
    rtl_math_ConversionStatus eStatus = rtl_math_ConversionStatus_Ok;
    double n = ::rtl::math::stringToDouble( aStr, '.', ',', &eStatus, &nParseEnd );

    // tdf#131296 - retrieve data type put in SbiExprNode::Gen
    SbxDataType eType = SbxDOUBLE;

    if ( nParseEnd < aStr.getLength() )
    {
        switch ( aStr[nParseEnd] )
        {
            // See GetSuffixType in basic/source/comp/scanner.cxx for type characters
            case '%': eType = SbxINTEGER; break;
            case '&': eType = SbxLONG; break;
            case '!': eType = SbxSINGLE; break;
            case '@': eType = SbxCURRENCY; break;
            case 'b': eType = SbxBOOL; break;
        }
    }

    SbxVariable *p = new SbxVariable( eType );

    if ( eType == SbxBOOL ) {
        p->PutBool( n );
    } else {
        p->PutDouble( n );
    }

    // tdf#133913 - create variable with Variant/Type in order to prevent type conversion errors
    p->ResetFlag( SbxFlagBits::Fixed );
    PushVar( p );
}
===

The question is: is this acceptable? It works, and it works without deep changes in the basic's code. However, it feels to me more a hack than a fix...
Comment 10 Mike Kaganski 2021-06-22 10:52:06 UTC
(In reply to Baltasar from comment #9)

It's very nice to see your progress. Could you please post your results as a gerrit change, as described in https://wiki.documentfoundation.org/Development/gerrit ? There we could discuss the changes in detail - Bugzilla is not a place to discuss specific code changes. I just want to tell that it was expected that you only start at the code pointer given, and solve the resulting problems - possibly with a help from mentors answering specific questions :) Without looking deep into the code you pasted (it would be much more comfortable to do that in gerrit!), I can tell that I don't see something fundamentally broken in your description :) - thank you for looking at this!
Comment 11 Baltasar 2021-06-22 10:54:09 UTC
Sure! Glad this is not considered a hack, then. I'll probably have to pull and then upload by gerrit.

Thanks.
Comment 12 Commit Notification 2021-06-25 14:27:14 UTC
baltasarq committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5eedb3beeaeed88de0d1ebd041a9f15ceea7e78c

tdf#142460: properly handle boolean values in string pool

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 Commit Notification 2021-06-28 08:09:51 UTC
baltasarq committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/e7c5b2412446b4a5158b35745742389186856984

tdf#142460: properly handle boolean values in string pool

It will be available in 7.2.0.0.beta2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Andreas Heinisch 2021-09-07 08:04:50 UTC
Fixed in:

Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 0aea0cee58fe77a9058217dfdfc3d6a02b29ee2a
CPU threads: 6; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: en-US (de_DE); UI: en-US
Calc: CL