Bug 87834 - strange behavior of mid()-function
Summary: strange behavior of mid()-function
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: BASIC (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: Other All
: medium normal
Assignee: Julien Nabet
URL:
Whiteboard: target:4.5.0 target:4.4.2
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-29 17:02 UTC by Michael Büssow
Modified: 2015-02-12 11:59 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Büssow 2014-12-29 17:02:44 UTC
Hello,

this is my first post - please be gentle if I do something wrong.
During programming I found that the mid()-function changes its bevahior. In LO 3 mid(Text T, Start) results an error if Start=0 and an empty string if Start is negative or Start is larger then the length of the text T.
Now, tested in LO 4.2 and 4.3, mid(Text T, Start) results "!!br0ken!!" if Start is negative or if Start is larger then the length of the string +1.
Examples:
mid("test",4) -> "t"
mid("test",5) -> ""
mid("test",6) -> "!!br0ken!!"
mid("test",-1) -> "!!br0ken!!"
mid("test",0) -> error
In the "Open Document Format for Office Applications (OpenDocument) Version 1.2" I found that the "normal" behavior should be:
Returns the characters from T, starting at character position Start, for up to Length characters. For the integer conversions, Start=INT(Start), and Length=INT(Length). If there are less than Length characters starting at start, it returns as many characters as it can beginning with Start. In particular, if Start > LEN(T), it returns the empty string (""). If Start < 0, it returns an Error. If Start >=0, and Length=0, it returns the empty string. Note that MID(T;1;Length) produces the same results as LEFT(T;Length).
Is the "new" behavior intended or is it a bug?
Comment 1 Robinson Tryon (qubit) 2014-12-29 22:18:22 UTC
CC'ing Noel for advice on BASIC behavior.
Comment 2 raal 2014-12-30 07:50:43 UTC
Duplicate of bug 86386?
Comment 3 Michael Büssow 2014-12-30 10:17:29 UTC
(In reply to raal from comment #2)
> Duplicate of bug 86386?

Hello,

I think it's similar to bug 86386. But it seems to me that there the focus is on negative length and to expand the functionality of the mid().
Bug 73771 (on-standard behaviour for Function Mid) shows the same as I intended. There is said, that it is fixed within LibO 4.1.5, but it isn't - or am I wrong?
Comment 4 raal 2014-12-30 12:19:00 UTC
Julien,
could you look at this? You've fixed bug 73771. Thanks
Comment 5 Julien Nabet 2014-12-30 12:38:21 UTC
Will take a look.
Comment 6 Michael Büssow 2014-12-30 13:25:06 UTC
I took a look at http://cgit.freedesktop.org/libreoffice/core/tree/basic/source/runtime/methods.cxx
and I believe the copy()-method is responsible for this behavior.
Maybe a additional line with
if (nStartPos<0 || nStartPos>aArgStr.getLength())
{
nStartPos=aArgStr.getLength()+1
}
solves the problem.
Comment 7 Michael Büssow 2014-12-30 13:27:13 UTC
Sorry I forgot the line number: between 1211 and 1212
Comment 8 Julien Nabet 2014-12-30 17:03:21 UTC
Michael: would you have Excel to compare the different result of mid tests?
Indeed, I don't know if we must only follow ODF or if we must also take into account Excel (since VB is from Microsoft)

Eike: since you had also review my patch and work on Calc thought you might be interested in this bugtracker.

(just for ref: I had also put some unittests here)
Comment 9 Julien Nabet 2014-12-30 17:30:07 UTC
It seems there are several cases to take into account:
- mid with 2 args (I didn't know it existed, since I just had used "mid" in function, not in macro)
- mid with 3 args
- mid$
- Excel compatibility
- bWrite parameter that I don't understand
Badfully, I'm not enough "fluent" in Calc/VB to fix this.
Comment 10 Michael Büssow 2014-12-30 18:11:27 UTC
Julian: I just tested the mid()-function under MS Excel (Version 2003; but the vb description shows the same):
In VB mid(text, start [,count]) there are the following cases:
1) if count is <0 then error
2) if start is <1 then error
3) if count=0 or start>length(text) then "" (empty string)
4) if start+count>length(text) then as if count isn't used
In my eyes the problem in Calc is the result string "!!br0ken!!". So I would prefer (additional to the current state):
A1) if start>length(text) then "" (as in Calc version 3.x)
A2) if start<1 then error message (instead of "only" if start=0)

In the last case (start<1) we can discuss, if it would make more sense to set start=1 or set the result to "" to avoid any error message.

(The case with start=0 is a little bit confusing in ODF. There should be no error, but what should be the result, because the counting of letter begins with 1 as fas as I know ...)
Comment 11 Michael Büssow 2014-12-30 19:50:54 UTC
I just tested furthermore a little bit.
mid("test",7,1) results "" (as expected)
but
mid("test",7) results "!!br0ken!!" instead of ""
and 
mid("test",-7,1) results "!!br0ken!!" instead of an error message

It seems that the one problem (start>length) only occurs if mid() is used with only 2 parameters

mid() can be used with 2 or 3 parameters as function
or with up to 4 parameters as command (and I believe it is all inside the RTLFUNC(Mid))
mid(text,start [,count]) as function or command results a substring
mid(text,start [,count], newtext) as command results a string where in the string text the substring of count letters, beginning from start is substitute by newstring
e. g. 
t="the last year"
mid t,5,4,"new"
msgbox (t)
results a messagebox with "the new year"
Comment 12 Julien Nabet 2014-12-30 20:56:12 UTC
I submitted this patch for review:
https://gerrit.libreoffice.org/13707
Comment 13 Michael Büssow 2014-12-30 22:14:42 UTC
Great job!
I'm fascinated and i believe it is a real improvement. Tank you, Julian.
Comment 14 Commit Notification 2015-01-23 20:34:30 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b7d02d7a5a374da3e01e0abc4022fba35daa1840

Resolves fdo#87834: strange behavior of mid()-function

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 15 Commit Notification 2015-01-23 20:34:33 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4e00df4f0b05b3d6c8f185e08499d40964e7182f

Related fdo#87834: add 2 unit tests

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 16 Commit Notification 2015-02-12 11:59:26 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=82a505aae7fdb532e250b6acce1d33431cedfa88&h=libreoffice-4-4

Resolves fdo#87834: strange behavior of mid()-function

It will be available in 4.4.2.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.