Bug 39316 - Data corruption in array formulas
Summary: Data corruption in array formulas
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
3.4.1 release
Hardware: Other All
: medium major
Assignee: Not Assigned
URL:
Whiteboard: target:4.5.0 4.4.1
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-17 17:32 UTC by James
Modified: 2015-12-22 01:32 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:


Attachments
sample file to show the reported bug (7.48 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-07-17 17:32 UTC, James
Details
demonstration of array formula corruption (96.14 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-07-26 17:47 UTC, James
Details
VLOOKUP and MATCH fail (81.40 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-07-26 20:53 UTC, James
Details

Note You need to log in before you can comment on or make changes to this bug.
Description James 2011-07-17 17:32:50 UTC
Created attachment 49229 [details]
sample file to show the reported bug

Bring up the sample file. On sheet 1, the 4 areas of interest are:

A1:B19 = The named range TestRange
C1:D19 = The array formula {=TestRange}
E1:F19 = The array formula {=ISBLANK(TestRange)}
G1:H19 = The array formula {=ISBLANK(C1:D19)}

You will see the bug immediately: cell G1 shows FALSE, when it should show TRUE. This same bug will also cause formulas to fail. Cell I1 contains the formula "=C1+5", which returns the error "Wrong data type".
Comment 1 Jeffrey 2011-07-17 18:31:17 UTC
Reproduced on LibreOffice 3.4  340m1(Build:103) for OpenSuse Linux.
Comment 2 James 2011-07-21 08:55:09 UTC
I wanted to add that this bug isn't really about ISBLANK. ISBLANK is just the first function I noticed the bug with. This bug is really about how array formulas mangle empty cells, if the cell is the first one in the range.

I wanted to add additional data points regarding the offending cell.

SUM(<offending cell>,5) correctly returns 5
=<offending cell>+5 gives a "Wrong data type" error
PRODUCT(<offending cell>,5) incorrectly returns 5!?!?
=<offending cell>*5 gives a "Wrong data type" error
ISTEXT() incorrectly returns TRUE
ISEVEN() incorrectly returns error 504 (see ISTEXT for why). It should return TRUE.
ISODD() has the same problem as ISEVEN().
AND() and OR() both incorrectly give a "Wrong data type" error. They should return FALSE.
ABS() gives "Wrong data type'. It should return 0.

Why do SUM() and PRODUCT() work differently than '+' and '*'?

It is extremely odd that SUM() treats the offending cell as 0 and PRODUCT() treats it as 1. It's also odd that both treat the offending cell as a number, rather than giving "Wrong data type" errors.
Comment 3 James 2011-07-26 17:45:12 UTC
I have another sample of array formula corruption to show you, which may be related to the original test case for this bug. I know the attached sheet is complex, but the obvious stripped down sheet doesn't seem to show the problem. I have attached the file "Corruption 2.ods" as the sample.

What I'm doing is using ranges joined with the ~ operator, like so:

Range1~Range2~Range3

using the multiple range mode of INDEX:

INDEX((Range1~Range2~Range3),0,0,selector)

I use the range/column returned by INDEX() as the key in an array VLOOKUP:

VLOOKUP(INDEX(...),Table,2,0)

Each row is a key. Finally, the numbers returned are all summed together in a complete array formula:

={SUM(IF(ISNA(VLOOKUP(...)),0,VLOOKUP(...)))}

If LibreOffice had IFNA that formula wouldn't be so hideous to read, not to mention the redundant and wasteful double-evaluation of VLOOKUP, but I digress.

On to the sheet. It should start on the sheet titled Character Data. Two cells are of interest, R2C21 and R14C26. They are notable for two things: having the exact same aforementioned array formula in them, and returning different (!) results. Just so you know, the result (3/4) displayed in R14C26 is the correct one.

As indicated by the contents of R2C15 (used by the formula), we are looking up a column in range #4 passed to INDEX. You can see the contents of that column on sheet "Powers", near the column headed "Power 4" (R1C10). It has two non-empty rows, Linked and Visible. Each row of Linked is worth 1/2, each Visible 1/4, hence the correct sum of 3/4.

These numbers are looked up from LimitationsTable, which is columns 25 and 26 of sheet "Computed". Columns25/26 themselves contain an array formula using the multiple range mode of INDEX. You can see the raw data on sheet Edition. Just search for Linked and you will see there are 3 mostly-redundant columns of data, any one of which could be the data returned in LimitationsTable. In the sheet as configured, it happens to be the first one. If you want to change which of the 3 is used for the lookup, change the "4th", on sheet Character Data, to either 5th or 6th. It's amusing to watch the fireworks for 1/2 a second.

One more thing: this spreadsheet sometimes triggers a nasty save-corruption bug (39485), so keep an original handy.
Comment 4 James 2011-07-26 17:47:06 UTC
Created attachment 49592 [details]
demonstration of array formula corruption
Comment 5 James 2011-07-26 20:53:25 UTC
Created attachment 49594 [details]
VLOOKUP and MATCH fail
Comment 6 James 2011-07-26 21:08:42 UTC
I've attached another sample, this time of VLOOKUP and MATCH failing. The spreadsheet is not visibly corrupt; all the data is verifiably there and where it's supposed to be.

For the cells of interest: look at R21 and R22, C27, sheet Character Data  They are, respectively, a VLOOKUP and MATCH on the same table using the same key. If you go look at the table (columns 27 and 28 of sheet Computed), you can see that the key is in the very first row. I checked that are no hidden spaces or nonprinting characters that would throw things off.

The  #N/A columns you see to the left are manifestations of the same problem.
Comment 7 Björn Michaelsen 2011-12-23 12:27:28 UTC
[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:
http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1

more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
Comment 8 James 2011-12-24 18:31:12 UTC
I have confirmed that this bug persists in LibreOffice 3.5 Beta 2 for Mac (OSX 10.6.8).
Comment 9 Roman Eisele 2012-09-20 10:46:18 UTC
(In reply to comment #1)
> Reproduced on LibreOffice 3.4  340m1(Build:103) for OpenSuse Linux.

This means that the Platform can not be just “Mac OS”, therefore I change it to “All” as appropriate (even if this bug was not reproducible on Windows, we need to use “All” if more than one Platform is affected).
Comment 10 QA Administrators 2015-01-05 17:51:46 UTC
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present on a currently supported version of LibreOffice (4.3.5 or later): https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

Thank you for your help!

-- The LibreOffice QA Team
Comment 11 Buovjaga 2015-01-23 16:34:34 UTC
Confirmed.

Win 7 Pro 64-bit Version: 4.5.0.0.alpha0+
Build ID: 07e84cae983c08afdba03018413a19d01abb3006
TinderBox: Win-x86@62-TDF, Branch:MASTER, Time: 2015-01-19_06:15:38
Comment 12 Eike Rathke 2015-02-11 15:06:51 UTC
(In reply to James from comment #0)
> A1:B19 = The named range TestRange
> C1:D19 = The array formula {=TestRange}
> E1:F19 = The array formula {=ISBLANK(TestRange)}
> G1:H19 = The array formula {=ISBLANK(C1:D19)}
> 
> You will see the bug immediately: cell G1 shows FALSE, when it should show
> TRUE.

Not true. ISBLANK() returns true only if a cell is blank, does not contain data nor a formula, not if it displays as blank.

> This same bug will also cause formulas to fail. Cell I1 contains the
> formula "=C1+5", which returns the error "Wrong data type".

Which is only indirectly related. Here it seems the operator '+' can't cope with the empty matrix element.


@James:
It would had been helpful to not mix various cases of whatsoever you noticed, be it related to arrays or not. Regarding SUM and PRODUCT, they ignore non-value cells by definition and thus may produce results where operator '+' and '*' may produce an error. And of course PRODUCT uses the start value 1 for its series. All major spreadsheet applications do it this way.

@Beluga:
Does your "confirmed" mean you tested all what was mentioned here, or just the original problem?
Comment 13 Buovjaga 2015-02-11 17:46:58 UTC
(In reply to Eike Rathke from comment #12)
> @Beluga:
> Does your "confirmed" mean you tested all what was mentioned here, or just
> the original problem?

The problem in the description.
Comment 14 Eike Rathke 2015-02-11 22:28:16 UTC
(In reply to James from comment #3)
> On to the sheet. It should start on the sheet titled Character Data. Two
> cells are of interest, R2C21 and R14C26. They are notable for two things:
> having the exact same aforementioned array formula in them, and returning
> different (!) results. Just so you know, the result (3/4) displayed in
> R14C26 is the correct one.

This error is not reproducible in 4.4.0, 4.3.6 or 4.2.8, just that in 4.2.8 a hard recalculation (Shift+Ctrl+F9) is needed to display the correct 0.75 value, without the values are 0.

Transcribing from R1C1 into A1 address syntax: R2C21 is U2, R14C26 is Z14
Comment 15 Eike Rathke 2015-02-11 22:42:01 UTC
(In reply to James from comment #6)
> For the cells of interest: look at R21 and R22, C27, sheet Character Data 
> They are, respectively, a VLOOKUP and MATCH on the same table using the same
> key.

Assuming that you mean cells AA21 and AA22 instead of R21 and R22 I see no error in 4.4.0 or 4.3.6 or 4.2.8

AA21 is 0 as it should be (content of Computed.AB1 as "No Adder" was found in Computed.AA1) and AA22 is 1 as it should be (1st row in search range)
Comment 16 Commit Notification 2015-02-11 22:48:47 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=cf3f1d8dfeb45249eb60a30dba3243fe9a4a65e5

add ScMatrix::IsEmptyResult() for tdf#39316 fix preparation

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 17 Commit Notification 2015-02-11 22:48:56 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=eccbc97c7c224269fe261b8924e7866c3758ec91

Resolves: tdf#39316 add matrix empty cells to ScInterpreter::QueryMatrixType()

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 18 Eike Rathke 2015-02-11 23:44:08 UTC
Pending review
https://gerrit.libreoffice.org/14428 for 4-4
https://gerrit.libreoffice.org/14429 for 4-4-1
https://gerrit.libreoffice.org/14430 for 4-3
Comment 19 Robinson Tryon (qubit) 2015-12-22 01:32:58 UTC
Removing comma from Whiteboard (please use a space to delimit values in this field)
https://wiki.documentfoundation.org/QA/Bugzilla/Fields/Whiteboard#Getting_Started
[NinjaEdit]