Bug Hunting Session
Bug 98481 - SUM is zero when some cells are empty and some cells reference other sheet AND file is .xlsx AND file was created by Google Sheets.
Summary: SUM is zero when some cells are empty and some cells reference other sheet AN...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
5.1.0.2 rc
Hardware: All All
: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:5.3.0 target:5.2.4
Keywords: filter:xlsx
Depends on:
Blocks:
 
Reported: 2016-03-07 03:43 UTC by Flint O'Brien
Modified: 2016-10-27 16:06 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
File with bug created with Google Docs and downloaded as xlsx. (5.27 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2016-03-07 03:43 UTC, Flint O'Brien
Details
Same as first attachment - created with Google Sheets and downloaded as ods. (5.27 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-03-07 03:47 UTC, Flint O'Brien
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Flint O'Brien 2016-03-07 03:43:40 UTC
Created attachment 123362 [details]
File with bug created with Google Docs and downloaded as xlsx.

On Sheet1, reference two cells from 'Sheet 2'. Cells B2:B4 display values: [1][ ][3]
On Sheet1, =SUM(B2:B4) displays zero when it should be 4.

SUM() works if the B2, B3, B4 all have values (even if each value is a reference to another sheet).
SUM() works if the B2 and B4 are simple numeric values (not references to another sheet).

I created the spreadsheet with Google Docs Sheets. Downloaded it as .xlsx and .ods. The xlsx has the bug, but the .ods version behaves as expected. The bug remains even if I edit the cells in LibreOffice.

I've attached the test file that has the bug.

LibreOffice
Version: 5.0.1.2
Build ID: 81898c9f5c0d43f3473ba111d7b351050be20261
Locale: en-US (en.UTF-8)

Mac OS X 10.11.3
Comment 1 Flint O'Brien 2016-03-07 03:47:48 UTC
Created attachment 123363 [details]
Same as first attachment - created with Google Sheets and downloaded as ods.

This is the .ods version that behaves as expected. The .xlsx version reproduces the bug.
Comment 2 raal 2016-03-08 21:55:09 UTC
Hard recalc CTRL+SHIFT+F9 works for me. Recalculate you can turn in Menu/ptions/Calc/Formula/Recalculation on file load
Comment 3 Buovjaga 2016-03-09 19:59:34 UTC
Can you comment on raal's suggestion?
Comment 4 Xisco Faulí 2016-10-10 11:25:02 UTC Comment hidden (obsolete)
Comment 5 Flint O'Brien 2016-10-11 02:43:40 UTC
The model I have in my mind is that all values for all formulas are always correct. No exceptions. I don't understand why recalculate-on-load is an option (and off by default).

I changed the preference to recalculate on file load and it does calculate correctly when the file is loaded and it properly calculates equations entered after the file is loaded. If I turn this option off (which is the default), the problem fields are not calculated correctly on load and adding new expressions are also not calculated correctly. As mentioned in the original bug report, this is for expressions referencing other sheets and expressions on the current sheet that reference cells on another sheet.


Regarding the previous comment to just throw this bug out...
Let met start with saying that I do appreciate all the work everyone is putting in on this project.

I don't know why I have to provide this information since it's not necessary to recreate this bug. The bug exists using the default options of a clean install. I'm disheartened if you're not willing to fix bugs that have been thoroughly reported with a complete and minimal example. A spreadsheet program that you can't trust to do proper calculations 100% of the time is not very useful.

The reason I reported the bug was to try to help the community for people who are unsuspecting and have no reason to turn on recalculate-on-load. With a large spreadsheet, it's not always obvious the numbers aren't being calculated correctly. I really trusted the values to be correct and was quite surprised to find this bug (and doubly surprised it hasn't been high priority to be fix).
Comment 6 Aron Budea 2016-10-12 07:37:27 UTC
The value seems to be cached in the file, eg.:

<c r="E2" t="str" s="6">
  <f t="shared" si="1" ref="E2:E4">sum(B2:D2)</f>
  <v>4</v>
</c>
Comment 7 Buovjaga 2016-10-12 10:00:34 UTC
(In reply to Flint O'Brien from comment #5)
> install. I'm disheartened if you're not willing to fix bugs that have been
> thoroughly reported with a complete and minimal example. A spreadsheet

We are not some collective consciousness that excludes you, by the way. The automated comment was sent, because you ignored my question for 6 months.

The people in this report are first responders, who help the developers.

Let's set to NEW.
Comment 8 Eike Rathke 2016-10-14 19:01:30 UTC
The difference seems rather to be that formulas in E2:E4 and B5:D5 are shared, whereas the formula in E8 is not. The .xlsx file loaded and re-saved in Excel works fine in LinreOffice. Apparently Google Sheets does something Excel does not, which LibreOffice is not prepared for.
Comment 9 Eike Rathke 2016-10-14 19:49:13 UTC
And actually this is it: Google Sheets attributes formula cells with t="str", which means "Cell containing a formula string.", which is correct but Excel does not (at least not 2010). Removing those attributes from the xl/worksheets/sheet1.xml stream makes LibreOffice load the document correctly.
Comment 10 Eike Rathke 2016-10-14 21:56:45 UTC
Great fun .. :-/
Google writes t="str" for formula content.
Excel writes t="str" for formula string results, which we consider during import and set a string result at the formula cell, which of course is ignored in SUM(range), hence that result is 0.
Comment 11 Eike Rathke 2016-10-15 00:32:37 UTC
So, actually Google got this wrong, the 't' attribute indicates the cell's data type, which is of the cell value (<v> element), not the formula string (that is the <f> element) but the formula result.

See ECMA-376
18.3.1.4 c (Cell)
18.18.11 ST_CellType (Cell Type)

Excel uses t="str" for a formula result string, that is not a formula string ... probably because t="s" could not be used because the result string is not a shared string, and with t="inlineStr" the <v> element could not be used for the result but an <is> element instead but formulas can't return rich strings ...

That's totally f*cked up.
Comment 12 Commit Notification 2016-10-15 14:51:41 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=2a45b26c619d1125419819d501a7653cf8d6e5ca

Resolves: tdf#98481 Google Sheets uses OOXML t="str" differently

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 Eike Rathke 2016-10-15 15:12:35 UTC
Pending review https://gerrit.libreoffice.org/29903 for 5-2
Comment 14 Commit Notification 2016-10-27 16:06:19 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dac45f154a72e1bb35d7b0540ab5723390c11b30&h=libreoffice-5-2

Resolves: tdf#98481 Google Sheets uses OOXML t="str" differently

It will be available in 5.2.4.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.