Bug 54898 - Sum error with hours in PIVOTTABLE
Summary: Sum error with hours in PIVOTTABLE
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice (show other bugs)
Version:
(earliest affected)
3.6.0.2 rc
Hardware: Other All
: high major
Assignee: Kohei Yoshida
URL:
Whiteboard: BSA target:3.7.0 target:3.6.4
Keywords: regression
: 55625 (view as bug list)
Depends on:
Blocks: mab3.6
  Show dependency treegraph
 
Reported: 2012-09-14 01:06 UTC by Gary
Modified: 2013-03-08 11:36 UTC (History)
7 users (show)

See Also:
Crash report or crash signature:


Attachments
spreadsheet with pivot tables (27.85 KB, application/vnd.oasis.opendocument.spreadsheet)
2012-09-14 01:06 UTC, Gary
Details
same as above with more info. (29.49 KB, application/vnd.oasis.opendocument.spreadsheet)
2012-09-14 11:56 UTC, GerardF
Details
This is what I see. (20.58 KB, image/png)
2012-11-08 14:50 UTC, Kohei Yoshida
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Gary 2012-09-14 01:06:56 UTC
Created attachment 67132 [details]
spreadsheet with pivot tables

Problem description: 

Steps to reproduce:
1. add total times from spreadsheet
2. sum totals from job descriptions
3. ....

Current behavior:totals are incorrect

Expected behavior:expect totals to be correct

Platform (if different from the browser): 
              
Browser: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.1 (KHTML, like Gecko) Ubuntu/12.04 180.89 Chrome/21.0.1180.89 Safari/537.1Chromium/21.0.1
Comment 1 GerardF 2012-09-14 11:52:48 UTC
Reproducible with LibreOffice 3.6.1.2

Pivot table shows wrong result with hours data.
Comment 2 GerardF 2012-09-14 11:56:19 UTC
Created attachment 67149 [details]
same as above with more info.

Add new attachment.
Same file with table using formula and autofilter with sub.total function.

You can compare pivot table results with new table (or play with filter).
Comment 3 GerardF 2012-09-14 12:02:48 UTC
Not reproducible with Version 3.7.0.0.alpha0+ (Build ID: 628aad2)

Opening the file with Dev version and Refresh the pivot table gives me expected result.
Comment 4 pierre-yves samyn 2012-09-14 12:43:52 UTC
Hello

I confirm with Version 3.6.2.1 (Build ID: ba822cc) & Windows 7 64bits

Regards
Pierre-Yves
Comment 5 Julien Nabet 2012-10-13 14:30:13 UTC
On pc Debian x86-64, I tested on 3.6 and 3.5, 3.5 is OK, 3.6 wrong.
I increase the importance since it's a regression.
Comment 6 Julien Nabet 2012-10-13 14:33:06 UTC
*** Bug 55625 has been marked as a duplicate of this bug. ***
Comment 7 Julien Nabet 2012-11-07 12:41:27 UTC
Rainer/Michael: Bernard Nicoulaud (in cc of this bugtracker) sent me an email telling me that he was astonished that 3.6.3 was considered as enough stable and useable in production whereas there's still this bug. I must recognize he's quite right. Indeed, it's not a matter of an error message but that some results are wrong and it's easy not to notice it.
What do you think?
Comment 8 Michael Meeks 2012-11-07 16:05:19 UTC
> Rainer/Michael: Bernard Nicoulaud (in cc of this bugtracker) sent me an
> email telling me that he was astonished that 3.6.3 was considered as
> enough stable and useable in production whereas there's still this bug.

By production you mean some big enterprise ? they'd want to contract L3 support to get their bugs fixed to their timeline I imagine.

> I must recognize he's quite right. Indeed, it's not a matter of an error
> message but that some results are wrong and it's easy not to notice it.
> What do you think?

Clearly we need a fix and a unit test so it never comes back; patches very much appreciated. Until then - I CC Kohei who may have some ideas - it's been a MAB for nearly a month I guess.

The Pivot code is rather tricky, tangled and there are is no shortage of bugs that give wrong results in any given version of it for the last many years - we are slowly moving the bugs around, fixing them - and stopping known ones coming back with unit tests. Help from anyone doing QA earlier in the cycle to catch such things earlier is much appreciated.

Expanding our PivotTable unit tests is also not a terribly hard thing to do - cf.

http://cgit.freedesktop.org/libreoffice/core/tree/sc/qa/unit/ucalc.cxx#n1698

Might be nice if end-users could do that themselves; then again - the request to extend the calc unit tests fell didn't yield much. We essentially need complicated scenarios with minimal data (ie. small) test sheets as .ods files, with verified correct / matching .csv files (one per sheet) that we can drop into here:

http://cgit.freedesktop.org/libreoffice/core/tree/sc/qa/unit/data/ods
vs.
http://cgit.freedesktop.org/libreoffice/core/tree/sc/qa/unit/data/contentCSV

Perhaps we could do that for pivot-table corner-cases too: there are a lot of them. Help much appreciated.

Anyhow - sorry for the new bug - no doubt it is a side-effect of fixing another bug.
Comment 9 Kohei Yoshida 2012-11-08 14:23:15 UTC
I scan pivot table bugs with PIVOTTABLE in the summary. So, this helps me keep track of its bugs.
Comment 10 Kohei Yoshida 2012-11-08 14:48:28 UTC
Gary,

I need more clarification on which numbers are "wrong".  I've tried the master build, cross-check all the numbers displayed in the table, and all of them seem correct to my eye.

The numbers did change before and after the refresh.  But that can be caused by not refreshing the table after updating the raw data.

For example, the hour data for Dennis x 'Exterior Cleaning' before the refresh is 04:49.  After the refresh it becomes 05:15.  When checking the raw data and filtering manually, the correct answer here is 05:15.  It's also possible that the raw data had changed after the pivot table was generated and the table was never refreshed.  This can also result in inconsistent data between the table output and the raw data.
Comment 11 Kohei Yoshida 2012-11-08 14:50:28 UTC
Created attachment 69704 [details]
This is what I see.

This is the table output I see (after refresh).
Comment 12 GerardF 2012-11-08 15:05:15 UTC
Hi Kohei,

I confirm that this bug is no more in 3.7 (4.0 ?) branch.
At least, i do not reproduce with master~2012-10-20_14.21.35_LibO-Dev_3.7.0.0.alpha0_Win_x86_install_en-US

But still in Version 3.6.3.2 (Build ID: 58f22d5)
In my attachment (same data as Gary) you can see the DataPilot and a table with the same layout with expected results.
Comment 13 Kohei Yoshida 2012-11-08 15:09:28 UTC
(In reply to comment #12)
> Hi Kohei,
> 
> I confirm that this bug is no more in 3.7 (4.0 ?) branch.
> At least, i do not reproduce with
> master~2012-10-20_14.21.35_LibO-Dev_3.7.0.0.alpha0_Win_x86_install_en-US
> 
> But still in Version 3.6.3.2 (Build ID: 58f22d5)
> In my attachment (same data as Gary) you can see the DataPilot and a table
> with the same layout with expected results.

And you'll get that expected result when you refresh the table. As I mentioned, the table output and the raw data may go out of sync if the raw data has been edited since the table was initially generated, and it was never refreshed.  This is not a bug.

I still don't see a bug...
Comment 14 Kohei Yoshida 2012-11-08 15:24:26 UTC
(In reply to comment #12)

> But still in Version 3.6.3.2 (Build ID: 58f22d5)

Any chance you could test the latest on the 3.6.x branch (which will later become 3.6.4)?  There should be a daily build for the 3.6.x branch somewhere...
Comment 15 Kohei Yoshida 2012-11-08 15:39:55 UTC
Ah wait.  The 3.6 branch build produces a different result than what the master build does, even after the refresh.
Comment 16 Gary 2012-11-08 15:45:27 UTC
On 11/08/2012 08:48 AM, bugzilla-daemon@freedesktop.org wrote:
> Kohei Yoshida <mailto:kohei.yoshida@suse.de> changed bug 54898 
> <https://bugs.freedesktop.org/show_bug.cgi?id=54898>
> What 	Removed 	Added
> Status 	NEW 	NEEDINFO
>
> *Comment # 10 <https://bugs.freedesktop.org/show_bug.cgi?id=54898#c10> 
> on bug 54898 <https://bugs.freedesktop.org/show_bug.cgi?id=54898> from 
> Kohei Yoshida <mailto:kohei.yoshida@suse.de> *
> Gary,
>
> I need more clarification on which numbers are "wrong".  I've tried the master
> build, cross-check all the numbers displayed in the table, and all of them seem
> correct to my eye.
>
> The numbers did change before and after the refresh.  But that can be caused by
> not refreshing the table after updating the raw data.
>
> For example, the hour data for Dennis x 'Exterior Cleaning' before the refresh
> is 04:49.  After the refresh it becomes 05:15.  When checking the raw data and
> filtering manually, the correct answer here is 05:15.  It's also possible that
> the raw data had changed after the pivot table was generated and the table was
> never refreshed.  This can also result in inconsistent data between the table
> output and the raw data.
> ------------------------------------------------------------------------
> You are receiving this mail because:
>
>   * You reported the bug.
>
Hello,
I did not check all of the numbers.  The only reason that I even caught 
the error is that I had only one total time for Front Desk. It was 12 
hours. When I upgraded to 3.6 it would change the total to another 
number when I refreshed the pivot table.
Thank you,
Gary
Comment 17 Kohei Yoshida 2012-11-08 16:59:42 UTC
Ok.  Now I see the problem here.  It puzzles me why this only happens in the 3.6 build and not on master, as the pivot table code between the two are pretty much identical.
Comment 18 Kohei Yoshida 2012-11-08 18:43:27 UTC
I've found where the problem lies and how to fix it, which is good news.

The bad news is that I have no idea why, given the exact code between master and 3-6, the two builds generate different results.  Having said that, the code in question is relying on numerical equalities, which, due to rounding errors, may not work reliably at all times.
Comment 19 Not Assigned 2012-11-08 21:19:47 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=87003581c0cf1068172259c8b9949349fe7ff6b5

fdo#54898: Test equality by order index (integer) which is more stable.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 20 Not Assigned 2012-11-09 06:39:01 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c23a44e1de0faad6328ed0ece7247da3ae24b6cd&g=libreoffice-3-6

fdo#54898: Test equality by order index (integer) which is more stable.


It will be available in LibreOffice 3.6.4.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 21 Michael Meeks 2012-11-09 10:04:09 UTC
Interesting - how does this interact with:

commit 0ee1609ddc04f01e7398fab08d74529c301600e6
Author: Kohei Yoshida <kohei.yoshida@gmail.com>
Date:   Wed Aug 22 23:33:14 2012 -0400

    bug#53929: Pivot table uses case insensitive string comparison.
    
    Change-Id: I65fa22ceeba37a15b70fe41b1dee26f1dde7d759
    Signed-off-by: Michael Meeks <michael.meeks@suse.com>

Did we get the unit test for that in 3.6 ? :-) and/or did we check that didn't regress ? :-) Anyhow - glad to be back to a fast index comparison there.

Should we mark this fixed if so ?
Comment 22 Kohei Yoshida 2012-11-09 14:29:06 UTC
(In reply to comment #21)
> Interesting - how does this interact with:
> 
> commit 0ee1609ddc04f01e7398fab08d74529c301600e6
> Author: Kohei Yoshida <kohei.yoshida@gmail.com>
> Date:   Wed Aug 22 23:33:14 2012 -0400
> 
>     bug#53929: Pivot table uses case insensitive string comparison.
>     
>     Change-Id: I65fa22ceeba37a15b70fe41b1dee26f1dde7d759
>     Signed-off-by: Michael Meeks <michael.meeks@suse.com>

In theory it shouldn't; the problem lies purely with comparing double-precision numbers, not string comparisons.

And that commit is both on 3-6 and master, and master worked just fine before my fix.

> 
> Did we get the unit test for that in 3.6 ? :-) and/or did we check that
> didn't regress ? :-) 

Weeell, yes and no.  I did write a test for this, but that test passes even without my fix.  I even tried it in the 3.6 branch.  So, having a test per se is not useful to catch this (but I did write a new one which is on master).

I'm still puzzled as to why the original code fails, and fails only on the 3.6 branch. But my fix has a benefit of its own; it will speed up the cache population process a little since we now avoid calling IsCaseInsEqual() altogether which is more expensive than simple integer comparison.

> Should we mark this fixed if so ?

Yes.  I'm doing that right now.
Comment 23 Morten Leikvoll 2013-03-08 11:09:32 UTC
Is this bug back in 4.0.1.2? It wont sum my hours with format hh:mm:ss correctly. I have an older file with correct sums in a pivottable, and when I refresh it, they all get wrong.
Comment 24 Morten Leikvoll 2013-03-08 11:36:33 UTC
(In reply to comment #23)
> Is this bug back in 4.0.1.2? It wont sum my hours with format hh:mm:ss
> correctly. I have an older file with correct sums in a pivottable, and when
> I refresh it, they all get wrong.

Sorry, I found the issue.. There is two similar datatypes, and only one works for hours sum. I used the "13:37:46" format wich I guess must be a "time" vs "876613:37:46" wich I guess is "duration". I guess the first one wraps at 24h..