Bug 110993 - Chart produces incorrect graph w/ some negative points when using large positive values in source cells
Summary: Chart produces incorrect graph w/ some negative points when using large posit...
Status: NEEDINFO
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Chart (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.3.0 target:7.2.0.0.beta2
Keywords:
Depends on:
Blocks: Chart
  Show dependency treegraph
 
Reported: 2017-08-01 14:12 UTC by xghost
Modified: 2021-11-28 13:59 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
LibreOffice Calc Chart Plotting Bug (43.97 KB, image/png)
2017-08-01 14:12 UTC, xghost
Details
Test Case Chart Document (27.09 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-08-01 22:04 UTC, xghost
Details
Comparison of charts (63.44 KB, image/png)
2021-05-07 16:32 UTC, Andreas Heinisch
Details
Test Build Image (21.62 KB, image/png)
2021-05-11 11:11 UTC, Andreas Heinisch
Details
side-by-side LibreOffice and R graphs for 2^n cubic spline (242.75 KB, image/png)
2021-11-28 13:36 UTC, stragu
Details

Note You need to log in before you can comment on or make changes to this bug.
Description xghost 2017-08-01 14:12:14 UTC
Created attachment 135050 [details]
LibreOffice Calc Chart Plotting Bug

DESCRIPTION

When creating a chart with very large numbers in cells, the line graph produced is not entirely correct. While the values in the sheet cells are positive and appear correct, some of them get plotted to the Y-axis incorrectly.

While trying different scaling values for the Y-axis, I saw that some Y-axis entries were showing up as negative values instead of positive.


STEPS TO REPRODUCE

Create a data set meant to produce a graph to show how quickly common terms in time complexity grow as the values of N grow.

1. Create a column for values of n, numbered: 1, 10, 20, 30, ... 100
2. Create additional columns with formulas to calculate 2^n and n!, as in: O(2^n) and O(n!)
2.a. To calculate 2^n, use: =POWER(2, <cell>)
2.b. (OPTIONAL) Add a few smaller columns (e.g. n^2, n log n, etc) for comparison
3. Select all the data and create a scatter chart to plot all of them
4. Adjust the maximum values for X axis to 100 and Y axis to 1.4E+020
4.a. Double click the table, then the X or Y axis, then the Scale tab.
4.b. Uncheck the "Automatic" checkbox for "Maximum" and set 100


EXPECTED RESULT

The asymptote for 2^n should be one single and continuous, though significantly steeper relative to n^2, curve towards positive Y. The asymptote for n! should be the steepest. All values for 2^n, n!, etc. must be plotted on the positive Y-axis.


ACTUAL RESULT

There're some portions of the curves that drop below into the negative Y-axis, which is incorrect.

Notice that the cells containing the values used by the chart are all positive, but the chart does show the graph going through the negative Y-axis.
Comment 1 xghost 2017-08-01 14:13:46 UTC
Correction on step 4. The value for the Y-axis should be set to 1.4E+026, not 1.4E+020.

While setting it to 1.4E+020 will still show the problem, 1.4E+026 makes it more obvious.
Comment 2 Xisco Faulí 2017-08-01 14:36:33 UTC
Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. 
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
(Please note that the attachment will be public, remove any sensitive information before attaching it. 
See https://wiki.documentfoundation.org/QA/FAQ#How_can_I_eliminate_confidential_data_from_a_sample_document.3F for help on how to do so.)
Comment 3 xghost 2017-08-01 22:04:20 UTC
Created attachment 135058 [details]
Test Case Chart Document

Attached a Calc document with a dataset and 3 charts showing the issue. The issue is present in the 2^n and n! charts, with Y-axis scaling already set to a value that makes the issue (hopefully) more clear.
Comment 4 xghost 2017-08-01 22:06:21 UTC Comment hidden (obsolete)
Comment 5 xghost 2017-08-01 22:07:10 UTC Comment hidden (obsolete)
Comment 6 Xisco Faulí 2017-08-02 17:52:41 UTC
Confirmed in

Version: 6.0.0.0.alpha0+
Build ID: 50799a721c7ddcf9475a1b79984ed64ddd7cdf57
CPU threads: 4; OS: Linux 4.8; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

- Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)

- LibreOffice 3.3.0 
OOO330m19 (Build:6)
tag libreoffice-3.3.0.4

and also on Win
Comment 7 xghost 2017-09-17 19:04:47 UTC
Are there any updates regarding this issue?
Comment 8 xghost 2017-12-26 06:02:05 UTC
Has this issue been assigned some sort of priority/estimate? I've not seen any movement and don't even have an idea of the amount of effort it would take or if it's really "low-hanging fruit".
Comment 9 QA Administrators 2018-12-28 03:45:53 UTC Comment hidden (obsolete)
Comment 10 xghost 2019-01-01 23:14:27 UTC
I'm responding to confirm that this bug is still present. The LibreOffice installation is from the Kubuntu 18.10 repository.

Version: 6.1.2.1
Build ID: 1:6.1.2-0ubuntu1.1
CPU threads: 8; OS: Linux 4.18; UI render: default; VCL: gtk3_kde5; 
Locale: en-US (en_US.UTF-8); Calc: group threaded
Comment 11 xghost 2019-02-04 04:33:13 UTC
Guys, this bug is not going to magically disappear. There's a problem when creating charts with large numbers.

It has been 1.5+ years since this got reported and, based on bug report activity, there has been NO meaningful movement here.

If I had to take a guess, it looks like there's an overflow problem, based simply on how the graph is behaving...

If I'm wasting my time here with this, please do let me know, because it's not really fun or productive.
Comment 12 QA Administrators 2021-02-04 04:25:42 UTC Comment hidden (obsolete)
Comment 13 xghost 2021-02-04 06:43:30 UTC
I tested this again and the problem is still present. Version info:

Version: 7.0.3.1
Build ID: 00(Build:1)
CPU threads: 24; OS: Linux 5.8; UI render: default; VCL: x11
Locale: en-US (en_US.UTF-8); UI: en-US
Ubuntu package version: 1:7.0.3-0ubuntu0.20.10.1
Calc: threaded
Comment 14 Andreas Heinisch 2021-05-07 16:25:13 UTC
Very strange indeed, but I think this problem got fixed somewhere, because if I recreate the chart from scratch, it works as expected. However, the problem persist if the attached file is used. So imho this was a chart and export problem, where the wrong values now are stored in the attached file.

Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 0e812184825adf7bba178de03ddcdced044d2478
CPU threads: 6; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-US
Calc: CL
Comment 15 Andreas Heinisch 2021-05-07 16:32:24 UTC
Created attachment 171756 [details]
Comparison of charts

Charts created from scratch, which does not show the behaviour even after saving the document.
Comment 16 Andreas Heinisch 2021-05-07 16:59:38 UTC
I cannot reproduce the error in

Version: 6.0.2.1 (x64)
Build-ID: f7f06a8f319e4b62f9bc5095aa112a65d2f3ac89
CPU-Threads: 6; BS: Windows 10.0; UI-Render: Standard; 
Gebietsschema: de-DE (de_DE); Calc: CL

when creating such a chart from scratch. May you remember with which version you created these charts?

I tried to check the xml contents of both charts and this line seems to be the culprit:

WRONG Chart:
<style:chart-properties chart:interpolation="cubic-spline" chart:include-hidden-cells="false" chart:auto-position="true" chart:auto-size="true" chart:treat-empty-cells="ignore" chart:right-angled-axes="true"/>

FRESH RIGHT Chart:
<style:chart-properties chart:symbol-type="automatic" chart:include-hidden-cells="false" chart:treat-empty-cells="ignore" chart:right-angled-axes="true"/>

They only differ in the chart:interpolation="cubic-spline" attribute. Maybe you have added some interpolation?
Comment 17 xghost 2021-05-07 17:37:54 UTC
(In reply to Andreas Heinisch from comment #16)
> May you remember with which version you created these charts?

I don't remember a specific version, other than I was using whatever the latest stable release for Ubuntu was (via apt) at the date/time of reporting. There's a reply from someone confirming the issue the day after my report here on comment 6 (https://bugs.documentfoundation.org/show_bug.cgi?id=110993#c6).

It has version info for an alpha at the time, so, presumably, a stable Ubuntu 17.04 release before that. I hope that helps narrow it down and I apologize for failing to include version info in the initial report. I didn't expect it'd take so long before someone gave some attention to an issue like this.

Also, it's a bit concerning that the error would be stored within the document requiring it to be redone from scratch? (I guess it's not just a rendering issue.) For less trivial documents, this could be a more serious problem.

> Maybe you have added some interpolation?

No, I don't recall adding that manually nor a reason for wanting to do so, either. This is simply a Big-O notation table, so it's not meant to have any fancy effects beyond the basic graph.
Comment 18 Andreas Heinisch 2021-05-08 15:36:36 UTC
Now I could reproduce the problem from scratch.

If you choose the smooth line type in the scatter chart dialog with cubic spline interpolation, you get the wrong result. So there is for sure a problem with this interpolation method. I try to investigate further on this problem. Maybe I can provide a fix.
Comment 19 xghost 2021-05-08 20:55:08 UTC
(In reply to Andreas Heinisch from comment #18)
> Now I could reproduce the problem from scratch.

In what version, the older one closer to the report date or the latest?


> If you choose the smooth line type in the scatter chart dialog with cubic spline interpolation, you get the wrong result.

Maybe there's an option that causes/d that as a consequence/side-effect in my older document? Did your finding produce the same internal XML as my test document or does it seem to be a different way of reproducing the same result (i.e., same destination, but different road)?

> So there is for sure a problem with this interpolation method. I try to investigate further on this problem. Maybe I can provide a fix.

Thanks and good luck. If there's anything else I can provide assistance with, let me know, and I'll try to help out if it's within my means.
Comment 20 Andreas Heinisch 2021-05-11 09:47:31 UTC
The problem arises here: https://opengrok.libreoffice.org/xref/core/chart2/source/view/charttypes/Splines.cxx?r=e871c9cb#398 as a consequence of CalculateCubicSplines in https://opengrok.libreoffice.org/xref/core/chart2/source/view/charttypes/Splines.cxx?r=e871c9cb#534.

The calculation is split into x and y coordinates, and I have to clarify if this can be done for the derivatives of the function, because the x values have always the same size.
Comment 21 Andreas Heinisch 2021-05-11 11:11:34 UTC
Created attachment 171875 [details]
Test Build Image

I have revised the SplineCalculater::CalculateCubicSplines and this is the result. It turned out that the CubicSpline method had some logical as well as an algorithm error. I hope I can send it to gerrit and someone can review it :)
Comment 22 Andreas Heinisch 2021-05-12 08:59:09 UTC
After rechecking the calculation, I committed a patch to gerrit in order to correct a small error in the spline calculation. However, this does not solve the problem of this bug report.

The calculation of the spline should be correct, but the autoclipping to the chart does not work. I tried to recalc the min and max values for the chart in https://opengrok.libreoffice.org/xref/core/chart2/source/view/charttypes/AreaChart.cxx?r=4f6931d6#368 using:

std::vector< ExplicitScaleData > aScales( m_pPosHelper->getScales());
aScales[1].Minimum = *(std::min_element( aPoly.SequenceY[0].begin(), aPoly.SequenceY[0].end()));
aScales[1].Maximum = *(std::max_element(aPoly.SequenceY[0].begin(), aPoly.SequenceY[0].end()));
m_pPosHelper->setScales(aScales, m_pPosHelper->isSwapXAndY());

So the real bug here is that the automatic scaling of charts using cubic splines does not work. Unfortunately, I don't understand enough in order to solve this issue, but maybe someone with more insight in the chart creation process may continue.
Comment 23 Commit Notification 2021-06-26 08:38:56 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/e6442b39836f9856aa7b87d1a840158f0cb7d9c4

tdf#110993 - Corrected spline calculation

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 24 Commit Notification 2021-07-07 13:25:47 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/d3280eb3b47472ab3fc6946d475a7b9c6c55c381

tdf#110993 - Corrected spline calculation

It will be available in 7.2.0.0.beta2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 25 stragu 2021-11-28 13:36:11 UTC
Created attachment 176554 [details]
side-by-side LibreOffice and R graphs for 2^n cubic spline

I have a very limited understanding of cubic splines, but I feel like this might be "not a bug".

I have tested building a cubic spline with R and plotting it with ggplot2 for comparison with what LibreOffice does. Attached is the result: both plots look the same.
The R code makes sure similar axis breaks are used. 500 values were interpolated to build the spline curve, using the spline() function and the method "normal".

R's spline() functions has various methods available. Most of them (fmm, periodic, natural) will generate negative values to fit a spline to this data. However, one method (hyman) will only generate values above 0 and be visualised in the hockey stick look (i.e. shooting up only towards the end, after the x = 80 mark).

As I understand it, LibreOffice's cubic spline smoother uses a normal cubic spline, whereas xghost was expecting a different kind. Arguably, the exact kind of interpolation used should be properly documented in our help pages.

xghost, are you able to produce a comparison with a different tool, or a reference that supports that LibreOffice's calculation is fundamentally wrong?