159180 – Add support for refined-doughnut ("sunburst") charts

Status:	NEW

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Chart
Version: (earliest affected)	7.6.4.1 release
Hardware:	All All

Importance:	medium enhancement
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	Additional-Chart-Types Pie-and-Donut

Reported:	2024-01-14 22:35 UTC by Eyal Rozenberg
Modified:	2024-12-22 06:49 UTC
CC List:	3 users

See Also:	50934
Crash report or crash signature:

Attachments
Example 2-ring doughnut chart: Titanic survival by class (165.11 KB, image/png) 2024-01-14 22:36 UTC, Eyal Rozenberg
Example 3-ring doughnut chart: Hierarchy of kinds of goods (65.42 KB, image/png) 2024-01-14 22:37 UTC, Eyal Rozenberg
GNOME's disk usage analyser (157.88 KB, image/png) 2024-01-29 03:24 UTC, Stéphane Guillou (stragu)
Comment 4 data visualised as sunburst in Excel (62.74 KB, image/png) 2024-01-29 03:31 UTC, Stéphane Guillou (stragu)
XLSX with sunburst chart created with online Office 365 (13.12 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) 2024-01-29 03:33 UTC, Stéphane Guillou (stragu)

Description Eyal Rozenberg 2024-01-14 22:35:15 UTC

A "refined doughnut" chart is made up of concentric rings/doughnuts, similarly to the existing doughnut chart type. The difference is in what each ring signifies, and in the _increasing_ number of sections in the refined-doughnut chart type.

You see, in a regular doughnut chart type, one dimension is the ring section, and the other dimension is the ring index: The charted function is 

(row, column) -> datum

But in a refined-doughnut chart, the data is actually multi-dimensional, with each step outwards refining the data by another dimension; the charted function is (key_1, key_2, ..., key_d) -> datum.

For example, if my data table has columns: Region, Town, Neighborhood, NumBuildings - then the first ring will show the distribution of buildings by region; the second ring will refine each region's section into its different towns, and the third (and final) ring will refine that by neighborhoods, so that the outer ring has as many sections as there are table rows. 

This is especially useful in the two-dimensional case, when there is a single refinement, and one can color the inner ring in strong, contrasting colors, and color the refinement of each ring with a gradient of the same hue but different lightness or different saturation etc.
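
One possible way to derive such a palette is sketched below in Python. This is only an illustration of the idea, not LibreOffice code: the function name `child_colors` and the lightness bounds `lo` and `hi` are assumptions made up for this example. Each refined section keeps the hue and saturation of its inner-ring section and varies only in lightness.

```python
import colorsys


def child_colors(parent_rgb, n_children, lo=0.35, hi=0.8):
    """Return n_children RGB tuples (components in 0..1) that share the
    parent's hue and saturation and differ only in lightness.

    `lo` and `hi` are hypothetical lightness bounds chosen for this sketch.
    """
    hue, _lightness, saturation = colorsys.rgb_to_hls(*parent_rgb)
    if n_children == 1:
        levels = [(lo + hi) / 2]
    else:
        step = (hi - lo) / (n_children - 1)
        levels = [lo + i * step for i in range(n_children)]
    return [colorsys.hls_to_rgb(hue, level, saturation) for level in levels]


# Three shades for the refinements of a blue inner-ring section.
shades = child_colors((0.2, 0.4, 0.8), 3)
```

Varying saturation instead of lightness would work the same way, by spreading the third HLS component and keeping the second fixed.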

I'll attach a couple of examples.

Comment 1 Eyal Rozenberg 2024-01-14 22:36:56 UTC

Created attachment 191931
Example 2-ring doughnut chart: Titanic survival by class

Comment 2 Eyal Rozenberg 2024-01-14 22:37:48 UTC

Created attachment 191932
Example 3-ring doughnut chart: Hierarchy of kinds of goods

Note the coloring in both attachments.

Comment 3 Heiko Tietze 2024-01-15 09:38:48 UTC

Donut charts are supported and multiple columns drawn as extra rings. Data from the first column come first. What exactly is missing?

Comment 4 Eyal Rozenberg 2024-01-15 10:11:00 UTC

(In reply to Heiko Tietze from comment #3)
> Donut charts are supported and multiple columns drawn as extra rings. Data
> from the first column come first. What exactly is missing?

You're describing the existing chart type, I'm describing a different chart type. They both use doughnuts, but differently. Please re-read the opening comment.

> Data from the first column come first.

In the refined-doughnut chart, there is no data in the first columns; they contain parts of the _key_. Take the second example chart I've attached. The rows might be:

Clothing,Shirts,sh1,10
Clothing,Shirts,sh2,10
Clothing,Shirts,sh3,8
Clothing,Pants,p1,14

etc. etc.

There is just one column of data. The doughnut rings display different levels of aggregation.

Comment 5 Heiko Tietze 2024-01-15 10:26:35 UTC

(In reply to Eyal Rozenberg from comment #4)
> There is just one column of data. The doughnut rings display different
> levels of aggregation.
You mean the chart should do all the pivot table work?

> (In reply to Heiko Tietze from comment #3)
> Please re-read the opening comment.
Please explain so it's easy to understand.

Comment 6 Eyal Rozenberg 2024-01-15 10:42:47 UTC

(In reply to Heiko Tietze from comment #5)
> (In reply to Eyal Rozenberg from comment #4)
> > There is just one column of data. The doughnut rings display different
> > levels of aggregation.
> You mean the chart should do all the pivot table work?

Actually, I'm not sure our PivotTables mechanism can do all of this at once, i.e. with gradually refined subtotals as we expose more key fields. Can it? I wonder if one can get Excel PivotTables to do it.

But - sort of.

> > (In reply to Heiko Tietze from comment #3)
> Please explain so it's easy to understand.

Suppose your table is:

L1	L2	L3	Num Items
Clothing	Shirts	sh1	10
Clothing	Shirts	sh2	10
Clothing	Shirts	sh3	8
Clothing	Pants	p1	14
Clothing	Pants	p2	5
Footwear	Boots	b1	6
Footwear	Boots	b3	7
Footwear	Boots	b4	9

You can't have a 3-ring doughnut chart with this. But you can have a 3-ring refined doughnut chart with it. The first ring has two sections:

Clothing 47
Footwear 22

the second ring has 3 sections:

Shirts 28
Pants  19
Boots  22

and the third ring has 8 sections with the full data column for their values. And the sections line up, so that the ring 2 sections under Clothing together cover the same angular range as the Clothing section in ring 1, and so on.
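
The aggregation and alignment described above can be sketched in a few lines of Python. This is an illustration only, not a proposed implementation: the names `ROWS`, `ring_totals` and `ring_sections` are invented for the example, and it assumes that rows sharing a key prefix are adjacent in the table, as they are here.

```python
ROWS = [
    ("Clothing", "Shirts", "sh1", 10),
    ("Clothing", "Shirts", "sh2", 10),
    ("Clothing", "Shirts", "sh3", 8),
    ("Clothing", "Pants", "p1", 14),
    ("Clothing", "Pants", "p2", 5),
    ("Footwear", "Boots", "b1", 6),
    ("Footwear", "Boots", "b3", 7),
    ("Footwear", "Boots", "b4", 9),
]


def ring_totals(rows, depth):
    """Sum the value column over the first `depth` key columns.

    Keys are kept in first-seen order (dicts preserve insertion order).
    """
    totals = {}
    for *keys, value in rows:
        prefix = tuple(keys[:depth])
        totals[prefix] = totals.get(prefix, 0) + value
    return totals


def ring_sections(rows, depth):
    """Map each key prefix to its (start, end) angle in degrees.

    Assumes rows with a common prefix are adjacent; otherwise the rings
    would not line up and the rows would need sorting first.
    """
    grand_total = sum(row[-1] for row in rows)
    sections, running = {}, 0
    for prefix, total in ring_totals(rows, depth).items():
        # Accumulate in value units so that shared boundaries are
        # computed from the same numbers in every ring.
        sections[prefix] = (
            360 * running / grand_total,
            360 * (running + total) / grand_total,
        )
        running += total
    return sections


print(ring_totals(ROWS, 1))  # {('Clothing',): 47, ('Footwear',): 22}
```

With this layout, the section for ("Clothing", "Pants") in ring 2 ends at the same angle as the section for ("Clothing",) in ring 1, which is the alignment property mentioned above.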

Comment 7 Heiko Tietze 2024-01-15 11:03:24 UTC

The Pivot table can produce subtotals, likewise this function too. You will not be happy with the workflow but in any case it's a two-step procedure where first data are processed and subsequently shown in a chart. Merging the two is not how spreadsheet tools in general work. => NAB/WF

Comment 8 Eyal Rozenberg 2024-01-15 12:42:56 UTC

(In reply to Heiko Tietze from comment #7)
> The Pivot table can produce subtotals, likewise this function too.

Not, to my knowledge, in a way in which selecting a PivotTable subrange and charting would produce such a chart. Am I wrong?

Moreover, there is the coloring of the chart, with outer rings being variations on the color of the inner ring (at least as an option). PivotTables don't do that.


> where first data are processed and subsequently shown in a chart.

That means that whenever data is updated, a reprocessing will be necessary. It's not like that with existing charts.

> You will
> not be happy with the workflow but in any case it's a two-step procedure

AFAICT, more than two steps.

And then - many, possibly dozens, of coloring steps.

> Merging the two is not how spreadsheet tools in general work.

I beg to differ. Many charts show percentages rather than absolute values. That too could be computed using an auxiliary column of data, or a pivot table with a computed field. So why have _those_ chart types? We could make do with only absolute value displays, and have the user take care of making sure everything sums up to 1.

Comment 9 Stéphane Guillou (stragu) 2024-01-29 03:24:48 UTC

Created attachment 192224
GNOME's disk usage analyser

In my opinion, this is more useful when subcategories are exclusive to their main category (example in attachment 191932), whereas an example like attachment 191931 (a good example of a "bad data visualisation" in my opinion) should be represented as a categorised bar plot in which one single colour is used per subcategory, or as some kind of Sankey plot.

In any case, and regardless of how doughnuts and pies are generally not good for comparing values, I can see how it can be useful, and notice its popularity in infographics.

If it is implemented, I think it should also accommodate subcategories that don't sum up to 100%. It is for example used in that way in GNOME's disk usage analyser (with the treemap as an alternative visualisation) - see attachment.
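
A rough Python sketch of what "not summing up to 100%" could mean for the layout follows. It is purely illustrative: the `(name, size, children)` node shape, the `layout` function and the sample `TREE` are assumptions invented for this example, and sizes are assumed to be positive. Children are laid out from their parent's start angle, and whatever they do not cover is left as an empty gap in the outer ring.

```python
def layout(node, start=0.0, extent=360.0, depth=0, path=()):
    """Yield (path, depth, start_angle, end_angle) for a node and all of
    its descendants.

    A node is a hypothetical (name, size, children) tuple. Children may
    sum to less than their parent's size; the remainder stays empty.
    """
    name, size, children = node
    path = path + (name,)
    yield path, depth, start, start + extent
    child_start = start
    for child in children:
        # Each child gets a share of the parent's extent proportional
        # to its size relative to the parent's own size.
        child_extent = extent * child[1] / size
        yield from layout(child, child_start, child_extent, depth + 1, path)
        child_start += child_extent


# Made-up sizes: "home" and "usr" cover only 90 of the root's 100 units.
TREE = ("/", 100, [
    ("home", 60, [("docs", 20, []), ("music", 10, [])]),
    ("usr", 30, []),
])

for path, depth, begin, end in layout(TREE):
    print(depth, "/".join(path), round(begin, 1), round(end, 1))
```

In this sample the second ring leaves its last 36 degrees empty, because the two children account for only 90 of the root's 100 units.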

Comment 10 Stéphane Guillou (stragu) 2024-01-29 03:31:19 UTC

Created attachment 192225
Comment 4 data visualised as sunburst in Excel

Oh and by the way, this is usually called "sunburst" and it is supported by Excel. I'd say we should add it, at least for compatibility.

Comment 11 Stéphane Guillou (stragu) 2024-01-29 03:33:39 UTC

Created attachment 192226
XLSX with sunburst chart created with online Office 365

Comment 12 Eyal Rozenberg 2024-01-29 09:06:25 UTC

(In reply to Stéphane Guillou (stragu) from comment #9)
> this is ... useful for when subcategories are exclusive to their 
> main category

Certainly.

> an example like attachment 191931 ...
> should be represented as a categorised bar plot 
> in which one single colour is used per subcategory

I'm not quite sure what you mean, but that chart also has categories which are exclusive to their main category. The choice of colors in attachment 191931 is kind of jarring, certainly.

> If it is implemented, I think it should also accommodate subcategories that
> don't sum up to 100%.

I'm going to make that into a separate bug, because this is a non-trivial difference in chart behavior, and one can easily see an implementation supporting refined-doughnuts without the sunburst logic. What do you think?

Comment 13 BogdanB 2024-12-22 06:49:51 UTC

I marked the bug as New as a Chart enhancement, at least for Excel compatibility.