Created attachment 113932 [details] wrong data labels when expressed in percentages Create a new Calc file and enter any numeric data in a column. Then select the data you have entered and create a chart from it, using the default type (bar chart is the name, I think). Before leaving chart edit mode, use the context menu to show data labels, so that the number each bar's height represents is also written above the bar. As expected, the specific number associated with each bar is shown. Now use the context menu to format the labels, and choose to show these numbers as percentages: this is where things go wrong, as you get "100%" above each and every bar, instead of the relative share that each number represents of the total sum. This is a regression: it did not happen with previous releases. Check the attached file for an example.
I just wanted to stress that this is not another way to look at the data, since seeing "100%" above each and all the bars, no matter their size, is completely uninformative.
Reproducible with LO 4.4.1.2, Win 8.1. This option only has an effect for pie charts. I am not sure whether this is a bug. For a pie chart it makes sense, but I am not sure whether it makes sense for other chart types. At the very least, the help should mention that this value is currently only calculated for pie charts.
Created attachment 113974 [details] test file Hello, this is not a bug. Percentage is calculated per column, see attachment. Specify version where it works as you want - I tested with LO 3.5 and behaviour is still the same.
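A minimal sketch of the per-column calculation described in this comment, written in Python and not taken from the LibreOffice source. The function name `percent_within_category` and the sample values are assumptions for illustration only. Each label is the value divided by the sum of all series at the same category position, which is why a chart with a single series shows 100% on every bar.

```python
def percent_within_category(series):
    """series: list of equal-length lists, one list per data series.

    Returns, for each series, every value as a percentage of the total
    of all series at the same category position (hypothetical helper).
    """
    n = len(series[0])
    # Total across all series for each category position.
    totals = [sum(s[i] for s in series) for i in range(n)]
    return [
        [100.0 * s[i] / totals[i] if totals[i] else 0.0 for i in range(n)]
        for s in series
    ]


# One series: each bar is the whole of its own category.
single = percent_within_category([[5, 10, 25]])

# Two series: each category's labels split 100% between the two bars.
two = percent_within_category([[20, 30], [60, 10]])
```

With these made-up inputs, `single` contains only 100.0 values, while `two` splits each category between its two bars.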
(In reply to raal from comment #3) > Created attachment 113974 [details] > test file > > Hello, > this is not a bug. Percentage is calculated per column, see attachment. > Specify version where it works as you want - I tested with LO 3.5 and > behaviour is still the same. Thanks for your feedback. So this is not a bug, but perhaps the help could be expanded with a few hints in the future, so that users can more easily understand how it works. For me this is not a real bug, because the help already says "Displays the percentage of the data points in each column." @Andy & raal: Would you agree to close this as Not a Bug?
Yes, it is true, this has always been this way; I probably mixed up some different memories. However, as a statistician, I must say that the way this works (giving percentages within each group of columns, as in your attachment) is not always what one would like to get: if the bar chart represents a frequency distribution, you expect percentages to tell you the relative frequency of each bar compared to the whole. Even when you have columns in pairs, as in your example, most of the time you are comparing 2 distributions, one made of all the blue rectangles and the other made of all the red ones. For example, the blue bars could represent the frequencies of age groups for males, and the red bars the corresponding distribution for females. In such a case, you would like to see the relative importance of a certain age group among males, and compare it with the same information for females. I would say this is the expected information at least as often as the internal composition of the age group between males and females, which is what is actually shown. The best thing, of course, would be to be able to choose between the two alternatives...
Dear Bug Submitter, This bug has been in NEEDINFO status with no change for at least 6 months. Please provide the requested information as soon as possible and mark the bug as UNCONFIRMED. Due to regular bug tracker maintenance, if the bug is still in NEEDINFO status with no change in 30 days the QA team will close the bug as INVALID due to lack of needed information. For more information about our NEEDINFO policy please read the wiki located here: https://wiki.documentfoundation.org/QA/FDO/NEEDINFO If you have already provided the requested information, please mark the bug as UNCONFIRMED so that the QA team knows that the bug is ready to be confirmed. Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team This NEEDINFO message was generated on: 2015-10-14
The request was about specifying an old version where the behaviour was different, I think. I am sorry, but I am not really able to fulfill it, so I cannot say whether it is a regression or not. However, as I tried to explain, data labels in % format applied to a bar chart with a single series of bars all read 100%, which makes no sense; and even when there is more than one bar series, the % would be more useful if it measured the relevance of a bar within its series, not, as it does now, of a bar among the bars related to the same item. Of course, the best thing would be to be able to choose. Let me give an example: suppose you are representing the distribution of some group with respect to political opinion; say this can be left, center-left, center-right or right. If you want to show this with a bar chart, and 23% of the group say they're center-left oriented, you would like to see this figure at the top of the bar. Now suppose you have counted people by political opinion, distinguishing males from females; you will have two bars for center-left, one for males and the other for females. If you want to compare the opinions of males and females, you are more interested to know that, AMONG MALES, say 18% are center-left, while the same ratio AMONG FEMALES is instead higher, say 29%. What you will get now instead are 2 figures telling you that, AMONG THOSE WHO ARE CENTER-LEFT, 18/(18+29) = 38,3% are males and 61,7% are females. This could be of some interest as well, but the first piece of information is IMHO more important, and it is the one expected in such a case, I would say.
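The two readings contrasted in this comment can be sketched side by side in Python. Only the center-left figures (18 and 29) come from the comment; the other counts, the group size of 100 respondents per gender, and both function names are assumptions made so that the example is runnable.

```python
opinions = ["left", "center-left", "center-right", "right"]

# Hypothetical counts, 100 respondents per gender; only the
# center-left values (18 and 29) are taken from the comment.
males = [30, 18, 32, 20]
females = [25, 29, 26, 20]


def share_within_series(values):
    """Percentage of each value relative to its own series total."""
    total = sum(values)
    return [round(100.0 * v / total, 1) for v in values]


def share_within_category(*series):
    """Percentage of each value relative to the total of its category."""
    totals = [sum(column) for column in zip(*series)]
    return [
        [round(100.0 * v / t, 1) for v, t in zip(s, totals)]
        for s in series
    ]


i = opinions.index("center-left")

# Requested behaviour: share of center-left AMONG MALES / AMONG FEMALES.
among_males = share_within_series(males)[i]
among_females = share_within_series(females)[i]

# Current behaviour: share of each gender AMONG THOSE WHO ARE CENTER-LEFT.
by_category = share_within_category(males, females)
males_of_center_left = by_category[0][i]
females_of_center_left = by_category[1][i]
```

Under these assumptions the within-series shares are 18.0 and 29.0, while the within-category shares are 38.3 and 61.7, matching the arithmetic in the comment. The match depends on both groups having the same size; with unequal groups the within-category figures would differ.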
Hi Andy, After reviewing this bug thoroughly I am closing it as NOTABUG for two reasons: 1) The charts work as designed; 2) There is a way to get the results the user is asking for; it just takes learning how to set up the charts correctly (for which I suggest going to the user mailing list or ask.libreoffice.org for help). I've attached a document demonstrating that, with the use case you suggested (men and women), it is in fact possible to get the % as expected.
Created attachment 119718 [details] Correct Chart
Hi Joel, and thanks for your attention. You're the master here, so I would not question your decision. However, I must say that the attachment you offered shows exactly what I find dubious. Let's look at the first chart: for example, the first data point for men has a count of 20 out of 300 males in total, while the same count for women is 12 out of a total of 128. Suppose the counts on this line are for leftist opinions: then the relevance of the left for men is 6,7%, while it is 9,4% for women. I re-attached the file, completing the table and showing in columns D and E the percentages I would expect in the graph, where instead we have the ratios between men and women for each item. At the same time, consider the case where the survey was made only on men, represented by the new CHART 3: here the % are not usable at all, while I would like to have 6,7%, 13,3% etc. instead of 100%, 100%,....
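As a quick arithmetic check of the figures quoted in this comment, the following Python sketch uses only the four numbers given above (20 of 300 men, 12 of 128 women). The variable names are illustrative, and the within-item figures are computed here rather than read from the attachment.

```python
men_count, men_total = 20, 300
women_count, women_total = 12, 128

# Share of this item within each gender (the figures asked for).
within_men = round(100.0 * men_count / men_total, 1)
within_women = round(100.0 * women_count / women_total, 1)

# Share of each gender within this item (what per-column labels compute).
item_total = men_count + women_count
men_of_item = round(100.0 * men_count / item_total, 1)
women_of_item = round(100.0 * women_count / item_total, 1)
```

Because the two groups differ in size, the two normalizations point in opposite directions here: the item carries more weight among women (9.4 versus 6.7), yet men make up the larger part of the item (62.5 versus 37.5).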
Created attachment 119719 [details] discussing the data in charts
Hey Andy, Let's try this one (new attachment) that I think does what you are requesting. If it doesn't, please come into the QA channel and let's talk live :) http://webchat.freenode.net/?channels=libreoffice-qa
Created attachment 119720 [details] Another Good File
Created attachment 119723 [details] includes chart 3 and 4 with workaround
I have seen your new graph, and must thank you for all the effort you're putting into this. Really. Nonetheless, I think it is still not convincing. Here is why: - first of all, I have to stress that my MAIN problem lies with the single-group bar chart, i.e. the one where data in percentages produces a full row of 100%, 100%, and so on. This, to me, is unquestionably inappropriate. - in the multiple-group bar chart case, to which you refer, things are less clear-cut. But your new solution is still not OK to me, because you swapped the roles of the classifier (gender) and of the items (political opinions in the example); now you have a 5-group bar chart with only two items, the sexes, and you have to infer the item to which each bar belongs by looking at the colors in the legend, instead of having the items shown under the x-axis. This is decidedly non-standard in the graphical representation of frequency data: THE standard is to have the items (categorical, as in this example, or numeric, e.g. income or whatever) shown under the x-axis and the classifier (gender) shown in the legend. However, I think we should close the debate, which is really taking too much of our time: in fact, a workaround to get what I want does exist. You just have to draw the chart directly from the percentages, after computing them in the sheet cells: in this way you get the same shapes in the chart, but you can show data labels without turning them into percentages, since the data are ALREADY %. This is shown in my revision of the attachment, with chart 3 showing the single-group case and chart 4 the double-group case. I still believe the behaviour of the program should be changed, but one cannot expect everything to always go according to one's own wishes, right?
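The workaround described above (compute the percentages first, then chart them as plain values) can be sketched in Python. The helper name `to_percent_column` is hypothetical, and apart from the first count of 20 and the total of 300 mentioned earlier in the thread, the counts are made up for illustration.

```python
def to_percent_column(counts, decimals=1):
    """Convert raw counts into percentages of their own total.

    The result is meant to be charted directly, with plain value
    labels instead of the chart's own percentage option.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot compute shares of an all-zero column")
    return [round(100.0 * c / total, decimals) for c in counts]


# Hypothetical single-series counts chosen to sum to 300.
men = [20, 40, 90, 100, 50]
men_percent = to_percent_column(men)
```

In a spreadsheet the same step would presumably be a helper column dividing each count by the column sum, for instance a formula along the lines of `=B2/SUM(B$2:B$6)` (cell references assumed), with the chart then built from that helper column.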