Bug 91548 - FEATURE REQUEST: analysis toolpak for Calc
Summary: FEATURE REQUEST: analysis toolpak for Calc
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
Depends on:
Blocks: Data-Statistics
Reported: 2015-05-23 21:20 UTC by Edmund Laugasson
Modified: 2023-09-29 20:01 UTC

Screenshot (36.76 KB, image/png)
2015-05-23 23:52 UTC, m.a.riosv

Description Edmund Laugasson 2015-05-23 21:20:57 UTC
Using 64-bit Linux Mint Cinnamon 17.1 with 4.0.4-040004-generic kernel
LibreOffice version:
Build ID: 0a16c3dda4150008d9be6f24cbd15ac198d116d3
Locale: et-EE (et_EE.UTF-8)

The long-awaited feature whose absence still keeps scientists from considering LibreOffice Calc is an analysis toolpak like the one in MS Excel. Scientists really do expect this; they just rarely say so in the LibreOffice Bugzilla.

Some links that show what it looks like in MS Excel:
* http://www.excel-easy.com/data-analysis/analysis-toolpak.html
* https://www.add-ins.com/Analysis_ToolPak.htm
* http://www.willamette.edu/~dnegri/courses/econ230/Data_Analysis_Toolpack_Guide.pdf
* http://www.microbiologybytes.com/maths/toolpak.html
* http://faculty-course.insead.edu/popescu/UDJCore/XtraMaterial/DATA%20ANALYSIS%20TOOLPAK.pdf

I would also propose closer contact with data analysis professors to ensure that all the needed functions exist.

I would also propose creating a dedicated chapter in Help that describes the analysis toolpak in LibreOffice Calc. Even if these functions are implemented one by one, as expected, a single covering chapter in Help would still be needed to bring all data analysis information together in one place.
Comment 1 m.a.riosv 2015-05-23 23:52:57 UTC
Created attachment 115921 [details]
Screenshot

LibreOffice:
Version: Build ID: 8a35821d8636a03b8bf4e15b48f59794652c68ba

Some of them have been available since 4.2.
Comment 2 Edmund Laugasson 2015-05-24 03:48:02 UTC
(In reply to m.a.riosv from comment #1)
> Created attachment 115921 [details]
> Screenshot
> LibreOffice:
> Version: Build ID: 8a35821d8636a03b8bf4e15b48f59794652c68ba
> Some of them have been available since 4.2.

That's true, but if you look at the analysis toolpak in MS Excel you will see the full feature set. For example, MS Excel already offers the ANOVA test in three different variants, while LibreOffice Calc offers only one, and so on.
The same goes for the t-test. A lot of functions are still missing.

I have also proposed another feature set for scientists, such as Cronbach's Alpha - Bug 85318. However, Cronbach's Alpha can already be used in MS Excel.
I would propose implementing all of these reliability functions - http://www.real-statistics.com/reliability/ - and beyond that the whole Real Statistics pack from http://www.real-statistics.com/. LibreOffice Calc would then be better than MS Excel and definitely a good choice for scientists, because these Real Statistics functions are missing even from MS Excel's Data Analysis ToolPak but are extremely important in today's science.
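
To illustrate what such a reliability function computes: Cronbach's alpha for k items is k/(k-1) * (1 - sum of the item variances / variance of the total score). Below is a minimal Python sketch of that formula, using sample variances and made-up scores. The name cronbach_alpha is only illustrative; it is not an existing Calc or Real Statistics function.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: one list of scores per questionnaire item, all of equal
    length (one entry per respondent). Hypothetical helper for
    illustration only.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items need the same number of respondents")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Sample variance of the respondents' total scores.
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up scores: 3 items answered identically by 5 respondents,
# so the items are perfectly consistent with each other.
scores = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
]
alpha = cronbach_alpha(scores)
```

With perfectly consistent items, as in this toy data, alpha comes out at 1; real questionnaire data gives lower values.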

The current data analysis set in LibreOffice Calc is definitely a good start, but scientists still say that it is not enough for them (they are especially looking for the Real Statistics functions described above). I am not yet fully familiar with all statistics functions, as I am not a data analysis professor, but as far as I know there are not enough tools for scientists. This prevents universities and other scientific institutions from adopting LibreOffice, so they keep using MS Excel.

In Help->Index I could find the entries "statistics functions" and "data statistics", but it would be very helpful to have one covering chapter that brings all of that information together.
Currently there is a great chapter, "Database functionality", in the Calc part of Help.
I would propose creating a "Data Analysis functionality" chapter in the Calc part of Help that covers all aspects of data analysis, including reliability, which is extremely important in today's science.
Comment 3 Edmund Laugasson 2015-05-24 04:01:04 UTC
I would emphasize the Real Statistics pack - http://www.real-statistics.com/ - as it contains really helpful functions that are badly needed in today's science: not only reliability but other functions as well.

Then we could say, as that website does: why do statistical analysis in LibreOffice Calc? Because it already has all the necessary functions built in!
We cannot say that yet, but the proposal is to improve the whole statistics and data analysis part so that we could.

Then we could even say: why prefer LibreOffice Calc to MS Excel? Because LibreOffice Calc has not only all the analysis toolpak functionality but also all the Real Statistics functionality, which MS Excel does not have by default.

There is also a Real Statistics examples workbook available - http://www.real-statistics.com/free-download/real-statistics-examples-workbook/

As there are scientists behind http://www.real-statistics.com/, it might be possible to cooperate with them, which would make it easier to implement that toolpak in LibreOffice Calc as well.

As the pack can be downloaded at http://www.real-statistics.com/free-download/real-statistics-resource-pack/, it would be possible to study it and implement its functions in Calc.
Comment 5 Edmund Laugasson 2015-05-24 12:32:25 UTC
Well, in that case this bug could serve as a meta bug that collects all the other data analysis bugs, or rather feature requests. The reason is that those other bugs do not cover all of these aspects. I would also say that I am not a math or statistics professor and am not familiar with all of these functions and their details, but I know two packages (the analysis toolpak and the Real Statistics pack) that contain all the requested data analysis features.

I am speaking on behalf of all scientists: they will not even look at LibreOffice unless its data analysis is at least on the same level as in MS Excel.

The point is that there is in fact a plan to find resources (money, people) to create this functionality, but it is not yet clear when that will happen. Some kind of H2020 project would be nice. Not only statistics and data analysis but all LibreOffice functionality should be tested, and teaching materials created. The appropriate teachers and professors would then be involved, and the missing features would be described better. As I am not a mathematician, it would not be a good idea for me to describe all of this myself. But all of you who receive messages from this bug could help me. I will try to do my part as well.

I guess we would need to set up an MS Excel instance and look inside it, then install the Real Statistics package as well. What could be done, I guess, is to copy the help text and examples (if any) from MS Excel, take some screenshots of how it looks, and create a bug report, or rather a feature request, for each function. Then there would at least be a record of the existing functionality. If programmers want to implement these functions, the actual behaviour would need to be tested and the results compared with MS Excel. A mathematician must then be available to give an expert assessment of the results.
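
The comparison step could look roughly like the Python sketch below. It computes a one-way ANOVA F statistic for toy data and checks it against a reference value within a tolerance. All names (one_way_anova_f, compare_results) are hypothetical, and the reference value here is worked out by hand for the toy data; in practice it would be copied from the other application's output.

```python
import math
from statistics import mean


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA (hypothetical helper)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def compare_results(computed, reference, rel_tol=1e-9, abs_tol=1e-12):
    """Return (name, computed, reference) for every value that is
    missing or differs from the reference by more than the tolerance."""
    mismatches = []
    for name, expected in reference.items():
        actual = computed.get(name)
        if actual is None or not math.isclose(
            actual, expected, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatches.append((name, actual, expected))
    return mismatches


# Toy data: group means 2, 3, 4, so SS_between = 6 (df 2) and
# SS_within = 6 (df 6), which gives F = 3 by hand.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
computed = {"F": one_way_anova_f(groups)}
reference = {"F": 3.0}  # placeholder for a value taken from the other program
mismatches = compare_results(computed, reference)
```

An empty mismatches list means the two results agree within the tolerance; anything else is a candidate for the expert review mentioned above.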

I guess a similar kind of session should be organized, like the LibreOffice bug hunting sessions, where the appropriate teachers and professors are also involved in creating these functions together with examples and tutorials.

60-day MS Office and 90-day MS Windows trial versions are legally available at http://www.microsoft.com/en-us/evalcenter/
Preinstalled virtual machines are also available at http://dev.modern.ie/tools/vms/ - the same trial versions, but as ready-made virtual machines, also for Linux and Mac OS.

I would propose leaving the status as unconfirmed for now, but if you have better ideas, let us hear them.
Comment 6 m.a.riosv 2015-05-24 22:38:46 UTC
(In reply to Edmund Laugasson from comment #5)
> .....
> I am speaking on behalf of all scientists: they will not even look at
> LibreOffice unless its data analysis is at least on the same level as in MS
> Excel.
> .....

I guess the world's scientists are in a strong position, especially in the universities, to encourage and collaborate with computer science departments and their students to develop such analysis packages for LibreOffice. Since LibreOffice is an open project, I have no doubt that all contributions are welcome.
Comment 7 Edmund Laugasson 2015-05-25 08:44:33 UTC
Well, scientists might be busy, or satisfied with MS Excel as it is, and so do not bother with this. Unfortunately most scientists do not care about software freedom, but if a ready set of free tools already existed and somebody told them about it, they would use it. That kind of caring starts with decision-makers, and it can only start once there is a free alternative to the MS Excel analysis toolpak and the Real Statistics pack. Then it would be possible to show the decision-makers that completely free tools exist and that research money can be spent on something better (its real purpose) than buying software licences.
This is definitely a call to everyone in universities who is close to both developers and scientists to implement the analysis and Real Statistics toolpaks in LibreOffice Calc.
Comment 8 Jean-Baptiste Faure 2015-07-12 18:27:19 UTC
Set status to NEW because it is a valid enhancement. That does not mean it will be implemented.

Note: I have many colleagues who do statistical computations on a daily basis, and they would never use a spreadsheet for that. They use R + RStudio, which are free, open-source, cross-platform software. So I am not sure it is a good idea to duplicate effort by implementing complex statistical functions in LibreOffice.

Best regards. JBF
Comment 9 Edmund Laugasson 2015-07-13 18:23:19 UTC
Why does MS Excel have it, then? From what I have heard, a lot of scientists use MS Excel precisely because it has statistics built in, and avoid LibreOffice for the same reason. R might be good, but not all scientists are ready to learn to write R code, and there are many such scientists. That is why I filed this feature request: there is strong demand from them. As .csv is a common output format of many programs, it is quite logical to import such files into a spreadsheet program. Why not use a spreadsheet if its learning curve is much gentler than R's? Many universities still do not teach R but do teach spreadsheets (including their statistics features). Good statistics features would considerably improve the chances of LibreOffice spreading more widely.
Comment 10 Stéphane Guillou (stragu) 2023-09-29 20:01:41 UTC
This report is hardly actionable, as we can't possibly have exact parity with e.g. MS Office. We do have a number of analyses in Data > Statistics, which was the initial request, and the corresponding help page is https://help.libreoffice.org/latest/en-US/text/scalc/01/statistics.html
Plus, we now have meta bug 111310 to track specific issues and enhancements related to our data statistics toolbox.
I agree with raal, let's close this, and open more precise requests blocking bug 111310, for example missing methods.
Thank you!