Bug 126573 - Add array functions in Calc: FILTER, SORT, SORTBY, UNIQUE, SEQUENCE, RANDARRAY
Summary: Add array functions in Calc: FILTER, SORT, SORTBY, UNIQUE, SEQUENCE, RANDARRAY
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: high enhancement
Assignee: Not Assigned
URL:
Whiteboard: Interoperability
Keywords:
: 127809 147224 (view as bug list)
Depends on:
Blocks: ODF-spec Excel-Functions
  Show dependency treegraph
 
Reported: 2019-07-26 22:10 UTC by Gerry
Modified: 2022-09-10 16:52 UTC (History)
21 users (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Gerry 2019-07-26 22:10:04 UTC
Description:
There are some array features in spreadsheets which recently have been added to Excel. These functions make lots of sense and they remind me of statistical software (e.g. Stata) which have very similar functions:

FILTER, SORT, SORTBY, UNIQUE, SEQUENCE, RANDARRAY

https://support.office.com/en-us/article/new-array-functions-003df6c7-1dcb-4388-8e2e-0fe77a0887bc?ui=en-US&rs=en-US&ad=US

Steps to Reproduce:
.

Actual Results:
.

Expected Results:
.


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Eike Rathke 2019-08-05 10:45:35 UTC
Note that those functions aren't fully specified yet, for example https://support.office.com/en-us/article/RANDARRAY-function-21261e55-3bec-4885-86a6-8b0a47fd4d33 says:

Note: January 14, 2019: The RANDARRAY function is one of several beta features, and currently only available to a portion of Office Insiders at this time. We'll continue to optimize these features over the next several months. When they're ready, we'll release them to all Office Insiders, and Office 365 subscribers.


Also, it needs to be investigated how the spilling situations are actually handled and what happens if due to recalculation the return array size varies. This is not just new spreadsheet functions but a new behaviour that calculating a function can change spreadsheet content.

Also restrictions like such formula can be edited only at the top left of its array need to be implemented, though we maybe could handle that differently as long as the edited formula affects the range as if it was edited at the top left. Our current array formula display/edit behaviour already is somewhat similar.
Comment 2 raal 2019-10-02 18:46:32 UTC
*** Bug 127809 has been marked as a duplicate of this bug. ***
Comment 3 Jürgen Kirsten 2020-06-14 11:51:18 UTC
I would like to support Gerry's cause.

In my opinion also the XLOOKUP function would be very helpful. It returns an array if needed.

XLOOKUP, FILTER, SORT, UNIQUE 

are my wishes.

Juergen
Comment 4 flywire 2021-06-13 10:10:04 UTC Comment hidden (me-too)
Comment 5 Philip Sch 2021-07-04 10:06:17 UTC
These are not just nice to have anymore, some or all of those functions are already used in Google Sheets for a long time and needed for intercompatability, and are not Office Insiders exclusive anymore.
Comment 7 PremPa 2021-07-28 09:59:07 UTC Comment hidden (me-too)
Comment 9 Jacob H 2022-05-11 16:59:50 UTC
I add my support for these functions.

Over the years I've come across the limitations of array handling a number of times. Most often I've found wacky workarounds with compromises on functionality and likely performance. However, there are instances where the limitations are simply too constricting making the ideal handling of arrays impossible. I'll share the example I'm currently facing later, but provide a general use case in favor of the need of these functions.

I'll start by saying that dynamic Array manipulation is critical in handling data and has so many use cases. In my opinion the dynamic handling of arrays is the only remaining missing link in both function and form, at least as far as data handling in Calc.

One general limitation is that you can't take 2 arrays and interweave the data filtered on criteria, and then sorted to be in a proper order. Libreoffice Calc has the tools to do this manually, but if the document has ongoing changes, that manual effort becomes extraordinarily time consuming and tedious.

-----------
The rest of this post will detail real world limitations
-----------

So my current example, which could be solved by these functions being implemented, goes as follows:

I have a budget/finance system which has separate sheets as ledgers for income, expenses, transfers between accounts. There are then other sheets which extrapolate as much data as possible from those ledgers allowing you to see things like totals from different income types, various expenses broken down, averages per month, detailed account information, stock information, and on and and and on. (This requires some not as wacky array workarounds, but it works.)

Now stocks are where some wacky array workarounds start. If I want to pluck the purchase of a stock out of the Expenses Sheet (and all related information), I need a named range/formula expression called, "ExpensesToStock" with: IF(ExpensesType="Stock">0,(ExpensesType="Stock")*(ROW(ExpensesType)),"") Where "ExpensesType" is the column titled Type in the Expenses Sheet. "ExpensesToStock" is now an array of row numbers where that data is in the Expenses Sheet, and blank spaces where the data isn't. Then I remove the blank spaces on a separate configuration sheet by grabbing the "Small" of those row numbers out of "ExpensesToStock" by putting this function in consecutive rows: {=IFERROR(SMALL(ExpensesToStock,ROW($A1)),"")} Each consecutive row then increments as a variable for the small. (ROW($A1),ROW($A2),...) Now that I have a list of rows where the data is, I can run an "Indirect" function to display all of the relevant data on a separate sheet called "Stocks." That's a filtered ledger of only stocks that I can then do additional calculations on to provide even more stock relevant data.

^ TL;DR, Functions automatically copy each Stock line out of the "Expenses" sheet into the "Stocks" sheet and calculate additional important information related to the Stocks.

However, here is where the limitations of Calc array handling arise. Stocks aren't just an expense. Stocks can be an expense, income, or a merging between companies. If I wanted to make a dynamic ledger to determine proper data (total stocks, total basis, etc.) I need to evaluate both the purchase, sale, and transfer from the income and expenses sheet for all stock related information. You can imagine that the stock purchase and sale date are pretty important for the order of calculating. 

Combining both the purchase and sale of stocks plucked from 2 different sheets into a single date sorted ledger would be trivial and automatic with a filter and sort array function, but currently impossible in Calc short of constant manual efforts.
Comment 10 Roman Kuznetsov 2022-05-17 15:52:37 UTC Comment hidden (obsolete)
Comment 11 Roman Kuznetsov 2022-05-17 15:53:45 UTC
(In reply to Roman Kuznetsov from comment #10)
> Please don't change the Summary. There is a different bug 127293 for XLOOKUP
> implementation
Comment 12 Mike Kaganski 2022-05-22 09:06:24 UTC
*** Bug 147224 has been marked as a duplicate of this bug. ***
Comment 13 Rafael Lima 2022-07-07 20:25:30 UTC Comment hidden (me-too)