Bug 168149 - Want the ability to apply SQL queries to a selected range
Summary: Want the ability to apply SQL queries to a selected range
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Database-Queries Database
  Show dependency treegraph
 
Reported: 2025-08-27 22:58 UTC by Eyal Rozenberg
Modified: 2025-09-20 10:27 UTC (History)
2 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Eyal Rozenberg 2025-08-27 22:58:52 UTC
One of the shortcomings which has always bothered me with MS Excel, and then Calc,  is that it puts the ability to apply complex queries to data far away from the user. While you can kinda, sorta, somehow, make it happen (there is some sort of embedded DBMS in there after all, right?) - it is certainly not easily accessible.

It would be quite useful if a user could select a range of cells, and considering it as a relational table - apply an SQL query to it, with the results populating a new sheet or someplace else on the same sheet.

There can be a lot of bikeshedding here: How this table should be named, whether to consider the first row to contain column names, what to do with repeated records, what dialect of SQL to use, whether the query is remembered after use or not, whether the results are dynamically linked to the source or not (maybe both options are interesting) and so on and so forth. 

But we could put ourselves squarely ahead of MS Excel for this aspect of data anlysis if we were to offer this functionality.
Comment 1 Heiko Tietze 2025-08-28 07:37:50 UTC
SQL query in Calc? And how do you think is Calc blocking Database/Database-Queries?

Cannot wrap my mind around this idea. You mean something like <FUNCTION> at <RANGE> for <TARGET> like "SUM at A1:A10 for B1". Why would SQL be suitable at all, thinking of Python/Basic etc.
Comment 2 Eyal Rozenberg 2025-08-28 11:28:12 UTC
(In reply to Heiko Tietze from comment #1)
> SQL query in Calc?

We have tables in calc, and we often want to generate new tables with derivative data. Pivot Tables work for simple cases, but not beyond that. But pivot tables have a lot of code supporting their dynamism; if we stick to just generating contents of cells, it should not be difficult - UI-wise at least - to allow for complex SELECT queries.

> And how do you think is Calc blocking Database/Database-Queries?

My "database" in calc is my selection. Tell me, how do I query it?

> Cannot wrap my mind around this idea. You mean something like <FUNCTION> at
> <RANGE> for <TARGET> like "SUM at A1:A10 for B1".

No, not a function. That might be an interesting direction, but I mean something more straightforward than that.

Example: You enter the following values

+---------+------------------+
| name    |  favourite color |
+---------+------------------+
| Alice   |  blue            |
+---------+------------------+
| Bob     |  green           |
+---------+------------------+
| Charlie |  pink            |
+---------+------------------+
| Daria   |  green           |
+---------+------------------+
| Eve     |  green           |
+---------+------------------+

Then, you select this cell range, then invoke an UNO command for applying an SQL query. In the dialog that opens up, you enter:

SELECT t1.name, t2.name
FROM selection_ AS t1, selection_ AS t2
WHERE t1.'favourite color' = t2.'favourite color' AND t1.name < t2.name;

and you also toggle the checkboxes "use first row as column names", and "results as new sheet" and press ok.

You now get a new sheet which contains

+---------+------------+
| t1.name |  t2.name   |
+---------+------------+
| Bob     | Daria      |
+---------+------------+
| Daria   | Eve        |
+---------+------------+
| Bob     | Eve        |
+---------+------------+
Comment 3 Regina Henschel 2025-08-28 11:32:33 UTC
The user can work with the embedded database as source instead of the spreadsheet. Then he can link to the tables and queries in the spreadsheet document. That gives the user the full SQL ability of a database and the complex calculations and charts abilities of a spreadsheet at the same time.

I do not support to extend the misuse of spreadsheets as databases.
Comment 4 Eyal Rozenberg 2025-08-28 11:37:58 UTC
(In reply to Regina Henschel from comment #3)
> The user can work with the embedded database as source instead of the
> spreadsheet.

The user has their data in a spreadsheet. They should not have to give that up just to run a query.

> Then he can link to the tables and queries in the spreadsheet
> document. 

But that's redundant and doesn't help the user who just wants to execute a query and get a table.

> That gives the user the full SQL ability of a database and the
> complex calculations and charts abilities of a spreadsheet at the same time.

But it's not necessary. 

> I do not support to extend the misuse of spreadsheets as databases.

I'm not suggesting people use spreadsheets as databases, I just want to apply a query or a transformation to data in a table, that's all.
Comment 5 Regina Henschel 2025-08-28 11:40:31 UTC
(In reply to Eyal Rozenberg from comment #2)

> SELECT t1.name, t2.name
> FROM selection_ AS t1, selection_ AS t2
> WHERE t1.'favourite color' = t2.'favourite color' AND t1.name < t2.name;

Why a complex SELECT statement where a simple filter (function or database range feature) is sufficient?
Comment 6 Eyal Rozenberg 2025-08-28 11:44:28 UTC
(In reply to Regina Henschel from comment #5)
> (In reply to Eyal Rozenberg from comment #2)
> 
> > SELECT t1.name, t2.name
> > FROM selection_ AS t1, selection_ AS t2
> > WHERE t1.'favourite color' = t2.'favourite color' AND t1.name < t2.name;
> 
> Why a complex SELECT statement where a simple filter (function or database
> range feature) is sufficient?

A simple filter is not sufficient, since I'm producing _pairs_ of records/rows: All pairs of people who like the same color.
Comment 7 Alex Thurgood 2025-08-28 14:03:01 UTC
This sounds like Excel's PowerQuery function ? I see from a cursory glance in the corresponding menu in Excel that you can choose an Excel sheet as the datasource with this functionality.
Comment 8 Eyal Rozenberg 2025-08-28 23:39:24 UTC
(In reply to Alex Thurgood from comment #7)
> This sounds like Excel's PowerQuery function ? I see from a cursory glance
> in the corresponding menu in Excel that you can choose an Excel sheet as the
> datasource with this functionality.

PowerQuery seems to involve a lot of UI rather than a simple dialog; and it's not contextual and on-the-fly'ish. And TBH I've not actually explored everything it can do. Let's call this feature low-key query :-)
Comment 9 Heiko Tietze 2025-09-09 09:55:30 UTC
(In reply to Regina Henschel from comment #3)
> I do not support to extend the misuse of spreadsheets as databases.
Me neither => WF
Comment 10 Eyal Rozenberg 2025-09-09 10:09:35 UTC
(In reply to Heiko Tietze from comment #9)
> (In reply to Regina Henschel from comment #3)
> > I do not support to extend the misuse of spreadsheets as databases.
> Me neither => WF

As I've noted above, this is not about using a spreadsheet as a database, so the premise for your opinion is invalid.

Please explain why you believe users should not be able to apply a query, in a query language intended for use on tables, on their table. In what way is that a "misuse"? It is a straightforward desire, which these days, people are forced to simulate using multiple pivottable generation, copying, and more pivot-tabling (and that's if they're lucky).
Comment 11 Heiko Tietze 2025-09-09 11:30:05 UTC
(In reply to Eyal Rozenberg from comment #10)
> As I've noted above, this is not about using a spreadsheet as a database, so
> the premise for your opinion is invalid.
From Wikipedia: Structured Query Language (SQL) is a domain-specific language used to manage data, especially in a relational database management system.
Comment 12 Eyal Rozenberg 2025-09-09 11:44:10 UTC
(In reply to Heiko Tietze from comment #11)
> From Wikipedia: Structured Query Language (SQL) is a domain-specific
> language used to manage data, especially in a relational database management
> system.

Indeed, it says "especially" in relational DBMSes, not "solely", or even "almost exclusively". Because SQL is more general than just that.

Now, SQL actually has several "sub-languages": For defining data, for altering data, etc. 

https://www.geeksforgeeks.org/sql/sql-ddl-dql-dml-dcl-tcl-commands/

What I'm talking about is the DQL, data querying language, or in other words - SELECT commands. These commands don't manage data, nor alter it; and I gave an example in an earlier comment.

You might also take a look at:

https://dev.to/qbentil/use-sql-without-databases-54on

for some inspiration (although that is not strictly SQL).
Comment 13 Eyal Rozenberg 2025-09-20 10:27:11 UTC
Bug 162202 makes a similar request, with some restrictions: Single-cell-level rather than a multi-cell-range level; and a limitation on the kinds of SQL queries one can make.

The limitation is actually a bit arbitrary, becaujse one could, in principle, place QUERY functions in every cell in a table - but that would be either much more expensive or require the overhead of realizing all of those cells are invoking the samr query, just taking different bits of the result.