Bug 142600 - Link to External Data Source dialogue should list HTML tables in order
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 6.2.5.2 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium enhancement
Assignee: Andreas Heinisch
URL:
Whiteboard: target:7.3.0
Keywords:
Depends on:
Blocks: Calc-External-Datalink
 
Reported: 2021-06-01 12:19 UTC by stragu
Modified: 2021-07-31 22:00 UTC

See Also:
Crash report or crash signature:


Description stragu 2021-06-01 12:19:38 UTC
Description:
When using Sheet > Link to External Data... and pointing to a website that has more than 9 tables, the order of tables in the dialogue is not the same as in the source.

Steps to Reproduce:
1. Open Calc
2. Go to "Sheet > Link to External Data..."
3. Paste a URL in "URL of External Data Source", for example https://en.wikipedia.org/wiki/QS_World_University_Rankings
4. Press Enter on the keyboard
5. Use defaults in the "Import Options" dialog and click "OK".

Actual Results:
The tables are listed in the order:

HTML_1
HTML_10
HTML_11
HTML_12
HTML_2
HTML_3
etc.

Expected Results:
The tables should be listed in the same order as in the HTML page's source code.


Reproducible: Always


User Profile Reset: No



Additional Info:
This matters because several tables can be picked for import. A user who picks a range or cherry-picks a number of them (using Ctrl + click) would expect to see them imported into the spreadsheet in the same order as in the original source.

A cumbersome workaround would be to:
- Import table 1
- Import the ones you want between 2 and 9
- Import the ones you want between 10 and 12

Alternatively, import HTML_tables and remove the ones you don't want to keep.

Tested on 3 versions:

Version: 7.2.0.0.alpha1+ / LibreOffice Community
Build ID: e718f0e703c0fb33a0b1b5efe7b13b02c25f3335
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-05-30_21:49:59
Calc: threaded

Version: 7.1.3.2 / LibreOffice Community
Build ID: 47f78053abe362b9384784d31a6e56f8511eb1c1
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Version: 7.0.4.2
Build ID: 00(Build:2)
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Ubuntu package version: 1:7.0.4_rc2-0ubuntu0.18.04.2
Calc: threaded
Comment 1 Buovjaga 2021-06-03 14:24:45 UTC
Confirmed. Looks like LibO is sorting them alphanumerically before listing them.
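The alphanumeric-sort explanation can be checked with a small sketch. It is not LibreOffice code: it only assumes that the dialog names tables HTML_1 to HTML_12 and that the numeric suffix reflects the position in the page source. The helper `table_index` is a hypothetical name introduced for illustration.

```python
# Names for a page with 12 tables (assumed naming scheme: HTML_<n>).
names = [f"HTML_{i}" for i in range(1, 13)]

# Plain string comparison works character by character,
# so "HTML_10" sorts before "HTML_2".
lexicographic = sorted(names)


def table_index(name: str) -> int:
    """Return the numeric suffix of a name such as 'HTML_10' (hypothetical helper)."""
    return int(name.rsplit("_", 1)[1])


# Sorting on the numeric suffix restores the order of appearance,
# under the assumption that the suffix reflects document position.
document_order = sorted(names, key=table_index)

print(lexicographic[:6])
print(document_order[:6])
```

The first list starts with HTML_1, HTML_10, HTML_11, HTML_12, HTML_2, HTML_3, which matches the order in the report.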

Adding Andreas as he expressed interest in bug 127484

Arch Linux 64-bit
Version: 7.1.3.2 / LibreOffice Community
Build ID: 10(Build:2)
CPU threads: 8; OS: Linux 5.12; UI render: default; VCL: kf5
Locale: fi-FI (fi_FI.UTF-8); UI: fi-FI
7.1.3-1
Calc: threaded
Comment 2 Andreas Heinisch 2021-06-03 18:38:29 UTC
Should we just sort them by order of appearance or even add a right click context menu? Or would that be a little bit overkill?
Comment 3 stragu 2021-06-04 00:12:39 UTC
What would the right-click menu give as options, Andreas?
Comment 4 Andreas Heinisch 2021-06-04 05:15:30 UTC
Something like a "Sort" entry with a submenu offering two options: alphabetically and by actual order.
Comment 5 stragu 2021-06-04 07:17:44 UTC
Wondering if this option would be better offered with a table format, with column headings:

name | caption | position in document

And the user can click on the headings to sort whichever way they want.

But that might mean "HTML_all" and "HTML_tables" need to be outside that table?
Comment 6 Andreas Heinisch 2021-06-04 07:44:31 UTC
Hm, not all tables may have a caption, and even HTML_all and HTML_tables need special handling. IMHO, including all these sort options and headings is a bit overkill. Maybe we should just sort by document order, but maybe someone can create a small mockup of how to display the tables.
Comment 7 stragu 2021-06-05 09:42:55 UTC
I'd be happy with a simple solution as a first step! :)
i.e. in order of appearance by default, and "HTML_tables" and "HTML_all" listed separately at the bottom.
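A minimal sketch of that simple ordering, for illustration only. It is not the proposed patch; the function name `order_for_dialog` and the handling of names that do not match the HTML_<n> pattern are assumptions.

```python
import re

# Assumed naming scheme for individual tables: HTML_<n>.
_NUMBERED = re.compile(r"^HTML_(\d+)$")
# Aggregate entries, in the order they should appear at the bottom.
_AGGREGATES = ("HTML_tables", "HTML_all")


def order_for_dialog(names):
    """Numbered tables by index first, aggregate entries last (hypothetical helper)."""
    numbered = []
    others = []
    for name in names:
        match = _NUMBERED.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
        elif name not in _AGGREGATES:
            # Assumption: unrecognised names keep their incoming order.
            others.append(name)
    numbered.sort()
    aggregates = [name for name in _AGGREGATES if name in names]
    return [name for _, name in numbered] + others + aggregates


print(order_for_dialog(["HTML_1", "HTML_10", "HTML_2", "HTML_all", "HTML_tables"]))
```

Keeping the two aggregate names out of the numeric sort avoids having to give them an artificial index.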
Comment 8 Andreas Heinisch 2021-06-06 21:53:14 UTC
Proposed patch: https://gerrit.libreoffice.org/c/core/+/116767/1

Maybe there is a better way to iterate over the indexes in the range names.
Comment 9 stragu 2021-06-10 03:52:11 UTC
Thanks Andreas for designing a solution.
Unfortunately, I don't have the expertise to review the change!
Comment 10 Buovjaga 2021-06-10 05:30:58 UTC
Minor note: the keyword "patch" is used when someone attaches a patch to the report or pastes it into a comment.
Comment 11 stragu 2021-07-08 06:44:22 UTC
Also reproduced in earlier version 6.2.5:

Version: 6.2.5.2
Build ID: 1ec314fa52f458adc18c4f025c545a4e8b22c159
CPU threads: 4; OS: Linux 5.4; UI render: default; VCL: gtk3; 
Locale: en-AU (en_AU.UTF-8); UI-Language: en-US
Calc: threaded
Comment 12 Commit Notification 2021-07-21 15:46:49 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/462f9d1f589a7afd66d3fc61925467d3b68e5b31

tdf#142600 - List tables in order of their appearance

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 BogdanB 2021-07-23 19:29:02 UTC
Perfect. Thanks for fixing this bug.

Verified in
Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: 612d5b1a04fe022a34018d901bb9b052791d54e5
CPU threads: 4; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: ro-RO (ro_RO.UTF-8); UI: en-US
Calc: threaded
Comment 14 stragu 2021-07-28 12:55:38 UTC
Fabulous, thank you Andreas!

Verified as well in:

Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: 1dd4a80fa076bedb3a82821517036bad8dd79857
CPU threads: 4; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-07-26_22:41:19
Calc: threaded