Bug 149166 - Checking lots of remotely linked images takes a long time
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 7.4.4.2 release
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: perf
Depends on:
Blocks: Calc-External-Datalink
Reported: 2022-05-19 05:41 UTC by Matthew Millar
Modified: 2023-07-26 19:53 UTC

See Also:
Crash report or crash signature:


Attachments
13,000+ HTML table with images present (29.28 MB, application/vnd.oasis.opendocument.spreadsheet)
2022-05-19 05:52 UTC, Matthew Millar
13,000+ HTML table with images manually removed (25.35 MB, application/vnd.oasis.opendocument.spreadsheet)
2022-05-19 05:54 UTC, Matthew Millar
Reduced example (897.51 KB, application/vnd.oasis.opendocument.spreadsheet-flat-xml)
2023-01-16 09:51 UTC, Buovjaga

Description Matthew Millar 2022-05-19 05:41:07 UTC
Description:
Copying and pasting a modern 13,000+ row HTML table, with multiple divs per cell for styling and images, takes multiple hours to paste and another hour to save.

The issue appears to be related to the images, because editing the fods file in Notepad++ and removing the <table:shapes> element solves the performance issue.
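
The manual <table:shapes> removal described above could also be scripted. The following is a minimal Python sketch, not the method actually used for the attachments: strip_shapes, SAMPLE and the placeholder-shape element are hypothetical names invented for illustration, and the demonstration runs on a tiny in-memory fragment rather than the real 53MB file. One caveat: ElementTree rewrites namespace prefixes when serializing, so a real fods file would likely need more careful handling before being written back.

```python
import xml.etree.ElementTree as ET

# ODF table namespace, as declared on the root of a flat ODS (fods) file.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
SHAPES_TAG = f"{{{TABLE_NS}}}shapes"


def strip_shapes(root):
    """Remove every <table:shapes> element in place; return how many were removed."""
    # Collect (parent, child) pairs first so the tree is not modified while walking it.
    targets = [
        (parent, child)
        for parent in root.iter()
        for child in parent
        if child.tag == SHAPES_TAG
    ]
    for parent, child in targets:
        parent.remove(child)
    return len(targets)


# Hypothetical miniature sheet: one shapes container and one ordinary row.
SAMPLE = f"""<table:table xmlns:table="{TABLE_NS}" table:name="Sheet1">
  <table:shapes><placeholder-shape/></table:shapes>
  <table:table-row><table:table-cell/></table:table-row>
</table:table>"""

sample_root = ET.fromstring(SAMPLE)
removed_count = strip_shapes(sample_root)
print(removed_count, "shapes element(s) removed")
```

The row data is left untouched; only the floating shapes container that holds the linked images is dropped.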

Perhaps the default HTML table paste should remove images, with an option to either embed the images within the document as base64 data, or use an image cache/multi-threaded URI resolver?
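
To make the "image cache/multi-threaded URI resolver" idea concrete, here is a rough Python sketch of the general technique. It says nothing about how LibreOffice resolves links internally. check_links and fake_fetch are hypothetical names, the example.invalid URLs are made up, and the fetch step is passed in as a function so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def check_links(urls, fetch, max_workers=8):
    """Resolve each distinct URL once, in parallel; return {url: fetch result}."""
    # De-duplicate while keeping first-seen order: this is the "cache" part,
    # so an icon referenced by many rows is only fetched a single time.
    unique = list(dict.fromkeys(urls))
    # Run the fetches on a small thread pool instead of one after another.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, unique))
    return dict(zip(unique, results))


# Stand-in for a real network check (assumption: returns True when reachable).
calls = []


def fake_fetch(url):
    calls.append(url)
    return not url.endswith("/denied.png")


links = [
    "https://example.invalid/a.png",
    "https://example.invalid/denied.png",
    "https://example.invalid/a.png",  # repeated link, should not be fetched twice
]
status = check_links(links, fake_fetch)
print(status)
```

With 13,000+ rows that share a limited set of icons, de-duplication alone could cut the number of requests considerably, and the thread pool overlaps the waiting time of the requests that remain.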

Steps to Reproduce:
--Only attempt when you don't need LibreOffice for the next several hours--
1. Launch Microsoft Edge and navigate to CoinGecko "All Cryptocurrencies" list - https://www.coingecko.com/en/coins/all
2. Click "Show More" approximately 50 times, because modern websites are counterintuitive.
3. Highlight all rows and copy them
4. In Calc, select cell A1 and paste
--Note: notice that the icons are also no longer embedded within the cells, but rather floated to coordinates approximately inline with the row--
5. Save and wait for another hour, for a  (compressed)/53MB (uncompressed) file
--Note: Autosave is the devil-incarnate, banish it before attempting--

Actual Results:
Mind-numbing slowness

Expected Results:
Not mind-numbing slowness


Reproducible: Always


User Profile Reset: No



Additional Info:
By comparison, opening the uncompressed fods file in Notepad++ and saving it as a new file took approximately 10 seconds.
Comment 1 Matthew Millar 2022-05-19 05:52:36 UTC
Created attachment 180207
13,000+ HTML table with images present

13,000+ HTML table with images present and load/save performance issues.

No performance difference between fods and standard ods.
Comment 2 Matthew Millar 2022-05-19 05:54:49 UTC
Created attachment 180208
13,000+ HTML table with images manually removed

13,000+ HTML table with images (<table:shapes> element) removed and load/save performance issues resolved.
Comment 3 Matthew Millar 2022-05-19 06:42:37 UTC
Additionally, it would be useful to introduce an open/save interrupt (ESC), so that people can cancel an operation rather than being forced to wait hours for the entire productivity suite to become usable again. The only alternative at present is to terminate the entire process and risk losing data.
Comment 4 Buovjaga 2023-01-16 09:51:26 UTC
Created attachment 184681
Reduced example

Trimmed down example file, which still takes a bit of time.

I think it checks each image link individually. The images are not shown for me because apparently I would have to be logged into the site. It gives Access Denied otherwise.
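
One way to gauge how many individual checks that would mean is to count the remote image links in the flat ODS file. This is a minimal Python sketch under assumptions: remote_image_hosts and FODS_SAMPLE are hypothetical names, the sample markup and the assets.example.invalid host are invented, and it assumes linked images appear as draw:image elements carrying an xlink:href attribute.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlsplit

DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"


def remote_image_hosts(root):
    """Count http(s) image links per host; embedded or relative images are skipped."""
    hosts = Counter()
    for image in root.iter(f"{{{DRAW_NS}}}image"):
        href = image.get(f"{{{XLINK_NS}}}href", "")
        parts = urlsplit(href)
        if parts.scheme in ("http", "https"):
            hosts[parts.netloc] += 1
    return hosts


# Hypothetical fragment: two remote images on one host plus one embedded picture.
FODS_SAMPLE = f"""<root xmlns:draw="{DRAW_NS}" xmlns:xlink="{XLINK_NS}">
  <draw:frame><draw:image xlink:href="https://assets.example.invalid/1.png"/></draw:frame>
  <draw:frame><draw:image xlink:href="https://assets.example.invalid/2.png"/></draw:frame>
  <draw:frame><draw:image xlink:href="Pictures/local.png"/></draw:frame>
</root>"""

host_counts = remote_image_hosts(ET.fromstring(FODS_SAMPLE))
print(host_counts)
```

A large count concentrated on a single host that answers Access Denied would be consistent with the slowdown coming from per-link checks.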

I think there are similar reports about loading remote images.
Comment 5 Buovjaga 2023-01-16 09:54:47 UTC
Arch Linux 64-bit, X11
Version: 7.4.4.2 / LibreOffice Community
Build ID: 40(Build:2)
CPU threads: 8; OS: Linux 6.1; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
7.4.4-1
Calc: threaded