Bug 108631 - ENHANCEMENT: Optimization of the file-save strategy
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: (earliest affected) Inherited From OOo
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
Keywords: perf
Depends on:
Blocks: Too-Much-File-Access
Reported: 2017-06-19 07:37 UTC by Telesto
Modified: 2019-04-19 11:01 UTC (History)

See Also:
Crash report or crash signature:

Example file (3.20 MB, application/vnd.oasis.opendocument.spreadsheet)
2017-06-19 07:38 UTC, Telesto

Description Telesto 2017-06-19 07:37:35 UTC
I'm not a developer, so I have no clue why it works the way it does. However, saving large files does not seem to be very efficient, especially for large spreadsheets (or Writer documents).

When saving the sample file, around 450 MB is written to disk for every regular save and every auto-save, apparently as some sort of caching mechanism before saving. The exported file is only around 3.19 MB. This caching mechanism will use up an SSD's write cycles pretty fast.
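To put the two figures above side by side, here is a small back-of-the-envelope sketch. It only divides the observed write volume by the size of the saved file; the function name is made up for illustration and the numbers are the approximate values from this report, not precise measurements.

```python
def write_amplification(written_mb: float, saved_file_mb: float) -> float:
    """Ratio of data written to disk during a save to the size of the resulting file."""
    return written_mb / saved_file_mb


# Approximate figures from this report: ~450 MB written, ~3.19 MB saved file.
ratio = write_amplification(450, 3.19)
print(f"roughly {ratio:.0f}x more data written than the file contains")
```

With these numbers the ratio comes out at roughly 141, i.e. each save writes on the order of a hundred times the size of the final file.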

Another oddity is that the file gets loaded again after saving. This looks quite inefficient to me, especially for large files.

Excel only writes the necessary data to the disk (as far as I know).

Steps to Reproduce:
1. Open the attached file
2. Save a copy and monitor disk usage (with Process Explorer)

Actual Results:  
- Around 450 MB gets written to the disk
- The saved file gets reloaded 

Expected Results:
- Fewer writes to the disk
- No reload (or not to this extent)

Reproducible: Always

User Profile Reset: No

Additional Info:
Build ID: cbf371e07fd5dea1ea08a1f299360d1273961ebd
CPU threads: 4; OS: Windows 6.19; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-06-14_23:13:57
Locale: en-US (nl_NL); Calc: CL

Also reproducible in 3.0.0.

User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
Comment 1 Telesto 2017-06-19 07:38:22 UTC
Created attachment 134124
Example file
Comment 2 Aron Budea 2017-06-20 00:08:26 UTC
Confirmed with LO 5.4beta2.
For me it's even more than 450 MB, something close to the ~580 MB of content.xml.

I'm not touching severity, but I'd rather consider this a performance bug.
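
For anyone who wants to check the content.xml size mentioned above on their own copy of the attachment: an .ods file is a zip package, so the compressed and uncompressed size of each entry can be read with Python's standard zipfile module. This is only a sketch; the function name is made up, and the file name in the usage note is a placeholder for wherever the attachment was saved.

```python
import sys
import zipfile


def entry_sizes(path_or_file):
    """Return {entry name: (compressed bytes, uncompressed bytes)} for a zip package."""
    with zipfile.ZipFile(path_or_file) as package:
        return {
            info.filename: (info.compress_size, info.file_size)
            for info in package.infolist()
        }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the largest entries first (sorted by uncompressed size).
    sizes = entry_sizes(sys.argv[1])
    for name, (packed, unpacked) in sorted(sizes.items(), key=lambda kv: -kv[1][1]):
        print(f"{name}: {packed} bytes compressed, {unpacked} bytes uncompressed")
```

Run it as `python entry_sizes.py example.ods` (placeholder names). If the amount written during a save is close to the uncompressed size of content.xml, that would fit the observation above.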
Comment 3 Aron Budea 2017-06-22 22:28:17 UTC
Eike and Markus told me it's likely something done in generic storage code, and not Calc-specific.
Comment 4 Aron Budea 2017-08-04 20:40:35 UTC
This commit should be relevant (clue from Markus):
author		Matúš Kukan <matus.kukan@collabora.com>	2014-12-13 23:11:53 (GMT)
committer	Matúš Kukan <matus.kukan@collabora.com>	2014-12-13 23:21:20 (GMT)

"package: Better to use temporary files for huge memory zip streams

ZipPackageBuffer was holding the whole compressed data stream in one uno::Sequence which seems to be a lot for big documents in some cases."
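
To illustrate the trade-off the commit message describes (memory use versus disk writes), here is a minimal sketch of a buffer that stays in memory up to a threshold and then spills to a temporary file. It is not the LibreOffice implementation, which is C++; the class name, the threshold and the `on_disk` flag are all made up for this example.

```python
import io
import tempfile


class SpillingBuffer:
    """Hypothetical helper: keeps data in memory until `threshold` bytes, then moves it to a temp file."""

    def __init__(self, threshold: int = 1024 * 1024):
        self.threshold = threshold
        self._stream = io.BytesIO()
        self.on_disk = False

    def write(self, data: bytes) -> int:
        # Spill once the in-memory buffer would grow past the threshold.
        if not self.on_disk and self._stream.tell() + len(data) > self.threshold:
            spill = tempfile.TemporaryFile()
            spill.write(self._stream.getvalue())
            self._stream = spill
            self.on_disk = True
        return self._stream.write(data)

    def read_all(self) -> bytes:
        # Read everything back, then restore the write position.
        position = self._stream.tell()
        self._stream.seek(0)
        data = self._stream.read()
        self._stream.seek(position)
        return data

    def close(self) -> None:
        self._stream.close()


buf = SpillingBuffer(threshold=8)
buf.write(b"abcd")     # 4 bytes: still in memory
buf.write(b"efghij")   # 10 bytes total: exceeds the threshold, spills to disk
print(buf.on_disk, buf.read_all())
buf.close()
```

The point of the sketch: once the data is on disk, every byte of the stream has been written to disk at least once before the final file exists, which saves memory but adds to the total write volume this report is about.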
Comment 5 Buovjaga 2019-04-19 11:01:29 UTC
https://bugs.documentfoundation.org/show_bug.cgi?id=113042#c24 mentions plans to work on zip compression. Perhaps this aspect can be dealt with as well.