Bug 108631

Summary: ENHANCEMENT: Optimization of the file-save strategy
Product: LibreOffice
Reporter: Telesto <telesto>
Component: filters and storage
Assignee: Not Assigned <libreoffice-bugs>
Status: NEW
Severity: enhancement
CC: aron.budea, ilmari.lauhakangas
Priority: medium
Keywords: perf
Version: Inherited From OOo   
Hardware: All   
OS: All   
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=84246
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 108636    
Attachments: Example file

Description Telesto 2017-06-19 07:37:35 UTC
Description:
I'm not a developer, so I have no idea why it works the way it does. However, saving large files does not seem very efficient, especially large Calc spreadsheets (or Writer documents).

When saving the sample file, around 450 MB is written to the disk on every regular save and auto-save, apparently by some sort of caching mechanism that runs before the save. The exported file is only around 3.19 MB. Writing this much temporary data on every save will use up an SSD's write cycles fairly quickly.

Another oddity is that the file gets loaded again after saving. This looks quite inefficient to me, especially for large files.

Excel only writes the necessary data to the disk (as far as I know).

Steps to Reproduce:
1. Open the attached file
2. Save a copy and monitor disk writes (e.g. with Process Explorer)

Actual Results:  
- Around 450 MB gets written to the disk
- The saved file gets reloaded 

Expected Results:
- Fewer writes to the disk
- No reload (or not to this extent)


Reproducible: Always

User Profile Reset: No

Additional Info:
Version: 6.0.0.0.alpha0+
Build ID: cbf371e07fd5dea1ea08a1f299360d1273961ebd
CPU threads: 4; OS: Windows 6.19; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-06-14_23:13:57
Locale: en-US (nl_NL); Calc: CL

Also reproducible in 3.0.0


User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
Comment 1 Telesto 2017-06-19 07:38:22 UTC
Created attachment 134124
Example file
Comment 2 Aron Budea 2017-06-20 00:08:26 UTC
Confirmed with LO 5.4beta2.
For me it's even more than 450 MB, something close to the ~580 MB of content.xml.

I'm not touching severity, but I'd rather consider this a performance bug.
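
To compare the size of content.xml with the size of the saved package, the zip entries can be listed with their uncompressed and compressed sizes. Below is a minimal Python sketch, assuming the document is an ordinary zip package. `entry_sizes` and `build_demo_package` are hypothetical helper names, and the demo uses a small in-memory stand-in rather than the attached file:

```python
import io
import zipfile


def entry_sizes(package_file):
    """Return {entry name: (uncompressed bytes, compressed bytes)} for a zip package."""
    with zipfile.ZipFile(package_file) as package:
        return {info.filename: (info.file_size, info.compress_size)
                for info in package.infolist()}


def build_demo_package():
    """Build a small in-memory stand-in for an .ods file (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # Highly repetitive XML compresses very well, much like a large content.xml.
        row = b"<table:table-row><table:table-cell/></table:table-row>"
        package.writestr("content.xml", row * 20000)
    buffer.seek(0)
    return buffer


sizes = entry_sizes(build_demo_package())
for name, (raw, packed) in sizes.items():
    print(f"{name}: {raw} bytes uncompressed, {packed} bytes compressed")
```

Passing the path of a real .ods file to `entry_sizes` instead of the demo buffer should show whether the amount written during save is closer to the uncompressed or the compressed size.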
Comment 3 Aron Budea 2017-06-22 22:28:17 UTC
Eike and Markus told me it's likely something done in generic storage code, and not Calc-specific.
Comment 4 Aron Budea 2017-08-04 20:40:35 UTC
This commit should be relevant (clue from Markus):
https://cgit.freedesktop.org/libreoffice/core/commit/?id=f92183833fa569006602ac7e93c906d2094e0d4d
author		Matúš Kukan <matus.kukan@collabora.com>	2014-12-13 23:11:53 (GMT)
committer	Matúš Kukan <matus.kukan@collabora.com>	2014-12-13 23:21:20 (GMT)

"package: Better to use temporary files for huge memory zip streams

ZipPackageBuffer was holding the whole compressed data stream in one uno::Sequence which seems to be a lot for big documents in some cases."
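
The general technique named in that commit message (keeping a compressed stream in memory only up to a limit, then moving it to a temporary file) can be sketched in Python as follows. This is an illustration only, not the LibreOffice C++ code. `compress_to_spooled` and `MAX_IN_MEMORY_BYTES` are hypothetical names, and the threshold value is an assumption, since the report does not state the one LibreOffice uses:

```python
import tempfile
import zlib

# Hypothetical threshold; the value LibreOffice uses is not stated in this report.
MAX_IN_MEMORY_BYTES = 1024 * 1024


def compress_to_spooled(chunks, max_in_memory=MAX_IN_MEMORY_BYTES):
    """Deflate chunks into a buffer that moves to a temp file once it outgrows the limit."""
    out = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    compressor = zlib.compressobj()
    for chunk in chunks:
        # Only the compressed bytes are buffered, never the whole input at once.
        out.write(compressor.compress(chunk))
    out.write(compressor.flush())
    out.seek(0)
    return out


payload = [b"<cell/>" * 1000 for _ in range(100)]
stream = compress_to_spooled(payload)
restored = zlib.decompress(stream.read())
stream.close()
print(len(b"".join(payload)), "bytes in,", len(restored), "bytes restored")
```

With a scheme like this, only the compressed data would reach the disk, and only once it exceeds the limit. That the writes observed here are close to the uncompressed size suggests the temporary data in this bug is produced at a different stage.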
Comment 5 Buovjaga 2019-04-19 11:01:29 UTC
https://bugs.documentfoundation.org/show_bug.cgi?id=113042#c24 mentions plans to work on zip compression. Perhaps this aspect can be dealt with as well.