Bug 137308 - import newer / zstandard as zip-format
Summary: import newer / zstandard as zip-format
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: difficultyInteresting, easyHack, skillCpp
Depends on:
Blocks:
 
Reported: 2020-10-07 10:56 UTC by paulystefan
Modified: 2022-07-01 01:32 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Description paulystefan 2020-10-07 10:56:59 UTC
Zstandard is a fast compression format, available as a compression method inside ZIP containers since version 6.3.8 of the ZIP specification.

It is much faster than the older ZIP compression methods.

https://en.wikipedia.org/wiki/Zip_(file_format)

https://facebook.github.io/zstd/

And dictionary compression is ideal for text applications.

Zstandard is a fast compression algorithm, providing high compression ratios. It also offers a special mode for small data, called dictionary compression. The reference library offers a very wide range of speed / compression trade-off, and is backed by an extremely fast decoder (see benchmarks below). Zstandard library is provided as open source software using a BSD license. Its format is stable and published as IETF RFC 8478


There is no recovery record like in WinRAR or Parchive, but it is faster.
Comment 1 Mike Kaganski 2020-10-07 11:18:25 UTC
Sigh. Please try reading before submitting ideas - especially when your previous idea (tdf#137305) has a reply with a link to the relevant part of the standard.

Specifically, [1]:

> An OpenDocument Package shall meet the following requirements:
> A) It shall be a Zip file, as defined by [ZIP]. All files contained in the
> Zip file shall be non compressed (STORED) or compressed using the “deflate”
> (DEFLATED) algorithm.

The standard does not allow any other algorithms. Doing otherwise would make the package non-conformant/invalid.

Your suggestion is not in the right place, again. You should submit it to OASIS, with the accompanying analysis of the gains that you expect to get, for typical documents (with text and images), both size-wise and performance-wise. It is WONTFIX here until the standard changes.

[1] http://docs.oasis-open.org/office/OpenDocument/v1.3/OpenDocument-v1.3-part2-packages.html#__RefHeading__752791_826425813
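
As an illustration of the requirement quoted above, here is a minimal Python sketch that lists the entries of a package whose compression method is neither STORED (0) nor DEFLATED (8). It is not LibreOffice code. The names `nonconformant_entries` and `build_demo_package` are hypothetical and chosen for this example only, and the demo package is a toy, not a complete ODF document.

```python
import io
import zipfile

# Methods allowed by the quoted requirement: 0 (STORED) and 8 (DEFLATED).
ALLOWED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def nonconformant_entries(package):
    """Return (name, method) pairs for entries using a disallowed method.

    `package` is a path or a binary file-like object. Only the central
    directory is read, so unknown methods do not have to be decodable.
    """
    with zipfile.ZipFile(package) as zf:
        return [
            (info.filename, info.compress_type)
            for info in zf.infolist()
            if info.compress_type not in ALLOWED_METHODS
        ]


def build_demo_package():
    """Build a tiny in-memory ZIP with one stored and one deflated entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("content.xml", "<office:document-content/>",
                    compress_type=zipfile.ZIP_DEFLATED)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(nonconformant_entries(build_demo_package()))
```

A package that contained, say, a Zstandard-compressed stream would show up in the returned list, which is the non-conformance the comment describes.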
Comment 2 paulystefan 2020-10-07 11:25:35 UTC
But this is part of your work to improve LO.

Are the LO developers not in contact with the OASIS organisation?

This enhancement is not for tomorrow, but for next year, or the next decade or century, or millions of years from now.
Comment 3 Michael Meeks 2020-10-07 11:38:39 UTC
Let me re-open this and turn it into an easy hack.

Clearly we should be able to import files in newer ZIP formats such as zstandard.

My hope would be that libzip would (eventually) be able to do that for us; but anyhow - the inflater code is here - it explicitly uses inflate.

https://git.libreoffice.org/core/+/refs/heads/master/package/source/zipapi/Inflater.cxx

Possibly we will want to include a new external module for zstd (I guess) to re-use that code, and to add that to readlicense_oo; the BSD license seems fine.

Beyond that I think we'll want read support widely deployed for some years - before getting this into the standard, and then some more years before turning it on by default.

Contributions much appreciated =)
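
The import-side change discussed above amounts to choosing a decompressor by the entry's ZIP compression method instead of always inflating. A rough Python sketch of that dispatch follows, assuming method ID 93 for Zstandard as assigned in recent revisions of the ZIP specification. It does not reflect the actual structure of Inflater.cxx. The function name `decompress_entry` is hypothetical, and the Zstandard branch is deliberately left unimplemented because it would need an external zstd library.

```python
import zlib

STORED = 0     # entry bytes are kept as-is
DEFLATED = 8   # raw DEFLATE stream, the only compressed form ODF allows
ZSTD = 93      # Zstandard (assumed ID, per recent ZIP specification revisions)


def decompress_entry(method, raw):
    """Decode one ZIP entry payload according to its compression method."""
    if method == STORED:
        return raw
    if method == DEFLATED:
        # ZIP stores raw DEFLATE data without a zlib header, hence wbits=-15.
        return zlib.decompress(raw, -15)
    if method == ZSTD:
        # Placeholder: a real importer would call into a zstd library here.
        raise NotImplementedError("Zstandard needs an external zstd library")
    raise ValueError(f"unsupported ZIP compression method: {method}")


if __name__ == "__main__":
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
    payload = compressor.compress(b"hello") + compressor.flush()
    print(decompress_entry(DEFLATED, payload))
```

Keeping the method check in one place means read support for a new method can be added without touching the stored and deflated paths.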
Comment 4 Mike Kaganski 2020-10-07 11:49:44 UTC
However, see also ISO/IEC 21320-1 "Document Container File — Part 1: Core", which, according to the Wikipedia article mentioned in comment 0, requires that "Files in ZIP archives may only be stored uncompressed, or using the "deflate" compression (i.e. compression method may contain the value "0" - stored or "8" - deflated)".
Comment 5 paulystefan 2020-10-07 15:32:16 UTC
Zstandard is up to 4 times faster on modern hardware, and offers more compression options.

So users could have a standard-conformant ZIP ODT mode for archiving (the current one)

and a fast ZIP ODT mode with Zstandard for working.

Zstandard compression could also be used for fast internal autosaving to a second, parallel working file during a session.

Huge files are the main target for this improvement.
Comment 7 Mike Kaganski 2020-10-16 10:49:21 UTC
(In reply to paulystefan from comment #6)
> so kernel starts 4 times faster.

Please don't turn this request into an unmanageable advertising board (and no, "kernel starts 4 times faster" is wrong: 4x improvements were reported for decompression, while boot time improvements were much more modest).
Comment 8 paulystefan 2020-10-26 15:32:07 UTC
OK.

I only want to show the current approach of other open source communities in this area.

For all compression and decompression within LibreOffice, modern codecs like Zstandard offer potential.

With LibreOffice 7, support for some old software and hardware without modern CPU features has been dropped.
So more is possible in this field, also now that SSDs with 500 MB/s and more are the norm instead of HDDs with 50 to 100 MB/s.

Benchmarks on this can be found by searching the internet for "7zip zstandard" and similar terms.
Comment 9 paulystefan 2022-07-01 01:32:11 UTC
Zstandard (current version 1.5.2) is perhaps also a possibility in the near future for the LO installation files, for smaller size and faster installation.