Bug 60700 - de-cruftify ODF files ...
Summary: de-cruftify ODF files ...
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice (show other bugs)
Version:
(earliest affected)
4.0.0.3 release
Hardware: Other All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: reviewed:2022
Keywords: difficultyInteresting, easyHack, skillCpp, topicCleanup
Depends on:
Blocks: ODF-export-invalid
  Show dependency treegraph
 
Reported: 2013-02-11 21:54 UTC by Michael Meeks
Modified: 2023-08-07 02:30 UTC (History)
9 users (show)

See Also:
Crash report or crash signature:


Attachments
Template with toolbar inside (9.57 KB, application/vnd.oasis.opendocument.text-template)
2013-03-02 23:23 UTC, Regina Henschel
Details
test file saved in LibreOffice master with extra zero byte files (13.23 KB, application/vnd.oasis.opendocument.text)
2014-03-17 09:25 UTC, László Németh
Details
Bash script to test odts for zero byte files (225 bytes, text/plain)
2014-03-17 09:40 UTC, László Németh
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Meeks 2013-02-11 21:54:45 UTC
LibreOffice likes to produce a certain amount of bogosity in it's files.

If you create a blank writer document and save it you see:

unzip -l /tmp/empty.odt
Archive:  /tmp/empty.odt
  Length      Date    Time    Name
---------  ---------- -----   ----
       39  2013-02-11 21:48   mimetype
      906  2013-02-11 21:48   meta.xml
     9863  2013-02-11 21:48   settings.xml
     3241  2013-02-11 21:48   content.xml
      728  2013-02-11 21:48   Thumbnails/thumbnail.png
      899  2013-02-11 21:48   manifest.rdf
        0  2013-02-11 21:48   Configurations2/images/Bitmaps/
        0  2013-02-11 21:48   Configurations2/popupmenu/
        0  2013-02-11 21:48   Configurations2/toolpanel/
        0  2013-02-11 21:48   Configurations2/statusbar/
        0  2013-02-11 21:48   Configurations2/progressbar/
        0  2013-02-11 21:48   Configurations2/toolbar/
        0  2013-02-11 21:48   Configurations2/menubar/
        0  2013-02-11 21:48   Configurations2/accelerator/current.xml
        0  2013-02-11 21:48   Configurations2/floater/
    11252  2013-02-11 21:48   styles.xml
     1086  2013-02-11 21:48   META-INF/manifest.xml
---------                     -------
    28014                     17 files

Having a load of empty 'Configurations2/' directories - that - might (but don't) hold something useful is a bit sad / pointless.

It would be nice to remove (or not create) those directories unless they are needed. This probably has something twisted to do with the framework/ code hereabouts:

framework/inc/uiconfiguration/moduleuiconfigurationmanager.hxx:    class ModuleUIConfigurationManager

with this sort of thing:

framework/inc/uielement/uielementtypenames.hxx:#define UIELEMENTTYPE_FLOATINGWINDOW_NAME   "floater"

One gotcha to bear in mind is that the package2 (etc.) APIs we use to create and store streams are -horribly- awful; and creating sub-directories is really deeply unpleasant in packages. The lifecycle / 'commit' calling etc. on the interfaces is really poor - which perhaps is the reason why it is as bad as it is. Nevertheless if we can detect and avoid creating at least the sub-directories of Configurations2 we're winning I feel :-)

Thanks !
Comment 1 Regina Henschel 2013-03-02 17:46:17 UTC
The folder should hold the configuration, which is saved inside the document. That should be possible, but has errors in older versions and seems to be broken totally now. For details see https://issues.apache.org/ooo/show_bug.cgi?id=98123
Comment 2 Regina Henschel 2013-03-02 18:01:18 UTC
The discussion with Carsten Driesner, which is mentioned in the OO.o bug, is in http://www.mail-archive.com/search?l=users@de.openoffice.org&q=subject:%22[de-users]+Verkn%C3%BCpfung+Short+Cut+mit+Makro%22. But unfortunately it is in German.
Comment 3 Regina Henschel 2013-03-02 23:23:05 UTC
Created attachment 75808 [details]
Template with toolbar inside

To give you an example, where the Configuration2 folder contains something useful, I attach a template. It is a template for writing French documents. It contains a toolbar to insert the special French characters using macros. A document which is generated from this template, has the toolbar in Configuration2 folder too. So you can easily test, whether your solution to avoid the empty folder will not destroy the ability to store a toolbar inside a document.
Comment 4 Michael Meeks 2013-03-04 09:43:52 UTC
Hi Regina - thanks for your input ! :-)

I have no doubt that it is possible for these directories to hold something useful :-) surely they can.

However - when there is no keyboard customisation / configuration etc. (and I'm certain we havn't configured any keybindings / floaters etc. in this doc) - then we should not be creating empty directories that serve no useful purpose in the ODF file.

But of course, it'd be interesting to fixup those other problems too - thanks for the pointers ! :-)
Comment 5 Michael Meeks 2013-04-24 12:21:53 UTC
I read and tested Alex's patch here:

https://gerrit.libreoffice.org/#/c/3401/

And it appears to be a lot better - though we still end up with this empty directory:

0  Stored        0   0% 2013-04-24 08:31 00000000  Configurations2/images/Bitmaps/
0  Defl:N        2   0% 2013-04-24 08:31 00000000  Configurations2/accelerator/current.xml

But encouragingly we get the accelerators serialized as expected. I guess for a document opened read/write it is not needed to create those storages until the data is written; but ...

Any more explanation of the code change - much appreciated ...
Comment 6 Alex Ivan 2013-05-03 09:44:05 UTC
Hi,

First, sorry for such a late response.

The idea is that the storage elements are opened again in the UIConfigurationManager::storeToStorage method (same file), so, basically, if there is a need to store anything, they will be reopened as writeable. 

As you pointed out, my current patch does not purge these directories in files that already contain them, but are empty. 
I've found this documentation of the XStorage interface [1], but I am not quite sure how to go about determining if the storage element is empty. There surely is something I am missing. It would be very helpful if someone could point me in the right direction. 

[1]http://www.openoffice.org/api/docs/common/ref/com/sun/star/embed/XStorage.html
Comment 7 Michael Meeks 2013-05-03 10:30:52 UTC
Ok - you convinced me :-) good work - I pushed it.

To further improve this to delete empty directories you'd want to queryInterface on the XStorage for:

http://www.openoffice.org/api/docs/common/ref/com/sun/star/container/XNameAccess.html

and call getElementNames - clearly if there are none - it is empty; and then do some removeElement on the XStorage interface to clean them out :-)

Of course, that'd need some testing; and particularly with the flat-ODF filters: not sure how they work in this regard (prolly fine but... save as '.fodt' or '.fods' etc.).

Anyhow thanks for the nice fix ! and sorry for the slow review cycle.
Comment 8 Andras Timar 2013-11-18 12:18:16 UTC
commit da06166015689eca260c702602bef4cea58afbd3 caused a regression, it is not possible to save new custom toolbars and it is not possible to modify existing custom toolbars any longer. See bug 69500. Benefits of this commit do not outweigh the harm it causes, so it's better to revert it.
Comment 9 Michael Meeks 2013-11-21 10:23:34 UTC
As we re-fix this, we should clearly add a regression test for saving a toolbar =) Alex - any thoughts ?
Comment 10 Aron Szabo 2014-03-12 21:57:06 UTC
Beyond the matter whether it is nice or not, 0 byte-sized files mean a blocking problem at applying digital signature technology according to the related standards.

The ODF v1.2 says, that "Document signatures shall contain a <ds:Reference> element for each file within the package [...]". For interoperability reasons it is also highly recommended to use "ASiC-E with XAdES/CAdES" format of ETSI TS 102 918 standard which requires the signature to cover all the elements of the package. But in case of having 0 byte-sized files in the files-to-be-hashed-and-signed set, a strict CEN CWA 14170 requirement can not be met which says, that it is a threat, if a "signature is generated to a 'null' document".

Possible solutions for this issue are: e.g. simply deleting 0 byte-sized files from the package, or inserting some "dont-care" content into them such as XML comments, or an XML root element.
Comment 11 László Németh 2014-03-17 09:23:27 UTC
I have made a search for the zero byte files with the attached script. The result: Formulas with modified style has got zero current.xml files, too, and LibreOffice uses a zero file to store default dialog language. See the following files in the attached test document:

Object 1/Configurations2/accelerator/current.xml     
Dialogs/Standard/DialogStrings_en_US.default
Comment 12 László Németh 2014-03-17 09:25:46 UTC
Created attachment 95917 [details]
test file saved in LibreOffice master with extra zero byte files

check with unzip -v
Comment 13 László Németh 2014-03-17 09:40:19 UTC
Created attachment 95918 [details]
Bash script to test odts for zero byte files

This is for a quick test using locate to test all ODTs of the system. The script lists the invalid test files of xmlsecurity and qa, too:

 1 /home/laci/git/libo/sfx2/qa/complex/sfx2/testdocuments/CUSTOM.odt: mimetype     
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_doc1.odt: Pictures/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro1.odt: Basic/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro2.odt: Basic/Standard/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro3.odt: Basic/New Folder/New Text Document.txt  
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro4.odt: Basic/Standard/New Folder/New Text Document.txt  
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo3_2_doc1.odt: META-INF/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo_3_2_doc2.odt: New Text Document.txt   
      1
Comment 14 Robinson Tryon (qubit) 2015-12-14 05:01:17 UTC Comment hidden (obsolete)
Comment 15 Robinson Tryon (qubit) 2016-02-18 14:51:35 UTC Comment hidden (obsolete)
Comment 16 Hossein 2022-11-24 16:31:14 UTC
Re-evaluating the EasyHack in 2022

This issue is still relevant. I have created a blank file with the latest LO 7.5 dev master, and the empty directories are still there.

As a fix was merged before, and caused a regression (see comment 8), extensive tests should be done to make sure that we do not break things including the backward compatibility.

Version: 7.5.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 4dc372c87b194c32597eda6a7f99ddcee3108fa6
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

$ unzip -l ~/blank.odt 
Archive:  /home/hossein/blank.odt
  Length      Date    Time    Name
---------  ---------- -----   ----
       39  2022-11-24 16:13   mimetype
        0  2022-11-24 16:13   Configurations2/accelerator/
        0  2022-11-24 16:13   Configurations2/images/Bitmaps/
        0  2022-11-24 16:13   Configurations2/toolpanel/
        0  2022-11-24 16:13   Configurations2/progressbar/
        0  2022-11-24 16:13   Configurations2/floater/
        0  2022-11-24 16:13   Configurations2/statusbar/
        0  2022-11-24 16:13   Configurations2/toolbar/
        0  2022-11-24 16:13   Configurations2/popupmenu/
        0  2022-11-24 16:13   Configurations2/menubar/
      899  2022-11-24 16:13   manifest.rdf
      865  2022-11-24 16:13   meta.xml
     3419  2022-11-24 16:13   content.xml
    13400  2022-11-24 16:13   settings.xml
    12121  2022-11-24 16:13   styles.xml
      297  2022-11-24 16:13   Thumbnails/thumbnail.png
     1061  2022-11-24 16:13   META-INF/manifest.xml
---------                     -------
    32101                     17 files