Bug 60700 - de-cruftify ODF files ...
Summary: de-cruftify ODF files ...
Status: ASSIGNED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice (show other bugs)
Version:
(earliest affected)
4.0.0.3 release
Hardware: Other All
: medium normal
Assignee: David Hashe
URL:
Whiteboard: reviewed:2022 target:25.8.0
Keywords: difficultyInteresting, easyHack, skillCpp, topicCleanup
Depends on:
Blocks: ODF-export-invalid
  Show dependency treegraph
 
Reported: 2013-02-11 21:54 UTC by Michael Meeks
Modified: 2025-04-29 06:20 UTC (History)
10 users (show)

See Also:
Crash report or crash signature:


Attachments
Template with toolbar inside (9.57 KB, application/vnd.oasis.opendocument.text-template)
2013-03-02 23:23 UTC, Regina Henschel
Details
test file saved in LibreOffice master with extra zero byte files (13.23 KB, application/vnd.oasis.opendocument.text)
2014-03-17 09:25 UTC, László Németh
Details
Bash script to test odts for zero byte files (225 bytes, text/plain)
2014-03-17 09:40 UTC, László Németh
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Meeks 2013-02-11 21:54:45 UTC
LibreOffice likes to produce a certain amount of bogosity in it's files.

If you create a blank writer document and save it you see:

unzip -l /tmp/empty.odt
Archive:  /tmp/empty.odt
  Length      Date    Time    Name
---------  ---------- -----   ----
       39  2013-02-11 21:48   mimetype
      906  2013-02-11 21:48   meta.xml
     9863  2013-02-11 21:48   settings.xml
     3241  2013-02-11 21:48   content.xml
      728  2013-02-11 21:48   Thumbnails/thumbnail.png
      899  2013-02-11 21:48   manifest.rdf
        0  2013-02-11 21:48   Configurations2/images/Bitmaps/
        0  2013-02-11 21:48   Configurations2/popupmenu/
        0  2013-02-11 21:48   Configurations2/toolpanel/
        0  2013-02-11 21:48   Configurations2/statusbar/
        0  2013-02-11 21:48   Configurations2/progressbar/
        0  2013-02-11 21:48   Configurations2/toolbar/
        0  2013-02-11 21:48   Configurations2/menubar/
        0  2013-02-11 21:48   Configurations2/accelerator/current.xml
        0  2013-02-11 21:48   Configurations2/floater/
    11252  2013-02-11 21:48   styles.xml
     1086  2013-02-11 21:48   META-INF/manifest.xml
---------                     -------
    28014                     17 files

Having a load of empty 'Configurations2/' directories - that - might (but don't) hold something useful is a bit sad / pointless.

It would be nice to remove (or not create) those directories unless they are needed. This probably has something twisted to do with the framework/ code hereabouts:

framework/inc/uiconfiguration/moduleuiconfigurationmanager.hxx:    class ModuleUIConfigurationManager

with this sort of thing:

framework/inc/uielement/uielementtypenames.hxx:#define UIELEMENTTYPE_FLOATINGWINDOW_NAME   "floater"

One gotcha to bear in mind is that the package2 (etc.) APIs we use to create and store streams are -horribly- awful; and creating sub-directories is really deeply unpleasant in packages. The lifecycle / 'commit' calling etc. on the interfaces is really poor - which perhaps is the reason why it is as bad as it is. Nevertheless if we can detect and avoid creating at least the sub-directories of Configurations2 we're winning I feel :-)

Thanks !
Comment 1 Regina Henschel 2013-03-02 17:46:17 UTC
The folder should hold the configuration, which is saved inside the document. That should be possible, but has errors in older versions and seems to be broken totally now. For details see https://issues.apache.org/ooo/show_bug.cgi?id=98123
Comment 2 Regina Henschel 2013-03-02 18:01:18 UTC
The discussion with Carsten Driesner, which is mentioned in the OO.o bug, is in http://www.mail-archive.com/search?l=users@de.openoffice.org&q=subject:%22[de-users]+Verkn%C3%BCpfung+Short+Cut+mit+Makro%22. But unfortunately it is in German.
Comment 3 Regina Henschel 2013-03-02 23:23:05 UTC
Created attachment 75808 [details]
Template with toolbar inside

To give you an example, where the Configuration2 folder contains something useful, I attach a template. It is a template for writing French documents. It contains a toolbar to insert the special French characters using macros. A document which is generated from this template, has the toolbar in Configuration2 folder too. So you can easily test, whether your solution to avoid the empty folder will not destroy the ability to store a toolbar inside a document.
Comment 4 Michael Meeks 2013-03-04 09:43:52 UTC
Hi Regina - thanks for your input ! :-)

I have no doubt that it is possible for these directories to hold something useful :-) surely they can.

However - when there is no keyboard customisation / configuration etc. (and I'm certain we havn't configured any keybindings / floaters etc. in this doc) - then we should not be creating empty directories that serve no useful purpose in the ODF file.

But of course, it'd be interesting to fixup those other problems too - thanks for the pointers ! :-)
Comment 5 Michael Meeks 2013-04-24 12:21:53 UTC
I read and tested Alex's patch here:

https://gerrit.libreoffice.org/#/c/3401/

And it appears to be a lot better - though we still end up with this empty directory:

0  Stored        0   0% 2013-04-24 08:31 00000000  Configurations2/images/Bitmaps/
0  Defl:N        2   0% 2013-04-24 08:31 00000000  Configurations2/accelerator/current.xml

But encouragingly we get the accelerators serialized as expected. I guess for a document opened read/write it is not needed to create those storages until the data is written; but ...

Any more explanation of the code change - much appreciated ...
Comment 6 Alex Ivan 2013-05-03 09:44:05 UTC
Hi,

First, sorry for such a late response.

The idea is that the storage elements are opened again in the UIConfigurationManager::storeToStorage method (same file), so, basically, if there is a need to store anything, they will be reopened as writeable. 

As you pointed out, my current patch does not purge these directories in files that already contain them, but are empty. 
I've found this documentation of the XStorage interface [1], but I am not quite sure how to go about determining if the storage element is empty. There surely is something I am missing. It would be very helpful if someone could point me in the right direction. 

[1]http://www.openoffice.org/api/docs/common/ref/com/sun/star/embed/XStorage.html
Comment 7 Michael Meeks 2013-05-03 10:30:52 UTC
Ok - you convinced me :-) good work - I pushed it.

To further improve this to delete empty directories you'd want to queryInterface on the XStorage for:

http://www.openoffice.org/api/docs/common/ref/com/sun/star/container/XNameAccess.html

and call getElementNames - clearly if there are none - it is empty; and then do some removeElement on the XStorage interface to clean them out :-)

Of course, that'd need some testing; and particularly with the flat-ODF filters: not sure how they work in this regard (prolly fine but... save as '.fodt' or '.fods' etc.).

Anyhow thanks for the nice fix ! and sorry for the slow review cycle.
Comment 8 Andras Timar 2013-11-18 12:18:16 UTC
commit da06166015689eca260c702602bef4cea58afbd3 caused a regression, it is not possible to save new custom toolbars and it is not possible to modify existing custom toolbars any longer. See bug 69500. Benefits of this commit do not outweigh the harm it causes, so it's better to revert it.
Comment 9 Michael Meeks 2013-11-21 10:23:34 UTC
As we re-fix this, we should clearly add a regression test for saving a toolbar =) Alex - any thoughts ?
Comment 10 Aron Szabo 2014-03-12 21:57:06 UTC
Beyond the matter whether it is nice or not, 0 byte-sized files mean a blocking problem at applying digital signature technology according to the related standards.

The ODF v1.2 says, that "Document signatures shall contain a <ds:Reference> element for each file within the package [...]". For interoperability reasons it is also highly recommended to use "ASiC-E with XAdES/CAdES" format of ETSI TS 102 918 standard which requires the signature to cover all the elements of the package. But in case of having 0 byte-sized files in the files-to-be-hashed-and-signed set, a strict CEN CWA 14170 requirement can not be met which says, that it is a threat, if a "signature is generated to a 'null' document".

Possible solutions for this issue are: e.g. simply deleting 0 byte-sized files from the package, or inserting some "dont-care" content into them such as XML comments, or an XML root element.
Comment 11 László Németh 2014-03-17 09:23:27 UTC
I have made a search for the zero byte files with the attached script. The result: Formulas with modified style has got zero current.xml files, too, and LibreOffice uses a zero file to store default dialog language. See the following files in the attached test document:

Object 1/Configurations2/accelerator/current.xml     
Dialogs/Standard/DialogStrings_en_US.default
Comment 12 László Németh 2014-03-17 09:25:46 UTC
Created attachment 95917 [details]
test file saved in LibreOffice master with extra zero byte files

check with unzip -v
Comment 13 László Németh 2014-03-17 09:40:19 UTC
Created attachment 95918 [details]
Bash script to test odts for zero byte files

This is for a quick test using locate to test all ODTs of the system. The script lists the invalid test files of xmlsecurity and qa, too:

 1 /home/laci/git/libo/sfx2/qa/complex/sfx2/testdocuments/CUSTOM.odt: mimetype     
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_doc1.odt: Pictures/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro1.odt: Basic/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro2.odt: Basic/Standard/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro3.odt: Basic/New Folder/New Text Document.txt  
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo2_x_macro4.odt: Basic/Standard/New Folder/New Text Document.txt  
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo3_2_doc1.odt: META-INF/New Text Document.txt   
      1 /home/laci/git/libo/xmlsecurity/test_docs/documents/invalid_ooo_3_2_doc2.odt: New Text Document.txt   
      1
Comment 14 Robinson Tryon (qubit) 2015-12-14 05:01:17 UTC Comment hidden (obsolete)
Comment 15 Robinson Tryon (qubit) 2016-02-18 14:51:35 UTC Comment hidden (obsolete)
Comment 16 Hossein 2022-11-24 16:31:14 UTC
Re-evaluating the EasyHack in 2022

This issue is still relevant. I have created a blank file with the latest LO 7.5 dev master, and the empty directories are still there.

As a fix was merged before, and caused a regression (see comment 8), extensive tests should be done to make sure that we do not break things including the backward compatibility.

Version: 7.5.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 4dc372c87b194c32597eda6a7f99ddcee3108fa6
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

$ unzip -l ~/blank.odt 
Archive:  /home/hossein/blank.odt
  Length      Date    Time    Name
---------  ---------- -----   ----
       39  2022-11-24 16:13   mimetype
        0  2022-11-24 16:13   Configurations2/accelerator/
        0  2022-11-24 16:13   Configurations2/images/Bitmaps/
        0  2022-11-24 16:13   Configurations2/toolpanel/
        0  2022-11-24 16:13   Configurations2/progressbar/
        0  2022-11-24 16:13   Configurations2/floater/
        0  2022-11-24 16:13   Configurations2/statusbar/
        0  2022-11-24 16:13   Configurations2/toolbar/
        0  2022-11-24 16:13   Configurations2/popupmenu/
        0  2022-11-24 16:13   Configurations2/menubar/
      899  2022-11-24 16:13   manifest.rdf
      865  2022-11-24 16:13   meta.xml
     3419  2022-11-24 16:13   content.xml
    13400  2022-11-24 16:13   settings.xml
    12121  2022-11-24 16:13   styles.xml
      297  2022-11-24 16:13   Thumbnails/thumbnail.png
     1061  2022-11-24 16:13   META-INF/manifest.xml
---------                     -------
    32101                     17 files
Comment 17 David Hashe 2025-03-26 22:40:04 UTC
I'm interested in working on this, so I have assigned it to myself. As a first step, I wrote a regression test that would have caught the regression that occurred: https://gerrit.libreoffice.org/c/core/+/183356

I would appreciate advice on any other tests I should do or write.
Comment 18 Michael Meeks 2025-03-27 09:14:38 UTC
I love the approach of first writing the regression tests - thanks David ! =)
Comment 19 David Hashe 2025-04-25 15:28:53 UTC
Apologies for the delay on this.

So, thinking about this issue some more, there are four sub-problems to be solved:

1. Remove the subdirectories created directly by UIConfigurationManager
   - Configurations2/popupmenu
   - Configurations2/toolpanel/
   - Configurations2/statusbar/
   - Configurations2/progressbar/
   - Configurations2/toolbar/
   - Configurations2/menubar/
   - Configurations2/floater/
2. Remove the subdirectories created by ImageManager:
   - Configurations2/images/Bitmaps/
3. Remove the subdirectories created by DocumentAcceleratorConfiguration:
   - Configurations2/accelerator/
4. Remove the base directory:
   - Configurations2

(1), (2), and (3) are all doable. I don't think that (4) is doable.

(4) is not doable because UIConfigurationManager is passed an XStorage that represents Configurations2/ through its setStorage method, which corresponds to the setStorage method of the XUIConfigurationStorage interface. This XStorage is already opened to the Configurations2/ path as readwrite, so the Configurations2/ directory is already created before the UIConfigurationManager is instantiated.

We really need the UIConfigurationManager to open the Configurations2/ XStorage itself, because it needs to be able to open it first as readonly and then later re-open it as readwrite if it needs to be written.

So, in order to fix (4), we would need to change the behavior of the XUIConfigurationManager2 interface.

Currently, the XUIConfigurationManager2 object directly uses the XStorage that was passed to it, so the caller should pass in an XStorage representing the Configurations2/ path. We would need the XUIConfigurationManager2 object to open the Configurations2/ path on the XStorage that was passed to it, so the caller would pass in an XStorage representing the root path.

This is an externally visible change in the behavior of a UNO interface, so it would have backwards compatibility implications. I don't know if, in practice, anyone has ever written an extension or other program using the UNO API that manually instantiates a UIConfigurationManager, but changing the behavior of the interface would break any such programs.

So, I do not think that (4) is doable.

Thankfully, (1), (2), and (3) are all doable because they all involve XStorages that are opened within classes that we control. We don't break any behavior by opening them first as readonly and then re-opening them as readwrite the first time that we need to store.

My plan is to create three separate patches, one each for (1), (2), and (3). I already submitted a test for (1), but I will also create tests for (2) and (3) as another patch. They involve different classes and different code changes, so different tests are warranted.
Comment 20 David Hashe 2025-04-26 00:41:39 UTC
The patch providing tests for (2) and (3): https://gerrit.libreoffice.org/c/core/+/184646

The patch implementing (1): https://gerrit.libreoffice.org/c/core/+/184655
Comment 21 Commit Notification 2025-04-29 06:20:14 UTC
David Hashe committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/b9054fed3724a19479e221a8b4818a127a18aabe

tdf#60700 Unit tests for saving image icons and accelerators

It will be available in 25.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.