While creating my own flat spreadsheet file, based on files saved by LibreOffice and Google Spreadsheet, I realised it would save a lot of bytes to add a default namespace to the <table:table> element and remove all the table: prefixes from this element and its children, i.e. go from this:

  <table:table table:name="" table:style-name="">
    <table:table-column table:style-name="" table:default-cell-style-name="" table:number-rows-repeated="" table:number-columns-repeated=""/>
    <table:table-row table:style-name="">
      <table:table-cell table:style-name=""/>
    </table:table-row>
  </table:table>

to this:

  <table name="" style-name="" xmlns="urn:oasis:names:tc:opendocument:xmlns:table:1.0">
    <table-column style-name="" default-cell-style-name="" number-rows-repeated="" number-columns-repeated=""/>
    <table-row style-name="">
      <table-cell style-name=""/>
    </table-row>
  </table>

Such optimisations are surely also available to other document types, not just spreadsheets.

I also set the style namespace as the default namespace on the office:styles, office:automatic-styles and office:master-styles elements, then removed the style: prefixes from element names and attributes. This brought a simple (14 cols, 15 rows) file down from 29KB to 24KB. Since I will be generating these files and serving them over the internet for our business application, this 17% reduction (before compression) is fairly significant.

For bonus credit, since the office: and text: namespaces are necessary on all and most table cells respectively, shortening those namespace prefixes to "o" and "t" would bring the file size down further.
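To make the proposal concrete, here is a minimal sketch of how a namespace-aware parser sees the three spellings discussed above: the original table: prefix, a shortened prefix, and a default namespace. It uses Python's standard xml.etree.ElementTree; the attribute values "Sheet1" and "ro1" and the helper name expanded_names are made up for illustration and do not come from any real file. One thing the sketch illustrates is that, under the Namespaces in XML rules, a default namespace declaration applies to element names but not to unprefixed attributes, so the default-namespace form is not name-for-name identical to the prefixed one. Whether that has anything to do with how a particular consumer handles such files is not something this sketch can tell.

```python
import xml.etree.ElementTree as ET

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"

# Original form: elements and attributes carry the table: prefix.
# "Sheet1" and "ro1" are placeholder values for illustration only.
prefixed = (
    f'<table:table xmlns:table="{TABLE_NS}" table:name="Sheet1">'
    '<table:table-row table:style-name="ro1"/>'
    '</table:table>'
)

# Same content with the prefix shortened to "t".
short_prefix = (
    f'<t:table xmlns:t="{TABLE_NS}" t:name="Sheet1">'
    '<t:table-row t:style-name="ro1"/>'
    '</t:table>'
)

# Default-namespace form: no prefixes on elements or attributes.
default_ns = (
    f'<table xmlns="{TABLE_NS}" name="Sheet1">'
    '<table-row style-name="ro1"/>'
    '</table>'
)


def expanded_names(xml_text):
    """Return (element names, attribute names) in document order,
    using ElementTree's "{namespace-uri}local-name" notation."""
    root = ET.fromstring(xml_text)
    elements = [el.tag for el in root.iter()]
    attributes = [name for el in root.iter() for name in el.attrib]
    return elements, attributes


p_el, p_at = expanded_names(prefixed)
s_el, s_at = expanded_names(short_prefix)
d_el, d_at = expanded_names(default_ns)

print("same element names in all three forms:", p_el == s_el == d_el)
print("short prefix keeps attribute names:   ", p_at == s_at)
print("default namespace keeps attribute names:", p_at == d_at)
print("attributes, prefixed form:   ", p_at)
print("attributes, default-ns form: ", d_at)
```

Going by this, shortening prefixes looks like a purely cosmetic change to a namespace-aware reader, while a variant of the default-namespace idea that keeps the prefix on attributes (and drops it only from element names) would preserve the expanded names exactly, at the cost of some of the byte savings.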
Not such a good idea after all. Microsoft Excel cannot read documents with default namespaces (and maybe even documents where the namespace prefix is not the usual string). I suspect they are not using a real XML parser :-( The error "Excel found unreadable content in 'filename'. Do you want to recover the contents of this workbook?" is shown. Accepting the offer to recover the document's contents strips out all formatting and styles.

I think Excel compatibility should be a higher priority than file size. Perhaps there could be two FILESAVE codepaths: a 'pure' one that adheres to the specs and writes sexy XML, and another that emits 'MS Office compatible' OpenDocument files (bug #53998 filed requesting such). This feature request would then apply to the sexy code path only.
In order to limit the confusion between ProposedEasyHack and EasyHack, and to make queries much easier, we are changing ProposedEasyHack to NeedsDevEval. Thank you, and apologies for the noise.
(In reply to Nicholas Shanks from comment #1)
> Microsoft Excel cannot read documents with default namespaces (and maybe
> even documents where the namespace is not the normal string). I suspect they
> are not using a real XML parser :-(

Well that's unfortunate!

> I think Excel compatibility should be a higher priority than file size.

True, but I think the devs are going to be reluctant to maintain "two FILESAVE codepaths". Some alternative ideas:

1) Perhaps the most recent version of MS-Office uses a real XML parser. Knock on wood!
2) Microsoft attends ODF Plugfests (e.g. http://plugfest.opendocumentformat.org/2014-london/), so concerns such as this one could definitely be raised.

Let's test again w/MS-Office, and see where we stand. In any case, it's a neat idea for a space-saving enhancement. Let's change Status -> NEW
(In reply to Nicholas Shanks from comment #1)
> Microsoft Excel cannot read documents with default namespaces (and maybe
> even documents where the namespace is not the normal string). I suspect they
> are not using a real XML parser :-(

Could you please upload a couple of sample files? It would be great to have the "original" file (without default namespace) alongside the file that uses a default namespace, so we can check them side-by-side. Thanks!
Migrating Whiteboard tags to Keywords: (needsDevEval difficultyBeginner) [NinjaEdit]
Sorry for the delay on this one: I left the relevant company, no longer need to use/output OpenOffice files, and have been ignoring related emails. If you want to send me any current XML file, I'll perform some manual optimisation of the XML and send it back. It's nothing that couldn't be done by any XML-savvy coder though.