Created attachment 124848
Corrupted file

I sometimes help people on a French forum (I'm French), and I often see people report corrupted files. The content.xml gets corrupted by an unknown issue. The only way I know to fix it is the following.

First, let's call our non-opening ODT file "bad.odt".

1- make a backup FIRST -> "$ cp bad.odt bad_original.odt"
2- make a new directory -> "$ mkdir repair"
3- copy bad.odt to the repair directory -> "$ cp bad.odt repair"
4- change the current directory to repair -> "$ cd repair"
5- unzip bad.odt -> "$ unzip bad.odt"
6- after unzipping you get a bunch of files and directories under repair; find content.xml and open it with your favorite text editor -> "$ kate content.xml"
7- use the "find" function to check whether you have the XML tag "<office:automatic-styles>" (somewhere at the beginning of the document) and the XML tag "</office:automatic-styles>" (somewhere in the middle of the document). If you have them, delete them and all data between them. Be sure that you don't delete more or less!
8- save content.xml (keep the original name and place!)
9- zip the extracted data back into one ODT document -> "$ zip -r ./bad_repaired.odt ./*"
10- try to open the repaired document -> "$ ooffice ./bad_repaired.odt"

Solution found here: https://forum.openoffice.org/en/forum/viewtopic.php?t=1532

When the software detects an error in content.xml, could it offer to fix it this way by itself?

I have attached a corrupted document.
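For readers who want to script steps 5 to 9 rather than edit by hand, here is a minimal Python sketch. The function names `strip_automatic_styles` and `repair_odt` are made up for this illustration and are not part of any LibreOffice tooling. It uses a regular expression instead of an XML parser on the assumption that the corrupted content.xml is not well-formed, and it writes the "mimetype" entry first and uncompressed, as ODF packages expect. Like the manual procedure, it discards everything inside the automatic-styles element.

```python
import re
import zipfile

# Matches the whole automatic-styles element (also an empty, self-closed one).
AUTOMATIC_STYLES = re.compile(
    rb"<office:automatic-styles\b[^>]*?(?:/>|>.*?</office:automatic-styles>)",
    re.DOTALL,
)


def strip_automatic_styles(content: bytes) -> bytes:
    """Step 7: drop the automatic-styles element and everything inside it."""
    return AUTOMATIC_STYLES.sub(b"", content, count=1)


def repair_odt(src: str, dst: str) -> bool:
    """Write a copy of src to dst with automatic styles removed from content.xml.

    Returns True if an automatic-styles element was found and removed.
    src is only read, never modified, so it serves as the backup from step 1.
    """
    changed = False
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        infos = zin.infolist()
        # ODF expects "mimetype" to be the first entry, stored uncompressed.
        infos.sort(key=lambda info: info.filename != "mimetype")
        for info in infos:
            data = zin.read(info.filename)
            if info.filename == "content.xml":
                stripped = strip_automatic_styles(data)
                changed = stripped != data
                data = stripped
            if info.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            zout.writestr(info.filename, data, compress_type=method)
    return changed
```

Usage would be something like `repair_odt("bad.odt", "bad_repaired.odt")`, followed by opening bad_repaired.odt as in step 10.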
It needs to be proven that this is an effective and useful utility, but if so, it seems reasonable. I can imagine a number of "repair" utilities for correcting corrupt ODF that could be provided from the GUI:

Tools -> ODF -> Repairs
Tools -> ODF -> Validation
Tools -> ODF -> Conversion
I'm not asking for utilities, only for a suggestion shown with the error message, because this message appears only for this problem and not for other kinds of corruption. The error message is:

Read-Error.
Format error discovered in the file in sub-document content.xml at 2,78898(row,col).

The row and column at the end differ from one file to another, but the error is always fixed by the above solution. All I propose is to put the repair suggestion in the message box.
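To see what the parser is complaining about, the position reported in the message can be used to look at the surrounding text of content.xml. The following is a small Python sketch; the helper names `parse_read_error` and `context_at` are hypothetical, and it assumes that the reported row and column are 1-based, which the message itself does not state.

```python
import re

# Pulls the sub-document name, row and column out of the Read-Error text.
ERROR_POSITION = re.compile(r"sub-document (\S+) at (\d+),(\d+)\(row,col\)")


def parse_read_error(message):
    """Return (sub_document, row, col), or None if the text does not match."""
    match = ERROR_POSITION.search(message)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))


def context_at(xml_text, row, col, width=40):
    """Return up to `width` characters on each side of the reported position.

    Assumption: row and col are 1-based.
    """
    lines = xml_text.splitlines()
    if not 1 <= row <= len(lines):
        return ""
    line = lines[row - 1]
    start = max(0, col - 1 - width)
    return line[start:col - 1 + width]
```

With the message quoted above, `parse_read_error` yields the sub-document name together with row 2 and column 78898, and `context_at` can then be applied to the text of the extracted content.xml.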
(In reply to shunesburg69 from comment #0)
> 7- use "find" function to find out, if you have XML tag
> "<office:automatic-styles>" (somewhere at the beginning of document) and XML
> tag "</office:automatic-styles>" (somewhere, middle of document). If you
> have, then delete them and all data between them. Be sure, that you don't
> delete more or less!

> There is a possibility, when the soft detect an error in the content.xml to
> propose to fix it like that by itself ?

But by removing styles completely you will lose all formatting! Why not just remove the duplicate attributes? You can even do that automatically with tools like tidy. In addition - this only works if the problem is in styles section, but what if there is some problem in the document contents? So this is a bad idea in general, and I don't think that we should suggest such wrong things to users.

> I put a document corrupted in attachment.

BTW the corruption in this document is the one that was fixed in Bug 96147.
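As an illustration of removing only the duplicate attributes instead of the whole styles section, here is a rough Python sketch. It is not how tidy works internally. The function name `drop_duplicate_attributes` is invented for this example, keeping the first occurrence of a repeated attribute is an arbitrary choice, and the regular expressions assume that attribute values are properly quoted.

```python
import re

NAME = r"[\w:.-]+"
VALUE = r"""(?:"[^"]*"|'[^']*')"""
ATTRIBUTE = re.compile(rf"({NAME})\s*=\s*({VALUE})")
# A start tag or empty-element tag: name, attribute list, optional "/".
START_TAG = re.compile(rf"<({NAME})((?:\s+{NAME}\s*=\s*{VALUE})*)\s*(/?)>")


def drop_duplicate_attributes(xml_text):
    """Keep only the first occurrence of each attribute name within a tag."""

    def fix_tag(match):
        name, attrs, slash = match.groups()
        seen = set()
        kept = []
        for attr_name, value in ATTRIBUTE.findall(attrs):
            if attr_name not in seen:
                seen.add(attr_name)
                kept.append(f" {attr_name}={value}")
        return f"<{name}{''.join(kept)}{slash}>"

    return START_TAG.sub(fix_tag, xml_text)
```

Unlike deleting the automatic styles, this keeps every style definition and only drops the repeated attribute. It also normalizes whitespace between attributes, and it does nothing for corruption of any other kind.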
No, the formatting stays; only the automatic styles are removed, and most files don't change at all after the repair process.
Created attachment 124912
repaired with tidy

(In reply to shunesburg69 from comment #4)
> No, the formatting stay here just the automatic styles are removed,

But most of the formatting is stored in the automatic styles. I'm attaching a file repaired with tidy (by simply running "tidy -m -xml content.xml") - compare it with the same file repaired with your method. It's hard to not notice the difference...

> but the
> major part of files don't change at all after the repair process.

And yet - why _remove_ valuable data, when it can be easily repaired?
(In reply to Maxim Monastirsky from comment #5)
> And yet - why _remove_ valuable data, when it can be easily repaired?

I just proposed what I tried, but if you have a better way, that's fine with me.
Maxim's proposal handles the problem more cautiously. But whether to tidy or to delete, the question boils down to an integrated repair function. We had better ensure that this kind of corruption does not happen in the first place - until then, a repair function is better implemented as an extension.