Currently (since 5.0.4), when LO opens an XML-format (ODF, OOXML) file that contains errors (duplicated attributes, data past the body, etc.), a "SAXParseException: ..." error is shown to the user and the file isn't opened.
This is the result of the better error detection/handling introduced by commit ebf767eeb2a169ba533e1b2ffccf16f41d95df35, which has allowed us to detect and fix quite a number of errors. However, it is a real problem for end users, who can no longer open corrupted files that could be opened previously. This leads, e.g., to HOWTOs on Ask LibreOffice that describe using AOO as the correct way to open those files.
I suggest changing the current logic as follows:
If a SAXParseException is raised during XML parsing, display something like this to the user:
"This file is corrupted (<Here goes current SAXParseException message>). LibreOffice may try to recover as much as possible, but be prepared that some information can be damaged or lost. Do you want to proceed?"
If the user says "yes", parsing continues as it did before ebf767eeb2a169ba533e1b2ffccf16f41d95df35. Any subsequent SAXParseException is handled as if the user had answered "yes" each time.
If an unrecoverable exception is encountered, then, of course, show the message "File cannot be recovered".
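To make the proposed flow concrete, here is a rough sketch in Python. It is not LibreOffice code: `RecoveryPrompt`, `load_document`, `UnrecoverableError`, and the `parse(on_error)` callback interface are all hypothetical names chosen for illustration. It only models the three rules above: prompt once, treat later errors as already confirmed, and report an unrecoverable failure.

```python
class UnrecoverableError(Exception):
    """Hypothetical: raised by the parser when it cannot continue at all."""


PROMPT = (
    "This file is corrupted ({error}). LibreOffice may try to recover as much "
    "as possible, but be prepared that some information can be damaged or "
    "lost. Do you want to proceed?"
)


class RecoveryPrompt:
    """Asks the user once, then reuses the answer for later parse errors."""

    def __init__(self, ask_user):
        self._ask_user = ask_user  # callable: message -> bool (assumed UI hook)
        self._answer = None        # None until the user has been asked

    def should_continue(self, error):
        if self._answer is None:
            self._answer = bool(self._ask_user(PROMPT.format(error=error)))
        return self._answer


def load_document(parse, ask_user):
    """Run a hypothetical parser that reports each recoverable error via a callback.

    `parse(on_error)` is assumed to call `on_error(message)` for every
    recoverable error, to stop and return None if the callback returns False,
    and to raise UnrecoverableError when it cannot go on.
    """
    prompt = RecoveryPrompt(ask_user)
    try:
        return parse(prompt.should_continue)
    except UnrecoverableError:
        return "File cannot be recovered"
```

With a parser that reports two errors, the user is asked only once, and the first error message is the one embedded in the prompt. If the user declines, the sketch returns `None`, standing in for "file not opened".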
I suppose that the nature of the message is clear enough, and users will continue to file reports about such problems (esp. if the problematic file was previously generated using LO).
I've seen more reports like that, so it would be nice if LO at least opened the file and recovered as much as possible.
A patch has been submitted to Gerrit: https://gerrit.libreoffice.org/33181
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":
tdf#104718: Prompt user to continue on SAXException
It will be available in 5.4.0.
The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
Affected users are encouraged to test the fix and report feedback.