Created attachment 70866
Empty .docx file incorrectly ending with .doc
Some .docx files are distributed with a .doc extension. I found several such files on a website, though I don't understand why. In that case LO 188.8.131.52 and 184.108.40.206 (tested on Linux for the first, on Mac OS X for the second) consider the file damaged, ask whether to attempt a repair, and eventually fail to open it. Users can then only blame LO, because the file appears to work with MS Word.
Would it be possible to do basic format detection during the repair attempt, or even before it? Document formats have magic sequences that allow the UNIX 'file' command to detect them. This would make LO much more robust. Many applications ignore the extension entirely and detect the format from the contents of the file.
Attached is an empty .docx file incorrectly ending with .doc to illustrate the problem.
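To illustrate the kind of content-based detection suggested above, here is a minimal sketch in Python. It is not how LO detects formats; the function name sniff_word_format and its return labels are made up for this example. It assumes that a legacy .doc is an OLE2 compound file starting with the bytes D0 CF 11 E0 A1 B1 1A E1, and that a .docx is a ZIP archive containing "[Content_Types].xml" and, by convention, "word/document.xml".

```python
import io
import zipfile

# Signature of an OLE2 compound file, the container used by legacy .doc.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature at the start of a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_word_format(data: bytes) -> str:
    """Guess the container format from the bytes alone (hypothetical helper).

    Returns "ole2", "docx", "zip" or "unknown". The file name and its
    extension are deliberately not consulted.
    """
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                names = set(archive.namelist())
        except zipfile.BadZipFile:
            return "unknown"
        # Assumption: a Word OOXML package carries these two entries.
        # The main part is only conventionally named word/document.xml.
        if "[Content_Types].xml" in names and "word/document.xml" in names:
            return "docx"
        return "zip"
    return "unknown"
```

With a check like this, a file named foo.doc whose contents return "docx" could be handed to the OOXML import path instead of being reported as damaged.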
*** Bug 60928 has been marked as a duplicate of this bug. ***
I've identified at least two separate sources of .docx files incorrectly named .doc, plus one in the duplicate report. I think this is a common mistake that will lead LO users to think it cannot handle some Word documents at all.
*** This bug has been marked as a duplicate of bug 54949 ***
I think there is more to it than what bug 54949 is about. Even if I rename the file in question to have the extension .docx, the output is not correct with LO. If you reopen this bug report, I can attach a PDF file showing what the file is actually supposed to look like according to MS Office.
*** This bug has been marked as a duplicate of bug 59426 ***