Bug 57769 - FILEOPEN: .docx file considered as damaged if using .doc extension
Status: RESOLVED DUPLICATE of bug 59426
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
Duplicates: 60928
Reported: 2012-12-01 12:15 UTC by Milan Bouchet-Valat
Modified: 2013-11-16 19:03 UTC (History)

Empty .docx file incorrectly ending with .doc (6.11 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2012-12-01 12:15 UTC, Milan Bouchet-Valat

Description Milan Bouchet-Valat 2012-12-01 12:15:40 UTC
Created attachment 70866
Empty .docx file incorrectly ending with .doc

Sometimes .docx files are given a .doc extension. I found several such files on a website, though I don't understand why. In that case LO and (tested on Linux for the first, on Mac OS X for the second) consider the file damaged, ask whether to attempt to repair it, and eventually fail to open it. Users can then only blame LO, because the file appears to work with MS Word.

Would it be possible to do a basic detection during the attempt at fixing the file, or even before? Document formats have magic sequences that allow the 'file' UNIX command to detect them. This would make LO much more robust. Many applications do not even use the extension for anything, and detect the format from the contents of the file.
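
To illustrate the kind of basic content detection meant here, below is a minimal sketch in Python. It is not LO code and makes no claim about how LO's type detection works. The function name sniff_word_format is hypothetical, and treating "word/document.xml" as the marker for a .docx package is a heuristic assumption (it is the conventional main part name, not a guaranteed one). The two byte signatures are the well-known ones for OLE2 compound files (legacy .doc) and ZIP archives (the container used by .docx).

```python
import zipfile

# Signature of an OLE2 compound file, the container used by legacy .doc.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature at the start of a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_word_format(path):
    """Guess the Word format from file contents, ignoring the extension.

    Returns "doc", "docx", "zip" (a ZIP that does not look like .docx),
    or "unknown". Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        header = f.read(8)

    if header.startswith(OLE2_MAGIC):
        return "doc"

    if header.startswith(ZIP_MAGIC) and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            names = set(zf.namelist())
        # Assumption: a .docx package has a content-types part and the
        # conventional main document part.
        if "[Content_Types].xml" in names and "word/document.xml" in names:
            return "docx"
        return "zip"

    return "unknown"
```

With a check like this, a file named report.doc whose contents are a ZIP package would be reported as "docx", so the loader could pick the OOXML import filter before ever reaching the repair dialog.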

Attached is an empty .docx file incorrectly ending with .doc to illustrate the problem.
Comment 1 Urmas 2013-02-16 09:16:22 UTC
*** Bug 60928 has been marked as a duplicate of this bug. ***
Comment 2 Milan Bouchet-Valat 2013-02-16 12:02:45 UTC
I've identified at least two separate sources of .docx files incorrectly named .doc, plus one in the duplicate report. I think this is a common mistake that will lead LO users to think it does not handle some Word documents at all.
Comment 3 Andras Timar 2013-03-19 20:55:01 UTC

*** This bug has been marked as a duplicate of bug 54949 ***
Comment 4 Erik P. Olsen 2013-03-19 22:14:26 UTC
I think there is more to it than what bug 54949 is about. Even if I rename the file in question to have the .docx extension, the output is not correct in LO. If you reopen this bug report, I can attach a PDF file showing how the file is actually supposed to look according to MS Office.
Comment 5 Andras Timar 2013-03-20 11:41:00 UTC

*** This bug has been marked as a duplicate of bug 59426 ***