Bug 120822 - DOCX files created as New Microsoft Word Document in Windows Explorer context menu are corrupted if first-time saved with Libre
Summary: DOCX files created as New Microsoft Word Document in Windows Explorer context...
Status: RESOLVED DUPLICATE of bug 123476
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.0.2.1 release
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-23 08:30 UTC by Mark
Modified: 2021-04-27 06:06 UTC

See Also:
Crash report or crash signature:


Attachments
New .docx created through Windows context menu, edited and then saved from Libre (15 bytes, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2018-10-23 08:32 UTC, Mark
Image of the document I created in Libre and saved (as the attached .docx) (22.82 KB, image/png)
2018-10-23 08:33 UTC, Mark
Zero-byte file created from Windows Explorer context menu (zipped because you can't upload empty files!) (126 bytes, application/zip)
2018-10-23 08:34 UTC, Mark

Description Mark 2018-10-23 08:30:26 UTC
Description:
If you create a new DOCX file using the New Microsoft Word Document context menu option in Windows Explorer (the file will be zero bytes long), open it in LibreOffice, create or edit a document and save it, then on reopening ALL you will have is a completely unformatted ASCII file. All other information will be lost, and non-ASCII (UTF-8) characters will be displayed as ???. This file will not even open in MS Word.

I realise this is not the best workflow, but it can result in considerable loss of data if somebody (like me) did it inadvertently, because a client insisted on use of DOCX etc.

Steps to Reproduce:
1. Right-click in Windows Explorer and select New -> Microsoft Word Document (.docx)
2. Open this file in LibreOffice.
3. Create a document with formatting of any kind (e.g. insert a text-box, use different font sizes/fonts etc.)
4. Save with CTRL+S
5. Close the document and re-open in Libre
6. Try opening in MS Word

Actual Results:
File reopens in Libre with the appearance of a plain text file, no non-ASCII characters, all formatting and other features stripped out.

File reported as corrupted by MS Word and cannot be recovered.

Expected Results:
File should have opened in Libre and MS Word with all formatting and features intact.


Reproducible: Always


User Profile Reset: Yes


OpenGL enabled: Yes

Additional Info:
[Information automatically included from LibreOffice]
Locale: en-GB
Module: TextDocument
[Information guessed from browser]
OS: Windows (All)
OS is 64bit: no

Version: 6.0.2.1 (x64)
Build ID: f7f06a8f319e4b62f9bc5095aa112a65d2f3ac89
CPU threads: 4; OS: Windows 10.0; UI render: GL; 
Locale: en-GB (en_GB); Calc: CL
Comment 1 Mark 2018-10-23 08:32:54 UTC
Created attachment 145919
New .docx created through Windows context menu, edited and then saved from Libre
Comment 2 Mark 2018-10-23 08:33:38 UTC
Created attachment 145920
Image of the document I created in Libre and saved (as the attached .docx)
Comment 3 Mark 2018-10-23 08:34:55 UTC
Created attachment 145921
Zero-byte file created from Windows Explorer context menu (zipped because you can't upload empty files!)
Comment 4 Mike Kaganski 2018-10-23 08:47:50 UTC
This is not a bug.

A file created by the "New Microsoft Word Document" context menu option in Windows Explorer is an invalid DOCX. It isn't a zip with a proper directory structure and XMLs, it's just an empty file. The context menu option is presumably added by MS Office (at least it's absent on a newly-installed Win8.1, and exists on my Win10 with MS Office installed).
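
The difference between a real DOCX and the empty file can be checked outside any office suite. Below is a minimal Python sketch, assuming only that a valid DOCX is a zip package containing `[Content_Types].xml` and `word/document.xml`. The function name `looks_like_docx` is made up for illustration, and the check is far looser than what Word or LibreOffice actually do:

```python
import zipfile


def looks_like_docx(path):
    """Return True if `path` is a zip archive holding the basic parts of a Word document."""
    # A zero-byte file (as created by the Explorer context menu) is not a zip at all.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as package:
        names = set(package.namelist())
    # Hypothetical minimal check: real validators inspect much more than these two parts.
    return {"[Content_Types].xml", "word/document.xml"} <= names
```

Calling `looks_like_docx` on the zero-byte file from the context menu should return `False`, because the file is not a zip archive in the first place.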

Opening any file in LibreOffice involves detection of its format, not only by checking its file extension (which is often misleading), but also by checking its internal structure. For the empty "DOCX" in this case, the only possible detection is plain text, which is naturally remembered and used later on save. And when you save the file for the first time, you are presented with a warning message stating the detected format:

> ====
> Confirm File Format
> ====
> This document may contain formatting or content that cannot be saved
> in the currently selected file format “Text”.
> 
> Use the default ODF file format to be sure that the document is saved
> correctly.
> 
> [x] Ask when not saving in ODF or default format
> 
>       [Use Text format]  [Use ODF Format]
> ====

Note the "Use Text format" button, which is different from what you see when the file is indeed a proper DOCX, or when you choose DOCX manually:

      [Use Word 2007-2019 format]  [Use ODF Format]

So, while it's a pity that MS Office uses this strange technique to create invalid files instead of creating proper blank documents (compare to what happens when you choose to create a new ODT), it's not something that LibreOffice should change, or else it would result in massive regressions in other areas. Just pay attention to the detected format, and press Esc in the warning dialog to choose the desired output format.
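
The idea of detecting a format from content rather than from the extension can be illustrated with a toy sniffer. This is only a sketch and not LibreOffice's actual detection code: the function name `sniff_format` and the two return labels are hypothetical, and the only fact relied on is that zip-based packages such as DOCX normally begin with the zip local file header signature `PK\x03\x04`.

```python
def sniff_format(path):
    """Guess a coarse format from the file's first bytes, ignoring its extension."""
    with open(path, "rb") as f:
        header = f.read(4)
    # Zip local file header signature; DOCX and ODT packages normally start with it.
    if header == b"PK\x03\x04":
        return "zip package"
    # Nothing recognisable, including a zero-byte file: fall back to plain text.
    return "text"
```

With this kind of logic a zero-byte file named `something.docx` ends up classified as text, which mirrors why the save dialog offers "Use Text format" in this case.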
Comment 5 Mark 2018-10-23 09:10:11 UTC
Aha, I understand - I believe that the Confirm File Format warning is something you can opt not to display so I probably dismissed it a long time ago. I always understood it to primarily refer to Libre's inability to fully comply with the OOXML format, which is quite understandable and something I could live with (assuming it is talking about a few glitches here and there), which is why I dismissed that warning. It never occurred to me it wasn't creating the full DOCX zip file at all.

I wonder if that is something that could be subject to a different warning like:

WARNING - your file contains formatting but is being saved as a PLAIN TEXT FILE and all formatting WILL be lost!

I think I might sit up and take notice of that!

I get that it really shouldn't be a priority for Libre to work around Office idiosyncrasies...
Comment 6 Mike Kaganski 2018-10-23 09:19:26 UTC
(In reply to Mark from comment #5)
> I believe that the Confirm File Format warning is
> something you can opt not to display so I probably dismissed it a long time
> ago.

Options→Load/Save→General: Warn when not saving in ODF or default format

> I wonder if that is something that could be subject to a different warning
> like:
> 
> WARNING - your file contains formatting but is being saved as a PLAIN TEXT
> FILE and all formatting WILL be lost!

Well - the message is about like that:

"This document may contain formatting or content that cannot be saved in the currently selected file format “Text”."
Comment 7 Mark 2018-10-23 09:23:41 UTC
Oh, you're right, thanks - I never noticed that part because that wasn't the typical reason the warning popped up (but rather because I was trying to save .odt as .docx), so I dismissed the warning. Not sure I would have noticed the part where it states the file type but I guess that's my lookout. Have turned it back on now, though probably won't need it ever again.

Case closed.
Comment 8 Mike Kaganski 2020-05-19 11:34:18 UTC
*** Bug 133164 has been marked as a duplicate of this bug. ***
Comment 9 Mike Kaganski 2021-04-27 06:06:45 UTC

*** This bug has been marked as a duplicate of bug 123476 ***