Currently all empty (0-byte) files are detected as HTML/Web files (this changed between 6.1.0.3 and 6.2.0.3; until then they were detected as text files). Since there is no other identifying information, they should be detected based on extension. The current behavior is a problem because, on Windows with MS Office installed, right click -> New -> <various MS Office formats> tends to create 0-byte files, and opening and editing them in LibreOffice can cause confusion and potential data loss if the user doesn't notice the wrong format before saving their document.
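A minimal sketch of the requested behavior, assuming a simple extension lookup. This is not LibreOffice code: the function name `pick_filter`, the table `EXTENSION_FILTERS`, the filter labels and the "Text" fallback are all hypothetical placeholders chosen for illustration.

```python
import os.path

# Hypothetical extension-to-filter table. The labels are placeholders,
# not real LibreOffice filter names.
EXTENSION_FILTERS = {
    ".docx": "DOCX",
    ".doc": "DOC",
    ".xlsx": "XLSX",
    ".pptx": "PPTX",
    ".odt": "ODT",
}

# Assumed fallback for empty files with an unknown or missing extension.
FALLBACK_FILTER = "Text"


def pick_filter(filename, size_in_bytes, sniff_content=None):
    """Choose a filter label for a file.

    A 0-byte file has no content to inspect, so only the extension is used.
    Non-empty files are passed to the caller-supplied content detector.
    """
    if size_in_bytes == 0:
        extension = os.path.splitext(filename)[1].lower()
        return EXTENSION_FILTERS.get(extension, FALLBACK_FILTER)
    if sniff_content is None:
        raise ValueError("non-empty files need a content-based detector")
    return sniff_content(filename)
```

With this sketch, `pick_filter("New Microsoft Word Document.docx", 0)` yields the DOCX label, so a later save would target that format rather than plain text or HTML.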
Well - then not only detection should be changed, but also something needs to be done with document initialization as well - because *reading* from such a file using detected filter would be impossible.
See also: https://ask.libreoffice.org/en/question/179707 https://ask.libreoffice.org/en/question/178368
Regarding the initialization: should something specific be done with the new document depending on the filter? E.g., for a 0-byte .docx, should we simply create a new Writer document (using the default template) and set its filter to DOCX, or should we initialize it as if it were a DOCX, which would mean different default fonts, compatibility options, etc. (whatever is done in the DOCX importer when a valid DOCX is imported, before actual DOCX data is read)? Should all filters be modified to be able to do that then? Would that require separate default templates for all filters, or should the one default template for the module be used anyway, with filter-specific modifications applied (with a risk that the resulting new document differs from the template as used in normal new document creation)?
(In reply to Mike Kaganski from comment #1)
> Well - then not only detection should be changed, but also something needs
> to be done with document initialization as well - because *reading* from
> such a file using detected filter would be impossible.

Sure, the point is not to read from an empty file, but to correctly initialize one.
(In reply to Mike Kaganski from comment #3)
> Regarding the initialization: should something specific be done with the new
> document depending on the filter? E.g., for a 0-byte .docx, should we simply
> create a new Writer document (using default template) and set its filter to
> DOCX, or should we initialize it as if that is a DOCX - which would mean
> different default fonts, compatibility options, etc. (whatever is done in
> DOCX importer when a valid DOCX is imported, before actual DOCX data is
> read)? Should all filters be modified to be able to do that then? Would that
> require to have own default templates for all filters, or should the one
> default template for the module be used anyway, with application of
> filter-specific modifications (with a risk of the resulting new document to
> differ from the template as used in normal new document creation)?

All very good questions. I'd say just create an empty document/spreadsheet/presentation, set the export type to the identified one, and do an export+import cycle. Of these, the last step is optional if it would be a larger task; the most important thing is to start in the correct application and set the correct save format. While I think the above method would be applicable to most formats, it matters most for the formats that could come in as 0-byte files in real life, i.e. the ones that can be created by MS Office via the Explorer context menu.
I plan to look at this.
Miklos Vajna committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/ada07f303e7cd1e39c73abe0741aefe7d9d73a57 tdf#123476 filter: try to detect 0-byte files based on extension It will be available in 7.1.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Mike Kaganski committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/2854362f429e476d4a1ab4759c6a1f1c04150280 tdf#123476 filter: Also handle empty ODF It will be available in 7.2.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Mike Kaganski committed a patch related to this issue. It has been pushed to "libreoffice-7-1": https://git.libreoffice.org/core/commit/e3307e5e76d5c35ee79b262d519c4a777acce536 tdf#123476 filter: Also handle empty ODF It will be available in 7.1.1. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
*** Bug 90613 has been marked as a duplicate of this bug. ***
*** Bug 98127 has been marked as a duplicate of this bug. ***
*** Bug 104819 has been marked as a duplicate of this bug. ***
*** Bug 120822 has been marked as a duplicate of this bug. ***
*** Bug 133164 has been marked as a duplicate of this bug. ***
The comment in this bug report on 2021-01-29 12:29:47 UTC said regarding the patch: "It will be available in 7.1.1." I was the original reporter of Bug 104819, back in LibreOffice version 5.2.3.3 on 20/Dec/2016. That bug report was resolved as a duplicate, which leads to this bug report. After the update comments posted here about the fix being available in 7.1.1, I have tested using version 7.1.2.2 on Windows 10 (Version 20H2 OS build 19042.928) and unfortunately the patch does not fully resolve the issues I reported.

The steps I followed were:
1. Create a new Microsoft Word Doc (.docx) in a folder (using the right mouse button option) on Windows 10 with Microsoft Word from Microsoft Office Professional Plus 2013 (15.0.4875.1001)
2. Add some header and footer text
3. Add some main document text using format styles, Title, Heading 1, Heading 2, Text Body etc
4. Save document
5. Quit LibreOffice Writer
6. Reopen document in LibreOffice Writer

The reopened document:
a) There was no header or footer text.
b) All of the main document text formatted using format styles was no longer formatted. The main document text was all present but it was all "Preformatted Text".
c) If I try to open the saved document using Microsoft Word, it says that the document cannot be opened as it is corrupt.
Works fine here in 7.1.2.2 when checked with an empty file with .docx extension.
Aron, thank you for checking and getting back so quickly. I am currently at a loss to explain the different behaviour you and I are seeing. I have checked the behaviour I saw yesterday and it is 100% repeatable for me. I also get the same behaviour using a different account on the same Windows 10 PC.

When I reported the original bug, one issue that became apparent explained why the file was "silently" saved with data loss for me, while others were being prompted that there was a "potential issue". This is the "Ask when not saving in ODF or default format" LibreOffice option. I had turned this option off, using the checkbox on the "Confirm File Format" prompt window the first time it was presented to me. I turned this option back on today and repeated the experiment using a newly created (right mouse button click) .docx document. This was a 0-byte document. Now when I saved the document after edits I got the "Confirm File Format" prompt window, and the only two document format options it offered me were "Use text Format" and "Use ODF Format". So I still have to use "Save As" to save in Microsoft Word .docx format to avoid data loss the first time I save a .docx document.

I also checked the behaviour by manually creating a 0-byte .doc document. The behaviour was the same for me as with a new .docx document: data loss on saving.

Thank you for your work in looking at this issue. However, for me the issue is not resolved at the moment. If you would like me to try any other experiments with the version of LibreOffice I have installed, please let me know. I have double checked the version of LibreOffice from the help menu.
The details are: Version: 7.1.2.2 (x64) / LibreOffice Community Build: 8a45595d069ef5570103caea1b71cc9d82b2aae4 https://git.libreoffice.org/core/+log/8a45595d069ef5570103caea1b71cc9d82b2aae4 Environment: CPU threads: 4; OS: Windows 10.0 Build 19042 User Interface: UI render: Skia/Raster; VCL: win Locale: en-GB (en_GB); UI: en-GB Misc: Calc: threaded
This bug was fixed about half a year ago. It has a cppunit test that ensures it remains fixed. If you have a related problem, could you please open a follow-up bug instead? Thanks.
(In reply to Miklos Vajna from comment #18)
> This bug was fixed about half a year ago. It has a cppunit test that ensures
> it remains fixed. If you have a related problem, could you please open a
> follow-up bug instead? Thanks.

I can confirm what junk_2010@live.co.uk has written: the bug WAS NOT fixed and is still present (LO 7.0, 7.1). I reported a duplicate of this bug (Bug 90613), so I keep checking whether it works. There was no working fix in any version of Windows/LO I have had.

Currently Win 20H2 + LO:
Version: 7.1.2.2 (x64) / LibreOffice Community Build ID: 8a45595d069ef5570103caea1b71cc9d82b2aae4 CPU threads: 16; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win Locale: sk-SK (sk_SK); UI: en-GB Calc: CL

LO still opens a 0-byte file as an unformatted file, and if you do not notice it, you will lose all formatting after you save the document, as there is still no warning about saving such a file in plain .txt format. This bug should be REOPENED, as it was fixed neither in 7.0 nor in 7.1.
(In reply to junk_2010 from comment #15) (In reply to Orwel from comment #19) junk_2010, Orwel: could you please check whether it also fails with a clean user profile? Also, could you please record a screencast where you create such a document in Explorer, open it in Writer, show Help->About, then save, showing the filter name that is displayed in the warning about file format? Thanks!
Hi, I have tested with a clean user profile. I get a pop-up window, Confirm File Format ("This document may contain formatting or content that cannot be saved in the currently selected file format “Text”..."). In my own profile this option (Options - Load/Save - General - Warn when not saving in ODF or default format) is deactivated; if I enable it in my profile, I get the same pop-up window. But this is not a solution for the described bug:

1. For people who use a lot of .docx, .doc and .odt files, this pop-up dialog is very annoying, as it appears for every single .docx/.doc file you save or re-save (Save/Save as). So for each save you need two steps if you want to keep the docx format (click save, then click keep format). Therefore I have deactivated it.
2. With the warning enabled in Load/Save - General, the pop-up window appears only with SAVE. With SAVE AS, you only get the Save As window, where you can see that the proposed extension is .txt.

But the BUG itself is that an empty .docx file SHOULD NOT be interpreted as a .txt file in any way. If you open an empty .docx (created in Windows Explorer by right click) in MS Office, the default template is opened. Likewise, if you open an empty .odt file (created in Windows Explorer by right click), it is not treated as a TXT file. So why does LO interpret an empty .docx file as a txt file? The pop-up window is not a solution, because I need it to be deactivated (see point 1 above). This is the bug we are talking about.

(In reply to Mike Kaganski from comment #20)
> could you please record a screen cast

Do you mean a screen video recording of the proposed steps? If yes, do you still need it?
(In reply to Orwel from comment #21)
> (In reply to Mike Kaganski from comment #20)
> > could you please record a screen cast
>
> Do you mean a screen video record of proposed steps?

Yes

> If yes, do you still need it?

Yes
Created attachment 171605 [details] Screencast

Actually, I was able to reproduce this myself. And I agree with the "reopened" state, since it was never fixed. The problem here is using *Writer* to open the file. I always had Word associated with .DOCX on my system, and always tested with "Open With->LibreOffice", and that works as intended. But if I use "Open With->LibreOffice Writer", or associate DOCX with Writer (as opposed to plain LibreOffice), the problem appears. Given that in a normal installation, where the user chooses to associate MSO files with LibreOffice, DOCX files are associated with Writer, this problem is indeed still not fixed for users. The problem is likely the '--writer' command line option used in this case.
The bug was created by Aron, the original scope was Online. The fix works for Online, as far as I know. If you want to have this working in a wider scope, that's fine, but please let's have a separate, follow-up bug for that. Thanks.
I was about to create a screencast but it seems it is no longer needed. I would add that I believe this issue occurs with both new 0-byte .docx and .doc files, though I appreciate that .doc files are no longer very common.

> The bug was created by Aron, the original scope was Online. The fix works for
> Online, as far as I know. If you want to have this working in a wider scope,
> that's fine, but please let's have a separate, follow-up bug for that. Thanks

I would suggest that if there is no wish to re-open this bug report, a new bug report is not required, as you could just re-open one of the bug reports that was closed as a duplicate of this report. All of the reports below appear to me to describe the issue:
Bug 90613 2015-04-14
Bug 98127 2016-02-24
Bug 104819 2016-12-20
Bug 120822 2018-10-23
Mike Kaganski committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/dff586735b6618d9b011823594a33287d8f7f223 tdf#123476: also use filter by extension when its service is the same It will be available in 7.2.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Mike Kaganski committed a patch related to this issue. It has been pushed to "libreoffice-7-1": https://git.libreoffice.org/core/commit/a8e84a2d6e634c03d62e17bcc1b617238dcc9eb1 tdf#123476: also use filter by extension when its service is the same It will be available in 7.1.4. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
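One possible reading of the commit title "also use filter by extension when its service is the same", sketched under stated assumptions. This is not the code from the commit: `FILTER_SERVICE`, `resolve_empty_file_filter`, the filter labels and the service names are hypothetical, and returning `None` simply stands for "leave the existing behavior untouched". The idea illustrated is that a requested module (for example via the '--writer' option mentioned above) should not discard the extension-based filter when that filter belongs to the same module.

```python
# Hypothetical mapping from filter label to the module ("service") that
# handles it. Labels and service names are placeholders for illustration.
FILTER_SERVICE = {
    "DOCX": "writer",
    "DOC": "writer",
    "Text": "writer",
    "XLSX": "calc",
    "PPTX": "impress",
}


def resolve_empty_file_filter(extension_filter, requested_service=None):
    """Decide whether the extension-based filter applies to an empty file.

    extension_filter: label chosen from the file extension.
    requested_service: module the caller asked for, or None if the generic
    launcher was used and no module was preselected.
    Returns the extension filter when it is compatible with the request,
    otherwise None so the caller keeps its previous behavior.
    """
    if requested_service is None:
        return extension_filter
    if FILTER_SERVICE.get(extension_filter) == requested_service:
        return extension_filter
    return None
```

In this sketch an empty .docx opened with Writer preselected keeps the DOCX label, while an empty .xlsx opened the same way is left to whatever the caller did before.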
Thank you for putting a fix in place for this. I have downloaded and installed the 7.2.0.0.alpha1+ (x64) daily build from: https://dev-builds.libreoffice.org/daily/master/current.html 2021-05-16 04:50:54

Version: 7.2.0.0.alpha1+ (x64) / LibreOffice Community Build ID: 58b0c95ad50139a62bddb348d10f94053c09cd5b CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win Locale: en-GB (en_GB); UI: en-GB Calc: CL

With this build I can confirm that the issue I reported has been resolved. Specifically, the steps I followed on Windows 10 were:
1. Create a new Microsoft Word Doc (.docx) in a folder (using the right mouse button option) on Windows 10 with Microsoft Word from Microsoft Office Professional Plus 2013 (15.0.4875.1001)
2. Add some header and footer text
3. Add some main document text using format styles, Title, Heading 1, Heading 2, Text Body etc
4. Save document
5. Quit LibreOffice Writer
6. Reopen document in LibreOffice Writer

The re-opened document has now retained all of the formatting. I was also able to open the document in Microsoft Word without any issues.
Based on comment 28 (from the original reporter of the bug) I will mark this bug as verified.