Bug 125216 - FILEOPEN DOC file with extension .dot gives error "Read Error - This is not a valid WinWord6 File"
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 6.2.2.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:7.5.0 target:7.4.3 inReleaseNo...
Keywords: bibisected, bisected, regression
Duplicates: 130308
Depends on:
Blocks: DOC-Opening
Reported: 2019-05-11 17:12 UTC by Roman Kuznetsov
Modified: 2022-12-08 14:16 UTC (History)
CC List: 11 users

See Also:
Crash report or crash signature:


Attachments
Problem DOT file (26.50 KB, application/msword)
2019-05-11 17:13 UTC, Roman Kuznetsov

Description Roman Kuznetsov 2019-05-11 17:12:41 UTC
Description:
LibreOffice can't open the attached DOT file.

This bug follows from bug 125184

Steps to Reproduce:
1. Try to open the attached DOT file.
2. The file fails to open with the error "Read Error - This is not a valid WinWord6 File".

Actual Results:
File doesn't open

Expected Results:
File opens fine


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Roman Kuznetsov 2019-05-11 17:13:15 UTC
Created attachment 151311
Problem DOT file

Version: 6.2.2.1 (x64)
Build ID: fcd633fb1bf21b0a99c9acb3ad6e526437947b01
CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win; 
Locale: ru-RU (ru_RU); UI-Language: ru-RU
Calc: threaded
Comment 2 Roman Kuznetsov 2019-05-11 17:14:59 UTC
The file opens in 6.1.4 (with a different bug, see bug 125184), so this is a regression.
Comment 3 Roman Kuznetsov 2019-05-11 17:28:06 UTC
I bisected it:

:~/soft/logit/bibisect-linux-64-6.3$ git bisect bad
ed24728bf337104f290702804090f84acfd9ce21 is the first bad commit
commit ed24728bf337104f290702804090f84acfd9ce21
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Tue Feb 19 19:47:34 2019 +0100

    source 65559252f138aada7a55d3c5fe0a932a222d13e0



https://git.libreoffice.org/core/+/65559252f138aada7a55d3c5fe0a932a222d13e0%5E%21

Adding Tor Lillqvist to CC.
Comment 4 How can I remove my account? 2019-05-21 09:27:52 UTC
Do other .dot files work fine? Is this one special in any way? Very old, for instance?
Comment 5 Roman Kuznetsov 2019-05-21 09:33:54 UTC
(In reply to Tor Lillqvist from comment #4)
> Do other .dot files work fine? Is this one special in any way? Very old, for
> instance?

I tried opening a .dot file that I just created in MS Word 2010. It opens fine in LO.

Version: 6.3.0.0.alpha1+
Build ID: 6d6277f23337c8eae9acabdf830e33fcc3ee9923
CPU threads: 4; OS: Windows 6.1; UI render: default; VCL: win; 
Locale: ru-RU (ru_RU); UI-Language: en-US
Calc: threaded

But the attached DOT file still gives the error.
Comment 6 Xisco Faulí 2019-05-21 09:36:35 UTC
It works fine if the attached file's extension is changed to .doc.
Comment 7 How can I remove my account? 2019-05-21 09:55:41 UTC
So it isn't actually a .dot file, it is a .doc file...
Comment 8 How can I remove my account? 2019-05-21 10:00:01 UTC
And indeed, if I create a dummy Writer document in LibreOffice, save it as "Word 97-2003 (.doc)", and rename the saved .doc file to .dot, and try to open that in LibreOffice, I get the same error message.
Comment 9 How can I remove my account? 2019-05-21 10:24:17 UTC
Roman: By the way, what is your expected result from opening that file? That it is opened as a normal document (which is what it *is*, even if the file name has the misleading extension .dot), or that it is used as a template (i.e. when opened, the document gets a name like "Untitled 1" and has no associated file), even if it isn't actually a template document format?
Comment 10 Roman Kuznetsov 2019-05-21 11:40:26 UTC
(In reply to Tor Lillqvist from comment #9)
> Roman: By the way, what is your expected result from opening that file?

this 

> that it is used as a
> template (i.e. when opened, the document gets a name like "Untitled 1" and
> has no associated file), even if it isn't actually a template document
> format?

Because for users it's a template, and this file opens in 6.1 as a template.
Comment 11 Xisco Faulí 2019-05-23 14:02:25 UTC
After the recent comments, I don't think this is a highest/critical bug anymore.
Should we just change the error message to be more informative to the end user?

@Miklos, any opinion here?
Comment 12 Miklos Vajna 2019-05-23 14:33:09 UTC
I think the current setting is OK; it's a regression if it used to work, but the use-case is "interesting", so not really a priority. :-)
Comment 13 Aron Budea 2020-02-01 09:51:36 UTC
*** Bug 130308 has been marked as a duplicate of this bug. ***
Comment 14 Aron Budea 2020-02-01 09:53:49 UTC
Some interesting comments from Julien, bug 130308 comment 5 and bug 130308 comment 6:
"About the bug itself, I'm taking a look at https://interoperability.blob.core.windows.net/files/MS-DOC/%5bMS-DOC%5d-190319.pdf, trying to understand how FibBase is built."

"Cor: just as a test, I opened the file in Word 365 and saved it as a new file, without any changes.

Here's a hexdump of the extracted FibBase struct (see my previous comment) before:
EC A5 C1 00 63 00 13 04 00 00 F0

Here's a hexdump of the extracted FibBase struct after:
EC A5 C1 00 6F 00 09 04 00 00 F1

"F1" = 11110001
So "fDot" (the bit tested in Tor's patch) is 1 here, unlike in the initial doc, and the new .dot file can be opened in LO.

I think I have already read something about MS Office file validation, but I'm not sure. The idea would be to validate the .dot file provided by the website."
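
For readers who want to check the two dumps quoted above, here is a minimal Python sketch that reads the fDot bit from them. It assumes the FibBase layout as described in [MS-DOC] (wIdent, nFib, an unused field, lid and pnNext at 2 bytes each, followed by a flags byte whose lowest bit is fDot). The function name `fdot_from_hex` and the constants are made up for illustration and are not part of LibreOffice.

```python
# Minimal sketch: decode the fDot bit from the leading bytes of a FibBase.
# Assumed layout (per [MS-DOC]): wIdent (2 bytes), nFib (2), unused (2),
# lid (2), pnNext (2), then a flags byte whose lowest bit is fDot.

FLAGS_OFFSET = 10  # offset of the first flags byte within FibBase
FDOT_MASK = 0x01   # lowest bit of that byte, the bit tested in Tor's patch


def fdot_from_hex(dump: str) -> bool:
    """Return True if fDot is set in a space-separated hex dump (hypothetical helper)."""
    data = bytes.fromhex(dump)
    if len(data) <= FLAGS_OFFSET:
        raise ValueError("dump too short to contain the FibBase flags byte")
    if data[0:2] != b"\xEC\xA5":
        raise ValueError("wIdent is not 0xA5EC, so this is not a Word binary FIB")
    return (data[FLAGS_OFFSET] & FDOT_MASK) == FDOT_MASK


before = "EC A5 C1 00 63 00 13 04 00 00 F0"
after = "EC A5 C1 00 6F 00 09 04 00 00 F1"

print(fdot_from_hex(before))  # False
print(fdot_from_hex(after))   # True
```

The first dump ends in F0 (lowest bit clear), the second in F1 (lowest bit set), which matches the observation that only the file re-saved by Word 365 carries the template flag.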
Comment 15 Julien Nabet 2020-02-01 10:41:44 UTC
Indeed, in Roman's file, we can see:
000047D8   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  EC A5 01 01  
00004804   4D 20 09 04  00 00 F0 12  BF 00 00 00  00 00 00 30  00 00 00 00  00 08 00 00  70 09 00 00  0E 00 43 61  6F 6C 61 6E  38 30 00 00  00 00 00 00 

So F0 again (so fDot = 0)
I wonder how these files were generated. If they were created by MS Office without the extension being renamed, it would mean Microsoft doesn't even respect its own specs (knowing that .dot is quite an old format), or did I miss something?
Anyway, IMHO Tor's patch is correct and shouldn't be reverted.

Perhaps, to work around this, we should test the extension (case-insensitively) after:
bIsDetected = ((aBits1 & 0x01) == 0x01);
(see https://opengrok.libreoffice.org/xref/core/sw/source/ui/uno/swdetect.cxx?r=eaeabd78#117)
in case bIsDetected is false?
(of course it would be ugly too)
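
The fallback suggested above could look roughly like the Python sketch below. It only illustrates the idea of this comment (trust the fDot bit first, then fall back to a case-insensitive extension check); it is not the LibreOffice code and not necessarily what the eventual fix does. The function name `looks_like_template` and its parameters are hypothetical.

```python
import os

FDOT_MASK = 0x01  # same mask as in `(aBits1 & 0x01) == 0x01`


def looks_like_template(flags_byte: int, filename: str) -> bool:
    """Hypothetical fallback: trust fDot first, then the .dot extension."""
    is_detected = (flags_byte & FDOT_MASK) == FDOT_MASK
    if not is_detected:
        # Fallback from the comment: compare the extension
        # case-insensitively when the fDot bit is not set.
        _, ext = os.path.splitext(filename)
        is_detected = ext.lower() == ".dot"
    return is_detected


# A .doc renamed to .dot: fDot is clear (0xF0), but the extension matches.
print(looks_like_template(0xF0, "Problem.DOT"))  # True
# A plain .doc with fDot clear is still not treated as a template.
print(looks_like_template(0xF0, "report.doc"))   # False
```

The trade-off is the one already noted in the comment: the extension check makes detection depend on the file name again, which is exactly what the bit test was meant to avoid.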
Comment 16 QA Administrators 2022-02-01 04:45:52 UTC Comment hidden (obsolete)
Comment 17 Roman Kuznetsov 2022-02-01 08:24:50 UTC
Still reproducible in

Version: 7.4.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: eb69767d7c1bb8e6e780fd9503f08c9d7f5ecb45
CPU threads: 8; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: threaded
Comment 18 Justin L 2022-09-28 02:40:40 UTC
Proposed fix at https://gerrit.libreoffice.org/c/core/+/140687
Comment 19 Commit Notification 2022-09-28 10:11:36 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/6cec8ba33a28de7248861b2eecfc5034cbde9d37

tdf#125216 import filter: allow .doc renamed as .dot

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 20 Roman Kuznetsov 2022-09-30 11:29:47 UTC
LO now opens the file as it should.

Verified in

Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 48b9cbc742de3f6120986cb6cafc92eb5009da82
CPU threads: 4; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: threaded

Justin, thank you for the patch!
Comment 21 Commit Notification 2022-10-04 09:12:24 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/a814eecea816a00b1e1d796381c8f33bb51bdfc5

tdf#125216 import filter: allow .doc renamed as .dot

It will be available in 7.4.3.
