Bug 125216 - FILEOPEN DOC file with extension .dot gives error "Read Error - This is not a valid WinWord6 File"
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 6.2.2.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Duplicates: 130308
Depends on:
Blocks: DOC-Opening
Reported: 2019-05-11 17:12 UTC by Roman Kuznetsov
Modified: 2020-02-01 10:41 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Problem DOT file (26.50 KB, application/msword)
2019-05-11 17:13 UTC, Roman Kuznetsov

Description Roman Kuznetsov 2019-05-11 17:12:41 UTC
Description:
The attached DOT file can't be opened.

This bug follows from bug 125184

Steps to Reproduce:
1. Try to open the attached DOT file
2. The file doesn't open; the error "Read Error - This is not a valid WinWord6 File" is shown

Actual Results:
File doesn't open

Expected Results:
File opens fine


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Roman Kuznetsov 2019-05-11 17:13:15 UTC
Created attachment 151311
Problem DOT file

Version: 6.2.2.1 (x64)
Build ID: fcd633fb1bf21b0a99c9acb3ad6e526437947b01
CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win; 
Locale: ru-RU (ru_RU); UI-Language: ru-RU
Calc: threaded
Comment 2 Roman Kuznetsov 2019-05-11 17:14:59 UTC
The file opens in 6.1.4 (with a different bug, see bug 125184), so this is a regression.
Comment 3 Roman Kuznetsov 2019-05-11 17:28:06 UTC
I bisected it:

:~/soft/logit/bibisect-linux-64-6.3$ git bisect bad
ed24728bf337104f290702804090f84acfd9ce21 is the first bad commit
commit ed24728bf337104f290702804090f84acfd9ce21
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Tue Feb 19 19:47:34 2019 +0100

    source sha:65559252f138aada7a55d3c5fe0a932a222d13e0



https://git.libreoffice.org/core/+/65559252f138aada7a55d3c5fe0a932a222d13e0%5E%21

Adding Tor Lillqvist to CC.
Comment 4 Tor Lillqvist 2019-05-21 09:27:52 UTC
Do other .dot files work fine? Is this one special in any way? Very old, for instance?
Comment 5 Roman Kuznetsov 2019-05-21 09:33:54 UTC
(In reply to Tor Lillqvist from comment #4)
> Do other .dot files work fine? Is this one special in any way? Very old, for
> instance?

I tried opening a .dot file that I had just created in MS Word 2010. It opens fine in LO:

Version: 6.3.0.0.alpha1+
Build ID: 6d6277f23337c8eae9acabdf830e33fcc3ee9923
CPU threads: 4; OS: Windows 6.1; UI render: default; VCL: win; 
Locale: ru-RU (ru_RU); UI-Language: en-US
Calc: threaded

but the attached DOT file still gives the error.
Comment 6 Xisco Faulí 2019-05-21 09:36:35 UTC
It works fine if the attached file's extension is changed to .doc.
Comment 7 Tor Lillqvist 2019-05-21 09:55:41 UTC
So it isn't actually a .dot file, it is a .doc file...
Comment 8 Tor Lillqvist 2019-05-21 10:00:01 UTC
And indeed, if I create a dummy Writer document in LibreOffice, save it as "Word 97-2003 (.doc)", and rename the saved .doc file to .dot, and try to open that in LibreOffice, I get the same error message.
Comment 9 Tor Lillqvist 2019-05-21 10:24:17 UTC
Roman: What is by the way your expected result here from opening that file? That it is opened as a normal document (which is what it *is*, even if the file name has the misleading extension .dot), or that it is used as a template (i.e. when opened, the document gets a name like "Untitled 1" and has no associated file), even if it isn't actually a template document format?
Comment 10 Roman Kuznetsov 2019-05-21 11:40:26 UTC
(In reply to Tor Lillqvist from comment #9)
> Roman: What is by the way your expected result here from opening that file?

this 

> that it is used as a
> template (i.e. when opened, the document gets a name like "Untitled 1" and
> has no associated file), even if it isn't actually a template document
> format?

because for users it's a template, and this file opens in 6.1 as a template.
Comment 11 Xisco Faulí 2019-05-23 14:02:25 UTC
After the recent comments, I don't think this is a highest/critical bug anymore.
Should we just change the error message to be more informative to the end user?

@Miklos, any opinion here?
Comment 12 Miklos Vajna 2019-05-23 14:33:09 UTC
I think the current setting is OK; it's a regression if it used to work, but the use-case is "interesting", so not really a priority. :-)
Comment 13 Aron Budea 2020-02-01 09:51:36 UTC
*** Bug 130308 has been marked as a duplicate of this bug. ***
Comment 14 Aron Budea 2020-02-01 09:53:49 UTC
Some interesting comments from Julien, bug 130308 comment 5 and bug 130308 comment 6:
"About the bug itself, I'm taking a look at https://interoperability.blob.core.windows.net/files/MS-DOC/%5bMS-DOC%5d-190319.pdf, trying to understand how FibBase is built."

"Cor: just as a test, I opened the file in Word 365 and saved it (without any changes, to a new file).

Here's a hexdump of the extracted FibBase struct (see my previous comment) before:
EC A5 C1 00 63 00 13 04 00 00 F0

Here's a hexdump of the extracted FibBase struct after:
EC A5 C1 00 6F 00 09 04 00 00 F1

"F1" = 11110001 in binary.
So "fDot" (the bit tested in Tor's patch) = 1 here (contrary to the initial doc), and the new dot file can be opened in LO.

I think I already read something about MsOffice file validation, but I'm not sure. The idea would be to validate the dot file provided by the website."
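
As a rough illustration of the check discussed above, the sketch below reads the fDot bit from the first 11 bytes of a FibBase dump. It assumes the [MS-DOC] FibBase layout (wIdent, nFib, unused, lid and pnNext at 2 bytes each, followed by the flag byte), so the flag byte sits at offset 10 and fDot is its lowest bit. The names `kFlagsOffset`, `hasDotFlag`, `kBefore` and `kAfter` are made up for this example and are not LibreOffice code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed offset of the first flag byte in FibBase:
// wIdent(2) + nFib(2) + unused(2) + lid(2) + pnNext(2) = 10.
constexpr std::size_t kFlagsOffset = 10;

// Hypothetical helper: returns true if the fDot bit (lowest bit of the
// flag byte) is set. `fibBase` points at wIdent; buffers shorter than
// 11 bytes are treated as "not a template".
bool hasDotFlag(const std::uint8_t* fibBase, std::size_t size)
{
    return size > kFlagsOffset && (fibBase[kFlagsOffset] & 0x01) == 0x01;
}

// The two dumps quoted above: before and after re-saving in Word 365.
const std::uint8_t kBefore[] = {0xEC, 0xA5, 0xC1, 0x00, 0x63, 0x00,
                                0x13, 0x04, 0x00, 0x00, 0xF0};
const std::uint8_t kAfter[] = {0xEC, 0xA5, 0xC1, 0x00, 0x6F, 0x00,
                               0x09, 0x04, 0x00, 0x00, 0xF1};
```

Calling `hasDotFlag(kBefore, sizeof kBefore)` yields false (0xF0 has bit 0 clear), while `hasDotFlag(kAfter, sizeof kAfter)` yields true (0xF1 has bit 0 set), which matches the observation that only the re-saved file is detected as a template.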
Comment 15 Julien Nabet 2020-02-01 10:41:44 UTC
Indeed, in Roman's file, we can see:
000047D8   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  EC A5 01 01  
00004804   4D 20 09 04  00 00 F0 12  BF 00 00 00  00 00 00 30  00 00 00 00  00 08 00 00  70 09 00 00  0E 00 43 61  6F 6C 61 6E  38 30 00 00  00 00 00 00 

So F0 again (so fDot = 0)
I wonder how these files were generated. If they were created by MSOffice without renaming the extension, it would mean Microsoft doesn't even respect its own specs (knowing that dot is quite an old format) (or did I miss something?)
Anyway, IMHO Tor's patch is correct and shouldn't be reverted.

Perhaps, to work around this, we should test the extension (case-insensitively) after:
bIsDetected = ((aBits1 & 0x01) == 0x01);
(see https://opengrok.libreoffice.org/xref/core/sw/source/ui/uno/swdetect.cxx?r=eaeabd78#117)
in case bIsDetected is false?
(Of course, it would be ugly too.)
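
A minimal standalone sketch of that fallback idea, not the actual swdetect.cxx code: `hasDotExtension` and `isTemplateCandidate` are hypothetical names, and a plain `std::string` file name stands in for whatever the real detector has available. Only the `bIsDetected` bit test is taken from the line quoted above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: true if fileName ends in ".dot", ignoring case.
bool hasDotExtension(const std::string& fileName)
{
    static const std::string ext = ".dot";
    if (fileName.size() < ext.size())
        return false;
    const std::size_t offset = fileName.size() - ext.size();
    for (std::size_t i = 0; i < ext.size(); ++i)
    {
        // Cast to unsigned char first: std::tolower is undefined for
        // negative values other than EOF.
        const int c = std::tolower(static_cast<unsigned char>(fileName[offset + i]));
        if (c != ext[i])
            return false;
    }
    return true;
}

// Hypothetical detection: trust the fDot bit first, and fall back to
// the file extension only when the bit is not set.
bool isTemplateCandidate(std::uint8_t aBits1, const std::string& fileName)
{
    bool bIsDetected = ((aBits1 & 0x01) == 0x01);
    if (!bIsDetected)
        bIsDetected = hasDotExtension(fileName);
    return bIsDetected;
}
```

With this ordering, a file whose flag byte is 0xF0 but whose name ends in `.DOT` would still be treated as a template, while a `.doc` file with the same flag byte would not. Whether that trade-off is acceptable is exactly the "ugly" part of the proposal.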