Bug 67699 - FILEOPEN: html table as XLS gives general I/O error
Summary: FILEOPEN: html table as XLS gives general I/O error
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 4.1.0.3 rc
Hardware: All All
Importance: high major
Assignee: Kohei Yoshida
URL:
Whiteboard: target:4.2.0 target:4.1.2
Keywords: regression
Duplicates: 67617 67902 68269 68826 68837 68941 69031 69032 69263 69677
Depends on:
Blocks:
 
Reported: 2013-08-03 07:25 UTC by skully
Modified: 2014-04-20 19:30 UTC
CC List: 16 users

See Also:
Crash report or crash signature:


Attachments
html file with a table with xls extension (1.44 KB, text/html)
2013-08-03 07:25 UTC, skully
another .xls testcase with only a <table> fragment that opens differently (192 bytes, application/vnd.ms-excel)
2013-08-19 16:12 UTC, Eike Rathke
Result open file like html Calc (279.42 KB, image/jpeg)
2013-09-10 11:19 UTC, Ruslan Fatakhov

Description skully 2013-08-03 07:25:44 UTC
Created attachment 83566 [details]
html file with a table with xls extension

I used version 3.5.7 on my Mac (and on Ubuntu 12.04) and could open the attached XLS file just fine. It's an HTML file containing a table, saved with the .xls extension. That version shows a regular table in Calc.

After updating to version 4.1.0, I get a "general read output error" when trying to open the same file.

Sure, this is not a valid XLS file, but it still looks like a bug/regression to me.
LibreOffice should at least handle the file as an HTML file if it doesn't want to perform the conversion step anymore.
Comment 1 Mike Kaganski 2013-08-04 00:09:14 UTC
Reproducible with LO 4.1.0.3 under Win7x64.
Not reproducible with LO 4.1.0.2 (it opens the test file in Writer).

To reproduce, the HTML file must have the extension .xls.

Most probably related to Bug 67617.
Comment 2 Maxim Monastirsky 2013-08-19 12:45:19 UTC
*** Bug 68269 has been marked as a duplicate of this bug. ***
Comment 3 Eike Rathke 2013-08-19 15:29:45 UTC
This now seems to happen with any file that has the extension .xls but has HTML content.

@Kohei:
In SfxBaseModel::load() the resulting filter name in aFilterName is "calc8_template" (how did that happen?). Then, in SfxMedium::GetStorage(), the assignment pImp->xStorage = ... fails inside OStorageFactory::createInstanceWithArguments() because StorageFormat is ZipFormat; after CheckPackageSignature_Impl() fails, an io::IOException is thrown.
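
As a rough analogy for the failure path described above (this is not LibreOffice code): once a ZIP-based filter has been selected for a file whose content is plain HTML, opening the package storage can only fail. The Python sketch below uses the standard zipfile module as a stand-in for the package storage; open_as_zip_package is a hypothetical helper name, and the error text is made up for illustration.

```python
import io
import zipfile


def open_as_zip_package(data):
    """Treat raw bytes as a ZIP package and return the list of entry names.

    Stand-in for a ZIP-based storage open: content that is not a ZIP
    package ends in an I/O error, much like the exception described above.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            return package.namelist()
    except zipfile.BadZipFile as exc:
        # Hypothetical error message, chosen only to mirror the bug title.
        raise IOError("general I/O error: content is not a ZIP package") from exc


# HTML content that was saved with an .xls extension.
html_saved_as_xls = b"<html><body><table><tr><td>1</td></tr></table></body></html>"

try:
    open_as_zip_package(html_saved_as_xls)
except IOError as err:
    print(err)
```

The point of the sketch is only that the error surfaces at storage-open time, long after the wrong filter was picked, which is why the message the user sees says nothing about HTML.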
Comment 4 Eike Rathke 2013-08-19 16:03:50 UTC
As similarly mentioned in https://bugs.freedesktop.org/show_bug.cgi?id=67617#c3, this error is triggered by commit d1fc3fce16172d7d619b6826de44528030ab36f8 ("fdo#64448: Don't get type name from incorrect filter.").

However, with that commit reverted, the file attached here opens in Writer instead of Calc.
Comment 5 Eike Rathke 2013-08-19 16:12:38 UTC
Created attachment 84272 [details]
another .xls testcase with only a <table> fragment that opens differently

BUT, reverting d1fc3fce16172d7d619b6826de44528030ab36f8 makes this file, containing only a <table>, open fine in Calc.
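
For illustration only, a minimal sketch of content-based sniffing that would classify both test cases here (a full HTML document and a bare <table> fragment) as HTML regardless of the .xls extension. The real type detection in LibreOffice is considerably more involved; sniff_content and the HTML marker list are assumptions made for this example. The OLE2 and ZIP signatures are the standard magic numbers for those container formats.

```python
# Signature of an OLE2 compound file, the container of binary .xls files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive (.ods, .xlsx, ...).
ZIP_MAGIC = b"PK\x03\x04"
# Assumed markers: enough for this example, not an exhaustive HTML test.
HTML_MARKERS = (b"<!doctype html", b"<html", b"<table")


def sniff_content(data):
    """Guess the container type from the first bytes, ignoring the extension."""
    head = data[:1024]
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    lowered = head.lower()
    if any(marker in lowered for marker in HTML_MARKERS):
        return "html"
    return "unknown"


print(sniff_content(b"<table><tr><td>1</td></tr></table>"))
print(sniff_content(OLE2_MAGIC + b"\x00" * 8))
```

A check like this only decides which family of import filter to try; whether the HTML then goes to Calc or Writer is a separate decision that this sketch does not cover.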
Comment 6 Eike Rathke 2013-08-19 16:29:33 UTC
As there are piles of generated HTML files with .xls extension in the wild I'd rather revert
http://cgit.freedesktop.org/libreoffice/core/commit/?id=d1fc3fce16172d7d619b6826de44528030ab36f8
http://cgit.freedesktop.org/libreoffice/core/commit/?id=fa965d8b1743d786ea07d887f883ab9af9b6652e&h=libreoffice-4-1

also for 4-1-1, and reopen bug 64448 for the corner case of a file without extension instead.
Comment 7 Eike Rathke 2013-08-19 16:37:22 UTC
OK, I talked to Kohei, and reverting is a bad option.
Comment 8 Kohei Yoshida 2013-08-19 19:26:23 UTC
I have a fix for this, but this is a major change. And you know as well as I do that every major change runs the risk of introducing regressions...

Having said that, what I did was essentially to remove all the previously accumulated hacks, which made this detection code so hard to debug and maintain. So, even if my change introduces another brand-new problem, it will be much easier to debug.  So, IMO it's worth the risk.
Comment 9 Kohei Yoshida 2013-08-19 19:26:53 UTC
I'll take it.
Comment 10 Commit Notification 2013-08-19 19:44:12 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e69aa9572bb2206313cd2aa7edd13da91460f2c4

fdo#67699: Remove a whole bunch of old hacks.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 11 Kohei Yoshida 2013-08-19 20:04:39 UTC
Backport request for 4.1 on gerrit: https://gerrit.libreoffice.org/#/c/5518/
Comment 12 Commit Notification 2013-08-19 20:43:40 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=904ef99d87af1bfefe43f6a84f04f019bd082754

fdo#67699: Don't forget to set filter name to the descriptor.



Comment 13 Commit Notification 2013-08-19 20:53:31 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-4-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ac9cee0d909ba580a4128f34b675e9f58794ea97&h=libreoffice-4-1

fdo#67699: Remove a whole bunch of old hacks.


It will be available in LibreOffice 4.1.2.

Comment 14 Kohei Yoshida 2013-08-19 21:00:22 UTC
Please merge this too.

https://gerrit.libreoffice.org/#/c/5520/
Comment 15 Kohei Yoshida 2013-08-20 12:21:09 UTC
fixed.
Comment 16 Caolán McNamara 2013-08-20 13:21:43 UTC
*** Bug 67617 has been marked as a duplicate of this bug. ***
Comment 17 Maxim Monastirsky 2013-09-02 09:50:20 UTC
*** Bug 68837 has been marked as a duplicate of this bug. ***
Comment 18 Maxim Monastirsky 2013-09-08 11:26:11 UTC
*** Bug 67902 has been marked as a duplicate of this bug. ***
Comment 19 Maxim Monastirsky 2013-09-08 11:28:54 UTC
*** Bug 68941 has been marked as a duplicate of this bug. ***
Comment 20 Maxim Monastirsky 2013-09-08 11:34:02 UTC
*** Bug 69032 has been marked as a duplicate of this bug. ***
Comment 21 Maxim Monastirsky 2013-09-09 14:51:12 UTC
*** Bug 68826 has been marked as a duplicate of this bug. ***
Comment 22 Eike Rathke 2013-09-10 09:37:14 UTC
*** Bug 69163 has been marked as a duplicate of this bug. ***
Comment 23 Eike Rathke 2013-09-10 09:41:39 UTC
Just to mention a workaround until 4.1.2 is out:

You could rename the file to .html extension and in the file open dialog explicitly select the "HTML Document (Calc)" file type somewhere in the middle of that list after the Microsoft Excel types. Note that if you don't select the file type an attempt is made to open the file in Writer instead.
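
If many files are affected, the renaming half of this workaround could be scripted. The sketch below is only a convenience under stated assumptions: copy_as_html is a hypothetical helper, and the check for an <html> or <table> tag near the start of the file is a simple heuristic, not what LibreOffice does. The file type still has to be selected by hand in the file open dialog.

```python
import shutil
from pathlib import Path


def copy_as_html(xls_path):
    """Copy an HTML-in-.xls file next to itself with an .html extension.

    Returns the new path, or None when the content does not look like HTML
    (for example a real binary .xls file or an XML file).
    """
    source = Path(xls_path)
    with source.open("rb") as handle:
        head = handle.read(1024).lower()
    # Assumed heuristic: an <html> or <table> tag near the start of the file.
    if b"<html" not in head and b"<table" not in head:
        return None
    target = source.with_suffix(".html")
    shutil.copyfile(source, target)  # the original file is left untouched
    return target
```

Calling copy_as_html("report.xls") on a file like the first attachment would leave report.xls in place and create report.html beside it; the file name here is just an example.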
Comment 24 Ruslan Fatakhov 2013-09-10 11:19:28 UTC
Created attachment 85546 [details]
Result open file like html Calc
Comment 25 Ruslan Fatakhov 2013-09-10 11:23:16 UTC
Hello, I tried opening the file as you suggested, but LO Calc opens it as XML code. See the attachment "Result open file like html Calc".

(In reply to comment #23)
> Just to mention a workaround until 4.1.2 is out:
> 
> You could rename the file to .html extension and in the file open dialog
> explicitly select the "HTML Document (Calc)" file type somewhere in the
> middle of that list after the Microsoft Excel types. Note that if you don't
> select the file type an attempt is made to open the file in Writer instead.

Comment 26 Eike Rathke 2013-09-10 23:10:01 UTC
Then that file is not an HTML file but an XML file, and of course opening it explicitly as HTML does not produce the results you expect.
Comment 27 Maxim Monastirsky 2013-09-12 11:16:44 UTC
*** Bug 69263 has been marked as a duplicate of this bug. ***
Comment 28 Maxim Monastirsky 2013-09-22 14:35:44 UTC
*** Bug 69677 has been marked as a duplicate of this bug. ***
Comment 29 Maxim Monastirsky 2013-09-22 14:55:43 UTC
*** Bug 69031 has been marked as a duplicate of this bug. ***
Comment 30 Mike Kaganski 2013-09-23 04:48:45 UTC
*** Bug 67478 has been marked as a duplicate of this bug. ***
Comment 31 Mike Kaganski 2013-09-23 04:56:17 UTC
*** Bug 61546 has been marked as a duplicate of this bug. ***
Comment 32 Mike Kaganski 2013-09-23 05:01:21 UTC
*** Bug 65980 has been marked as a duplicate of this bug. ***