Bug 88701 - FILEOPEN: Problem with import of HTML/ MHTML files with extension XLS on Windows
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 106856
Depends on:
Blocks: HTML-Import
Reported: 2015-01-22 12:23 UTC by Ajay Pal Singh Atwal
Modified: 2020-07-16 09:47 UTC (History)

See Also:
Crash report or crash signature:


Attachments
HTML file with MIME header with XLS extension (8.30 KB, application/vnd.ms-excel)
2015-01-22 12:23 UTC, Ajay Pal Singh Atwal
File in LibreOffice 3.3.0 on Mac OSX (914.48 KB, image/png)
2016-09-20 11:47 UTC, Ajay Pal Singh Atwal
File in LibreOffice 5.2.1.4 on Mac OSX (853.55 KB, image/png)
2016-09-20 11:56 UTC, Ajay Pal Singh Atwal

Description Ajay Pal Singh Atwal 2015-01-22 12:23:56 UTC
Created attachment 112661 [details]
HTML file with MIME header with XLS extension

A particular brand of ERP software is used for generating reports in our organization. It has an "Export to Excel" button, and from the web-based interface of the ERP the report can be downloaded as an XLS spreadsheet.
The XLS file is actually an HTML file with an XLS extension.

The file contains a MIME header followed by HTML tags for tables etc. An example file is attached.

When this file is opened on Ubuntu GNU/Linux with LibreOffice 4.2.7.2, it can be imported as a Calc table. A minor annoyance is the MIME header at the top of the file.

On Windows (64-bit Windows 7; I have tried both 4.3 and 4.4 RC1), the import dialog displays the file contents as HTML text and prompts to import it as a TAB- or SPACE-delimited text file, rendering the imported data unusable.

After renaming the file to .html, it can be imported as a Calc table, again with the minor annoyance of the MIME header at the top.

There are other bug reports about broken HTML saved as XLS; this one differs in that the file has MIME headers before the start of the HTML.
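
One way to make the result independent of the file extension is to look at the first bytes of the file instead of its name. The following is only a minimal Python sketch of that idea, not a description of how LibreOffice's type detection works; the function name `sniff_spreadsheet_payload` and the labels it returns are hypothetical. It relies on the 8-byte signature of the OLE2 compound file container that binary XLS files start with.

```python
# Signature at the start of every OLE2 compound file (binary .xls container).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_spreadsheet_payload(data: bytes) -> str:
    """Guess what a file named *.xls really contains (hypothetical helper)."""
    head = data[:4096]
    if head.startswith(OLE2_MAGIC):
        return "xls-binary"

    # latin-1 maps every byte to a character, so decoding cannot fail.
    text = head.decode("latin-1").lower()

    # An MHTML export starts with MIME headers before any markup.
    if text.lstrip().startswith("mime-version:") or "content-type: multipart/related" in text:
        return "mhtml"
    if "<html" in text or "<table" in text:
        return "html"
    return "unknown"
```

A caller could use the returned label to choose the HTML import path for "mhtml" and "html" instead of falling back to the delimited-text dialog.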
Comment 1 Buovjaga 2015-03-06 12:28:49 UTC
Tries to import as CSV both in Linux & Windows.

Win 7 Pro 64-bit, LibO Version: 4.4.1.2
Build ID: 45e2de17089c24a1fa810c8f975a7171ba4cd432
Locale: fi_FI

Ubuntu 14.10 64-bit 
Version: 4.4.1.2
Build ID: 40m0(Build:2)
Locale: en_US
Comment 2 Yousuf Philips (jay) (retired) 2015-03-06 12:56:26 UTC
Hello Ajay,

As Excel 2003 and 2013 are able to import it correctly by stripping away the MIME information, it would be good for Calc to do the same.
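
To illustrate what stripping away the MIME information could look like, here is a small Python sketch built on the standard library `email` package. It assumes the file is a well-formed MIME message whose HTML lives in a `text/html` (or `application/xhtml+xml`) part; the helper name `extract_html` is hypothetical, and this is not a claim about how Excel or Calc do it internally.

```python
import email
from email import policy
from typing import Optional


def extract_html(mhtml_bytes: bytes) -> Optional[str]:
    """Return the first HTML part of a MIME/MHTML file, or None if absent."""
    msg = email.message_from_bytes(mhtml_bytes, policy=policy.default)
    # walk() visits the message itself and then every nested part.
    for part in msg.walk():
        if part.get_content_type() in ("text/html", "application/xhtml+xml"):
            # decode=True undoes base64 / quoted-printable transfer encoding.
            payload = part.get_payload(decode=True)
            if payload is None:
                continue
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None
```

The returned string contains only the markup, so the MIME header lines would no longer show up at the top of the imported sheet. Embedded images and other parts are simply ignored in this sketch.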
Comment 3 Ajay Pal Singh Atwal 2015-03-08 21:39:10 UTC
In case it helps, the ERP software from which such files can be exported is SAP EP/BI.
Comment 4 Ajay Pal Singh Atwal 2015-05-15 06:16:26 UTC
The file is in MHTML format, not plain HTML.
See: http://en.wikipedia.org/wiki/MHTML

Someone from our organisation asked the ERP vendor's support, and this is their terse response:
-----------
Libre Office is not supported by SAP
-----------
The support person was able to find ****only one note**** about Libre Office, assumed that it is not being used, and hence concluded that it is not supported. (Rolling Eyes)

In another related communication, SAP notes 1517552 and 1178858 were referred to.

Relevant section of note 1517552 reproduced below:
-------------------------------------------------
During the 'Export to Excel' and 'Export to Excel 2000' functions, the file generated is internally an MHTML file, while the file extension is 'set' to .xls during the export


Relevant section of note 1178858 reproduced below:
-------------------------------------------------
The export to Excel function is supported as of Excel 2003. It generates an XHTML file in the Multi Mime format. This means that Mimes (for example, icons and screens) are stored in the file.


Also note that from MS Excel 2007 onwards a warning is displayed for such files, saying that the file is not in the correct format, but Excel is still able to import MHTML files.
See: https://support.microsoft.com/en-us/kb/948615#top
Comment 5 Ajay Pal Singh Atwal 2015-05-15 06:46:04 UTC
This also seems relevant:
https://bz.apache.org/ooo/show_bug.cgi?id=101436
Comment 6 QA Administrators 2016-09-20 09:42:11 UTC Comment hidden (obsolete)
Comment 7 Ajay Pal Singh Atwal 2016-09-20 11:47:50 UTC
Created attachment 127459 [details]
File in LibreOffice 3.3.0 on Mac OSX

This is how the file looks when opened in LibreOffice 3.3.0 on Mac OSX 10.11.6
Comment 8 Ajay Pal Singh Atwal 2016-09-20 11:56:05 UTC
Created attachment 127460 [details]
File in LibreOffice 5.2.1.4 on Mac OSX

File when opened in LibreOffice 5.2.1.4 on Mac OSX
Comment 9 Buovjaga 2017-11-10 13:41:14 UTC
*** Bug 106856 has been marked as a duplicate of this bug. ***
Comment 10 QA Administrators 2018-11-11 03:46:57 UTC
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug