Bug 88701 - FILEOPEN: Problem with import of HTML/ MHTML files with extension XLS on Windows
Status: RESOLVED DUPLICATE of bug 83601
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: HTML-Import
Reported: 2015-01-22 12:23 UTC by Ajay Pal Singh Atwal
Modified: 2022-03-14 10:58 UTC (History)

See Also:
Crash report or crash signature:


Attachments
HTML file with MIME header with XLS extension (8.30 KB, application/vnd.ms-excel)
2015-01-22 12:23 UTC, Ajay Pal Singh Atwal
File in LibreOffice 3.3.0 on Mac OSX (914.48 KB, image/png)
2016-09-20 11:47 UTC, Ajay Pal Singh Atwal
File in LibreOffice 5.2.1.4 on Mac OSX (853.55 KB, image/png)
2016-09-20 11:56 UTC, Ajay Pal Singh Atwal
Prinstcreen from MS Office 2016 (19.10 KB, image/png)
2021-03-01 09:29 UTC, Svatopluk Vít

Description Ajay Pal Singh Atwal 2015-01-22 12:23:56 UTC
Created attachment 112661 [details]
HTML file with MIME header with XLS extension

A particular brand of ERP software is being used for generating reports in our organizations. It has an "Export to Excel" button: in the web-based interface of the ERP, a report can be downloaded as an XLS spreadsheet.
The XLS file is actually an HTML file with an XLS extension.

The file contains a MIME header followed by HTML tags for tables etc. An example file is attached.

When opened on Ubuntu GNU/Linux with LibreOffice 4.2.7.2, this file can be imported as a Calc table. A minor annoyance is the MIME header at the top of the file.

On Windows (64-bit Windows 7; I have tried both 4.3 and 4.4 RC1), the import dialog displays the file contents as HTML text and prompts to import it as a tab- or space-delimited text file, rendering the imported data unusable.

After renaming the file to .html, it can be imported as a Calc table, with the same minor annoyance of the MIME header on top.

There were other bug reports about broken HTML saved as XLS; this one has MIME headers before the start of the HTML.
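
To illustrate why the extension alone is misleading here, the following is a minimal sketch of content-based detection. It is not how LibreOffice's filter detection works; the function name `sniff_format` and the returned labels are hypothetical, and the text checks are rough heuristics. The only hard fact used is the 8-byte signature of OLE2 compound files, which genuine binary .xls files start with.

```python
# Signature at the start of every OLE2 compound file (real binary .xls).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_format(head: bytes) -> str:
    """Guess the real format from the first bytes of a file named *.xls.

    Returns one of the hypothetical labels
    "xls-binary", "mhtml", "html" or "unknown".
    """
    if head.startswith(OLE2_MAGIC):
        return "xls-binary"

    # Ignore a UTF-8 byte order mark and leading whitespace, compare lowercase.
    text = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()

    # Heuristic (assumption): MHTML starts with mail-style MIME headers.
    if text.startswith(b"mime-version:") or b"content-type: multipart/related" in text:
        return "mhtml"

    # Heuristic (assumption): plain HTML starts with a doctype or <html> tag,
    # or at least contains a table near the top.
    if text.startswith((b"<!doctype html", b"<html")) or b"<table" in text:
        return "html"

    return "unknown"
```

Calling `sniff_format(open(path, "rb").read(4096))` on a file like the attached one would be expected to return "mhtml" rather than "xls-binary", provided the MIME header looks the way the heuristic assumes.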
Comment 1 Buovjaga 2015-03-06 12:28:49 UTC
Calc tries to import the file as CSV on both Linux and Windows.

Win 7 Pro 64-bit, LibO Version: 4.4.1.2
Build ID: 45e2de17089c24a1fa810c8f975a7171ba4cd432
Locale: fi_FI

Ubuntu 14.10 64-bit 
Version: 4.4.1.2
Build ID: 40m0(Build:2)
Locale: en_US
Comment 2 Yousuf Philips (jay) (retired) 2015-03-06 12:56:26 UTC
Hello Ajay,

As Excel 2003 and 2013 are able to import it correctly by stripping away the MIME information, it would be good for Calc to do the same.
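
As a user-side workaround until Calc does this, the MIME wrapper can be stripped with a short script. The sketch below uses Python's standard `email` package and assumes the exported file is a well-formed MIME message containing a text/html part; the function name `extract_html` is hypothetical, and this is not how Excel or Calc handle the file internally.

```python
from email import message_from_bytes, policy


def extract_html(raw: bytes) -> str:
    """Return the decoded text/html part of a MIME-wrapped (MHTML) file."""
    msg = message_from_bytes(raw, policy=policy.default)

    # walk() visits the message itself and then every nested part, so this
    # covers both multipart/related files and a single text/html body.
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes the transfer encoding (for example
            # quoted-printable) and decodes using the declared charset.
            return part.get_content()

    raise ValueError("no text/html part found")
```

The returned string can be written to a file with an .html extension, which Calc imports as a table according to the description above.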
Comment 3 Ajay Pal Singh Atwal 2015-03-08 21:39:10 UTC
In case it helps, the ERP software from which such files can be exported is SAP EP/BI.
Comment 4 Ajay Pal Singh Atwal 2015-05-15 06:16:26 UTC
The file is in MHTML format, not HTML.
See: http://en.wikipedia.org/wiki/MHTML

Someone from our organisation asked the ERP vendor's support, and this is their terse response:
-----------
Libre Office is not supported by SAP
-----------
The support person was able to find ****only one note**** about Libre Office, and assumed it is not being used and hence not supported. (Rolling Eyes)

In another related communication, SAP notes 1517552 and 1178858 were referred to.

Relevant section of note 1517552 reproduced below:
-------------------------------------------------
During the 'Export to Excel' and 'Export to Excel 2000' functions, the file generated is internally an MHTML file, while the file extension is 'set' to .xls during the export


Relevant section of note 1178858 reproduced below:
-------------------------------------------------
The export to Excel function is supported as of Excel 2003. It generates an XHTML file in the Multi Mime format. This means that Mimes (for example, icons and screens) are stored in the file.


Also note that from MS Excel 2007 onwards, a warning is displayed for such files saying that the file is not in the correct format, but Excel is still able to import MHTML files.
See: https://support.microsoft.com/en-us/kb/948615#top
Comment 5 Ajay Pal Singh Atwal 2015-05-15 06:46:04 UTC
This also seems relevant:
https://bz.apache.org/ooo/show_bug.cgi?id=101436
Comment 6 QA Administrators 2016-09-20 09:42:11 UTC Comment hidden (obsolete)
Comment 7 Ajay Pal Singh Atwal 2016-09-20 11:47:50 UTC
Created attachment 127459 [details]
File in LibreOffice 3.3.0 on Mac OSX

This is how this file looks when opened in LibreOffice 3.3.0 on Mac OSX 10.11.6
Comment 8 Ajay Pal Singh Atwal 2016-09-20 11:56:05 UTC
Created attachment 127460 [details]
File in LibreOffice 5.2.1.4 on Mac OSX

File when opened in LibreOffice 5.2.1.4 on Mac OSX
Comment 9 Buovjaga 2017-11-10 13:41:14 UTC
*** Bug 106856 has been marked as a duplicate of this bug. ***
Comment 10 QA Administrators 2018-11-11 03:46:57 UTC Comment hidden (obsolete)
Comment 11 Svatopluk Vít 2021-03-01 09:29:23 UTC
Created attachment 170144 [details]
Prinstcreen from MS Office 2016

This is a printscreen of the file imported into MSO 2016.
Comment 12 Svatopluk Vít 2021-03-01 09:30:00 UTC
The bug is still present.

Version: 7.1.1.1 (x64) / LibreOffice Community
Build ID: 575c5867c4cc13d7ae78f9ce39a54a52ed38c769
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Vulkan; VCL: win
Locale: cs-CZ (cs_CZ); UI: cs-CZ
Calc: threaded
Comment 13 Mike Kaganski 2022-03-14 10:58:56 UTC

*** This bug has been marked as a duplicate of bug 83601 ***