Bug 46233 - Some xls files with html content is opened incorrectly
Summary: Some xls files with html content is opened incorrectly
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.5.0 release (earliest affected)
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
QA Contact:
Duplicates: 50046
Depends on:
Reported: 2012-02-17 07:33 UTC by Mikhail Vladimirov
Modified: 2016-04-17 16:14 UTC
CC List: 5 users

See Also:
Crash report or crash signature:

Bank's xls info file (29.25 KB, application/vnd.ms-excel)
2012-02-17 07:33 UTC, Mikhail Vladimirov
test file by gracz@npsh.hu (1.75 KB, application/vnd.ms-excel)
2014-07-07 12:55 UTC, Gergely Rácz
How the file is opened in LibreOffice 5 (96.82 KB, image/png)
2016-02-27 08:28 UTC, Nicola Ruggero
How the file is opened in MS Excel 2010 (57.72 KB, image/png)
2016-02-27 08:29 UTC, Nicola Ruggero

Description Mikhail Vladimirov 2012-02-17 07:33:30 UTC
Created attachment 57213
Bank's xls info file

Just look into attached file!
Open it in MS Word 2003 and do the same in LibreOffice 3.5!
See the great difference :-(((
Comment 1 Mikhail Vladimirov 2012-02-17 07:37:32 UTC
Seems that everything looks good but it is not what I expected from 3.5 release!
Comment 2 Rainer Bielefeld Retired 2012-05-08 07:02:22 UTC
"Does not look good" is too vague a description.

With a WIN 3.5.0 RC I see a rather useful spreadsheet (see screenshot comparison); LibreOffice (RC2) German UI/Locale [Build-ID: 235ab8a-3802056-4a8fed3-2d66ea8-e241b80] on German WIN7 Home Premium (64bit) shows text source with lots of HTML tags.

Thank you for your report. Unfortunately, all relevant information is missing.
Maybe the hints on <http://wiki.documentfoundation.org/BugReport> will help you to find out what information will be useful to reproduce your problem? If you believe that is really too complicated, please ask for help on a user mailing list.
- Write a meaningful Summary describing exactly what the problem is
- Attach screenshots with comments comparing the expected view and the view in
  your LibO version. The best way is to insert your screenshots
  into a DRAW document and to add comments that explain what you want to show
- Contribute step-by-step instructions containing every key press and every
  mouse click needed to reproduce your problem (see the example in Bug 43431)
- If possible, contribute instructions on how to create a sample document
  from scratch
- add information 
  -- what EXACTLY is unexpected
  -- and WHY do you believe it's unexpected (cite Help or Documentation!)
  -- concerning your PC
  -- concerning your OS (Version, Distribution, Language)
  -- concerning your LibO version (with Build ID if it's not a public release)
     and localization (UI language, Locale setting)
  -- LibO settings that might be related to your problems
  -- how you launch LibO and how you opened the sample document
  -- If you can contribute an OOo Issue that might be useful
  -- everything else crossing your mind after you read linked texts

Even if you cannot provide all the requested information, every bit of new information might bring the breakthrough. Is your problem the one visible in the screenshots?

Maybe you can try <https://www.libreoffice.org/get-help/bug/> for submitting bug reports?

Please file bug reports with status UNCONFIRMED if you are not absolutely sure that you contributed all required background information, that the problem will be reproducible with the information you can provide, or that your enhancement request will be accepted! Thank you!
Comment 3 Björn Michaelsen 2012-05-09 18:27:21 UTC
The user expects an xls file containing HTML data not to be read as CSV. I would assume that defaulting to import as CSV in Calc is sane in cases where the file could contain multiple forms of data (while storing HTML in a .xls file clearly isn't).

Dropping importance and severity for a pathological corner case and adding the regression keyword.

CC'ing erack, who may think of an even more sophisticated DWIM logic for a further 3.5 release, or close this as WONTFIX at his own judgment.
Comment 4 Eike Rathke 2012-05-10 05:28:40 UTC
Setting the correct release version (here regression in 3.5.3) sometimes helps the developer to not waste time ...
Comment 5 Not Assigned 2012-05-10 07:23:05 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":


resolved fdo#46233 value >12 with AM/PM can't be clock time
Comment 6 Eike Rathke 2012-05-10 07:32:55 UTC
Damn, that was a fix for bug 47149 instead.
Comment 7 Rainer Bielefeld Retired 2012-05-10 07:36:03 UTC
"Bug 49639 - FILEOPEN html content .xls files shows text.csv with html tags instead of Spreadsheet contents" seems to be a DUP?!
Comment 8 Eike Rathke 2012-05-10 08:05:08 UTC
*** Bug 49639 has been marked as a duplicate of this bug. ***
Comment 9 Eike Rathke 2012-05-10 08:58:41 UTC
Actually bug 49639 is not a duplicate of this.
1. The submitter talks about difference between MS Word 2003 and
   LibreOffice 3.5, so this does not seem to be Calc.
2. Submitted on 2012-02-17 it can't be the regression of bug 49639
   introduced with 3.5.3
3. Bjoern's comment #3 is a wrong assumption then because in 3.5.0 the
   file wasn't imported as CSV.

=> Submitter should clarify what he actually expected.
Removing regression keyword, setting NEEDINFO, back to default owner.
Comment 10 Rainer Bielefeld Retired 2012-05-20 09:33:40 UTC
*** Bug 50046 has been marked as a duplicate of this bug. ***
Comment 11 Andrej 2012-05-31 01:26:45 UTC
I have the same problem.
My spreadsheet is generated as HTML on a web server and is given the extension .xls.
This file can be opened automatically by MS Office, OpenOffice, and LibreOffice versions before 3.5.3. LibreOffice 3.5.3 handles these files as plain text files.
Historically, spreadsheet applications can save spreadsheets in HTML format and automatically open these files if they are given the .xls extension.
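
To illustrate the distinction described above, here is a minimal sketch of telling a real binary .xls file apart from HTML that merely carries the .xls extension. It is not LibreOffice's actual type-detection code; the helper name `sniff_xls` and the list of HTML markers are assumptions made for this example. The only fixed fact used is the 8-byte signature that binary .xls files (OLE2 compound documents) start with.

```python
# Signature at the start of every OLE2 compound document (binary .xls).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
UTF8_BOM = b"\xef\xbb\xbf"

# Hypothetical, deliberately short list of markers that suggest HTML content.
HTML_MARKERS = (b"<html", b"<!doctype html", b"<table")


def sniff_xls(data: bytes) -> str:
    """Classify the leading bytes of a file named *.xls (illustrative helper)."""
    if data.startswith(OLE2_MAGIC):
        return "binary-xls"
    head = data[:1024]
    if head.startswith(UTF8_BOM):
        # Skip a UTF-8 byte order mark before looking for markup.
        head = head[len(UTF8_BOM):]
    head = head.lstrip().lower()
    if head.startswith(HTML_MARKERS):
        return "html"
    return "unknown"
```

A caller could read the first kilobyte of the file in binary mode and pass it to `sniff_xls` to decide whether an HTML import is the more plausible choice than a plain-text one.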
Comment 12 Eike Rathke 2012-06-01 02:45:26 UTC
Please don't change the Version field to newer, it indicates in which version the problem was first perceived.

Andrej, it seems your problem is a different one: since you indicated that it worked for versions <3.5.3, it sounds pretty much like bug 49639. Please check whether release 3.5.4 fixes that for you.
Comment 13 Gergely Rácz 2014-07-07 12:54:23 UTC
I have experienced the same or similar problem. I have a generated .xls file with HTML content (see attached file: test_gracz.xls).

- OS: win7
- LibO:

Steps to reproduce:
1. Open the attached file (test_gracz.xls)
2. Choose any one of the options (which one is irrelevant) in the "Import option" pop-up window.

A few random characters are generated in cell A1.

Expected result:
LibO should import the file correctly.
Comment 14 Gergely Rácz 2014-07-07 12:55:14 UTC
Created attachment 102372
test file by gracz@npsh.hu
Comment 15 Maxim Monastirsky 2014-07-07 13:52:27 UTC
(In reply to comment #13)
> I have experienced the same or similar problem.
No, that's another problem. Please open a new bug for it.

> Result:
> A few random characters are generated in cell A1.
Those aren't "random" characters, but the UTF-8 BOM (See http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8).
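
As a small illustration of why the BOM can show up as stray characters, the sketch below decodes the same bytes three ways. The sample payload is made up for this example and is not taken from the attached file; `utf-8-sig` and `cp1252` are standard Python codec names.

```python
import codecs

# Made-up sample: a UTF-8 BOM followed by a tiny HTML table.
raw = codecs.BOM_UTF8 + b"<table><tr><td>1</td></tr></table>"

# Read with a single-byte Windows code page, the three BOM bytes
# (EF BB BF) turn into three visible characters.
mojibake = raw[:3].decode("cp1252")

# Plain "utf-8" keeps the BOM as the invisible character U+FEFF.
plain = raw.decode("utf-8")

# "utf-8-sig" strips the BOM while decoding.
clean = raw.decode("utf-8-sig")

print(repr(mojibake), repr(plain[0]), clean[:7])
```

An importer that treats the file as plain text in a legacy code page would put the equivalent of `mojibake` at the start of the first cell.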

Actually this bug should be in NEEDINFO status, because the original reporter didn't answer Eike's question (comment 9).
Comment 16 QA Administrators 2015-02-19 04:33:53 UTC
Dear Bug Submitter,

This bug has been in NEEDINFO status with no change for at least 6 months. Please provide the requested information as soon as possible and mark the bug as UNCONFIRMED. Due to regular bug tracker maintenance, if the bug is still in NEEDINFO status with no change in 30 days the QA team will close the bug as INVALID due to lack of needed information.

For more information about our NEEDINFO policy please read the wiki located here: 

If you have already provided the requested information, please mark the bug as UNCONFIRMED so that the QA team knows that the bug is ready to be confirmed.

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

Message generated on: 2015-02-18
Comment 17 QA Administrators 2015-04-01 14:51:41 UTC
Dear Bug Submitter,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INVALID due to inactivity and
a lack of information which is needed in order to accurately
reproduce and confirm the problem. We encourage you to retest
your bug against the latest release. If the issue is still
present in the latest stable release, we need the following
information (please ignore any that you've already provided):

a) Provide details of your system including your operating
   system and the latest version of LibreOffice that you have
   confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED
and we will attempt to reproduce the issue. Please do not:

a) respond via email 

b) update the version field in the bug or any of the other details
   on the top section of our bug tracker

-- The LibreOffice QA Team
This NEEDINFO Message was generated on: 2015-04-01

Warm Regards,
QA Team
Comment 18 Nicola Ruggero 2016-02-27 08:25:34 UTC
the issue is still present in LO:

We expect LO to open those files without the import dialog, as for files like plain_text/html/csv/etc.

I will post the related screenshot
Comment 19 Nicola Ruggero 2016-02-27 08:28:10 UTC
Created attachment 123025
How the file is opened in LibreOffice 5

Here is how the file is opened in LibreOffice 5.
An import dialog appears first, then it shows the content as plain text.
The rendered layout and text format are completely different from MS Excel 2010.
Comment 20 Nicola Ruggero 2016-02-27 08:29:06 UTC
Created attachment 123026
How the file is opened in MS Excel 2010

Here is how the file is opened in MS Excel 2010.
Comment 21 Mikhail Vladimirov 2016-02-27 16:35:22 UTC
Still the same ugly picture!
Comment 22 Mikhail Vladimirov 2016-03-03 12:24:13 UTC
Still Reproducible!


Comment 23 mahfiaz 2016-04-17 16:14:16 UTC
The bank which saves HTML files and names them .xls is incompetent, although it's the lazy way to tell Excel to open it.
What puzzles me is why it gets opened in Word (original reporter in first comment).

Now if you really want to use that file, rename it to .html and open it with Writer (or skip the renaming part). If you for whatever reason really need the data in Calc, copy it over.

I'd say still the same ugly file.