Bug 46233 - Some xls files with html content is opened incorrectly
Summary: Some xls files with html content is opened incorrectly
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.5.0 release (earliest affected)
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 50046
Depends on:
Blocks:
 
Reported: 2012-02-17 07:33 UTC by Mikhail Vladimirov
Modified: 2017-11-10 13:40 UTC (History)
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
Bank's xls info file (29.25 KB, application/vnd.ms-excel)
2012-02-17 07:33 UTC, Mikhail Vladimirov
test file by gracz@npsh.hu (1.75 KB, application/vnd.ms-excel)
2014-07-07 12:55 UTC, Gergely Rácz
How the file is opened in LibreOffice 5 (96.82 KB, image/png)
2016-02-27 08:28 UTC, Nicola Ruggero
How the file is opened in MS Excel 2010 (57.72 KB, image/png)
2016-02-27 08:29 UTC, Nicola Ruggero

Description Mikhail Vladimirov 2012-02-17 07:33:30 UTC
Created attachment 57213 [details]
Bank's xls info file

Hi!
Just look into attached file!
Open it in MS Word 2003 and do the same in LibreOffice 3.5!
See the great difference :-(((
Comment 1 Mikhail Vladimirov 2012-02-17 07:37:32 UTC
Seems that everything looks good but it is not what I expected from 3.5 release!
Comment 2 Rainer Bielefeld Retired 2012-05-08 07:02:22 UTC
"Does not look good" is too vague a description.

With a WIN 3.5.0 RC I see a rather useful spreadsheet (see screenshot comparison), whereas "LibreOffice 3.5.3.2 (RC2) German UI/Locale [Build-ID: 235ab8a-3802056-4a8fed3-2d66ea8-e241b80]" on German WIN7 Home Premium (64bit) shows text source with lots of HTML tags.

@reporter:
Thank you for your report, but unfortunately all relevant information is missing.
Maybe the hints on <http://wiki.documentfoundation.org/BugReport> will help you to find out what information will be useful to reproduce your problem? If you believe that is really too sophisticated, please ask for help on a user mailing list.
Please:
- Write a meaningful Summary describing exactly what the problem is
- Attach screenshots with comments comparing the expected view and the view in
  your LibO version. The best way is to insert your screenshots
  into a DRAW document and to add comments that explain what you want to show
- Contribute a step by step instruction containing every key press and every 
  mouse click how to reproduce your problem (like the example in Bug 43431)
- if possible contribute an instruction how to create a sample document 
  from scratch
- add information 
  -- what EXACTLY is unexpected
  -- and WHY do you believe it's unexpected (cite Help or Documentation!)
  -- concerning your PC
  -- concerning your OS (Version, Distribution, Language)
  -- concerning your LibO version (with Build ID if it's not a public release)
     and localization (UI language, Locale setting)
  -- LibO settings that might be related to your problems 
  -- how you launch LibO and how you opened the sample document
  -- If you can contribute an OOo Issue that might be useful
  -- everything else crossing your mind after you read linked texts

Even if you can not provide all demanded information, every little new information might bring the breakthrough. Is your problem the one visible in screenshots?

Maybe you can try <https://www.libreoffice.org/get-help/bug/> for submitting bug reports?

Please file bug reports with status UNCONFIRMED if you are not absolutely sure that you contributed all required background information, that the problem will be reproducible with the information you can provide, or that your enhancement request will be accepted! Thank you!
Comment 3 Björn Michaelsen 2012-05-09 18:27:21 UTC
User expects an xls file containing HTML data not to be read as CSV. I would assume that defaulting to import as CSV in Calc is sane in cases where the file could contain multiple forms of data (while storing HTML in a .xls file clearly isn't).

Dropping importance and severity for a pathological corner case and adding regression keyword.

CC'ing erack for maybe thinking of an even more sophisticated DWIM logic for a further 3.5 release, or closing as WONTFIX at his own judgment.
Comment 4 Eike Rathke 2012-05-10 05:28:40 UTC
Setting the correct release version (here regression in 3.5.3) sometimes helps the developer to not waste time ...
Comment 5 Not Assigned 2012-05-10 07:23:05 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=803b5513eff8f8c185a91e91aee235dfab38d3bc

resolved fdo#46233 value >12 with AM/PM can't be clock time
Comment 6 Eike Rathke 2012-05-10 07:32:55 UTC
Damn, that was a fix for bug 47149 instead.
Comment 7 Rainer Bielefeld Retired 2012-05-10 07:36:03 UTC
@Eike:
"Bug 49639 - FILEOPEN html content .xls files shows text.csv with html tags instead of Spreadsheet contents" seems to be a DUP?!
Comment 8 Eike Rathke 2012-05-10 08:05:08 UTC
*** Bug 49639 has been marked as a duplicate of this bug. ***
Comment 9 Eike Rathke 2012-05-10 08:58:41 UTC
Actually bug 49639 is not a duplicate of this.
1. The submitter talks about difference between MS Word 2003 and
   LibreOffice 3.5, so this does not seem to be Calc.
2. Submitted on 2012-02-17 it can't be the regression of bug 49639
   introduced with 3.5.3
3. Bjoern's comment #3 is a wrong assumption then because in 3.5.0 the
   file wasn't imported as CSV.

=> Submitter should clarify what he actually expected.
Removing regression keyword, setting NEEDINFO, back to default owner.
Comment 10 Rainer Bielefeld Retired 2012-05-20 09:33:40 UTC
*** Bug 50046 has been marked as a duplicate of this bug. ***
Comment 11 Andrej 2012-05-31 01:26:45 UTC
Have the same problem.
My spreadsheet is generated as HTML on web server, it is given extension .xls.
This file can be automatically opened by MS Office, OpenOffice, and LibreOffice < 3.5.3. LibreOffice 3.5.3 handles these files as plain text files.
Historically, spreadsheet applications can save spreadsheets in HTML format, and automatically open these files, if they are given .xls extension.
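
The "HTML saved with an .xls extension" case described above can be told apart from a real binary .xls by looking at the first bytes of the file rather than at its name. The following is only a minimal sketch of that idea, not LibreOffice's actual type detection; the function name `sniff_xls_bytes` and the three labels it returns are hypothetical. The only hard fact it relies on is the 8-byte signature that starts every OLE2 compound file, which is the container used by binary .xls.

```python
# Signature at the start of every OLE2 compound file (the binary .xls container).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
UTF8_BOM = b"\xef\xbb\xbf"


def sniff_xls_bytes(head: bytes) -> str:
    """Classify the leading bytes of a file named *.xls (hypothetical helper).

    Returns "binary-xls", "html", or "text".
    """
    if head.startswith(OLE2_MAGIC):
        return "binary-xls"
    # Skip an optional UTF-8 BOM and leading whitespace before looking for markup.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    start = head.lstrip().lower()
    if start.startswith((b"<!doctype html", b"<html", b"<table")):
        return "html"
    # Anything else is treated as plain text (e.g. CSV) in this sketch.
    return "text"
```

In practice one would read the first few hundred bytes of the file in binary mode and pass them to the helper. Real files can start with other markup (comments, an XML declaration), which this sketch deliberately does not try to cover.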
Comment 12 Eike Rathke 2012-06-01 02:45:26 UTC
Please don't change the Version field to newer, it indicates in which version the problem was first perceived.

Andrej, it seems your problem is a different one. As you indicated it worked for versions <3.5.3, it sounds pretty much like bug 49639. Please check if release 3.5.4 fixes that for you.
Comment 13 Gergely Rácz 2014-07-07 12:54:23 UTC
I have experienced the same or similar problem. I have a generated .xls file with HTML content (see attached file: test_gracz.xls).

Environment:
- OS: win7
- LibO: 4.2.5.2

Steps to reproduce:
1. Open the attached file (test_gracz.xls)
2. Choose any one of the options (which one is irrelevant) in the "Import options" pop-up window.

Result:
Few random character is generated in cell A1.

Expected result:
LibO should import the file correctly.
Comment 14 Gergely Rácz 2014-07-07 12:55:14 UTC
Created attachment 102372 [details]
test file by gracz@npsh.hu
Comment 15 Maxim Monastirsky 2014-07-07 13:52:27 UTC
(In reply to comment #13)
> I have experienced the same or similar problem.
No, that's another problem. Please open a new bug for it.

> Result:
> Few random character is generated in cell A1.
Those aren't "random" characters, but the UTF-8 BOM (See http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8).
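
To illustrate how a UTF-8 BOM can end up looking like stray characters, here is a small sketch. It assumes the importer decoded the file with a single-byte character set such as Latin-1; that is an assumption made for the demonstration, not something confirmed for this attachment. The helper `strip_utf8_bom` is hypothetical; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the Python standard library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# A tiny stand-in for a generated "xls" file that is really UTF-8 HTML with a BOM.
raw = codecs.BOM_UTF8 + "<table><tr><td>1</td></tr></table>".encode("utf-8")

# Decoded with a single-byte charset, the three BOM bytes become three visible characters.
as_latin1 = raw.decode("latin-1")[:3]

# The "utf-8-sig" codec consumes the BOM while decoding.
as_utf8_sig = raw.decode("utf-8-sig")
```

Here `as_latin1` holds the three characters U+00EF, U+00BB, U+00BF, while `as_utf8_sig` starts directly with `<table>`.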

Actually this bug should be in NEEDINFO status, because the original reporter didn't answer Eike's question (comment 9).
Comment 16 QA Administrators 2015-02-19 04:33:53 UTC Comment hidden (obsolete)
Comment 17 QA Administrators 2015-04-01 14:51:41 UTC Comment hidden (obsolete)
Comment 18 Nicola Ruggero 2016-02-27 08:25:34 UTC
Hello,
the issue is still present in LO: 5.0.5.2-2.fc23

We expect LO to open these files without the import dialog as for files like plain_text/html/csv/etc.

I will post the related screenshot
Comment 19 Nicola Ruggero 2016-02-27 08:28:10 UTC
Created attachment 123025 [details]
How the file is opened in LibreOffice 5

Here is how the file is opened in LibreOffice 5.
An import dialog appears first, then it shows the content as plain text.
The rendered layout and text formatting are completely different from MS Excel 2010.
Comment 20 Nicola Ruggero 2016-02-27 08:29:06 UTC
Created attachment 123026 [details]
How the file is opened in MS Excel 2010

Here is how the file is opened in MS Excel 2010.
Comment 21 Mikhail Vladimirov 2016-02-27 16:35:22 UTC
LibreOffice 5.1.0.3
Still the same ugly picture!
:-(
Comment 22 Mikhail Vladimirov 2016-03-03 12:24:13 UTC
Still Reproducible!

LibreOffice  5.1.1.2

:-(
Comment 23 mahfiaz 2016-04-17 16:14:16 UTC
The bank which saves HTML files and names them .xls is incompetent, although it's the lazy way to tell Excel to open it.
What puzzles me is why it gets opened in Word (original reporter in first comment).

Now if you really want to use that file, rename it to .html and open it with Writer (or skip the renaming part). If you for whatever reason really need the data in Calc, copy it over.

I'd say still the same ugly file.
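
As an alternative to copying the data over by hand, the table cells of such an HTML file could be extracted into CSV, which Calc imports directly. This is a minimal sketch using only the Python standard library; the class `TableRows` and the function `html_table_to_csv` are hypothetical names, and the sketch assumes simple, non-nested tables without rowspan/colspan handling.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of <td>/<th> cells, grouped by <tr> (no nested tables)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html: str) -> str:
    """Convert the table rows found in an HTML string to CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()
```

Cell formatting, merged cells and styles are lost this way, so it only helps when the plain values are what is needed.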
Comment 24 Buovjaga 2016-11-02 20:20:55 UTC
Correcting status to NEW.

Created bug 103663 for crashing problem.
Comment 25 QA Administrators 2017-11-03 08:04:51 UTC Comment hidden (obsolete)
Comment 26 Mikhail Vladimirov 2017-11-06 18:41:40 UTC
The Bug Is Still Present!

LibreOffice
Version: 5.4.2.2
Build ID: 1:5.4.2~rc2-0ubuntu0.16.04.1~lo2
CPU threads: 8; OS: Linux 4.4; UI render: default; VCL: gtk2; 
Locale: en-US (en_US.UTF-8); Calc: group
Comment 27 Andrej 2017-11-08 07:26:56 UTC
Just tried on my "html3.2.xls", and document is opening.

LibreOffice
Version: 5.4.2.2 (x64)
Build ID: 22b09f6418e8c2d508a9eaf86b2399209b0990f4
CPU threads: 4; OS: Windows 6.1; UI render: default; 
Locale: lv-LV (lv_LV); Calc: group

It asks for "Import options", but otherwise it works fine.


If someone is still wondering "who may need such weird functionality?", I will tell the story:
Around the world, hundreds of thousands of low-salary coders/admins like myself (who have neither the skills nor the time to generate valid .xls or .ods) create all kinds of reports for low-salary secretaries/managers/accountants (who have neither the skills nor the time to import/copy data from HTML to a spreadsheet). End of story.
Comment 28 Buovjaga 2017-11-10 13:40:11 UTC
(In reply to Andrej from comment #27)
> It asks for "Import options", but otherwise it works fine.

It's true. Let's close.

Arch Linux 64-bit, KDE Plasma 5
Version: 6.0.0.0.alpha1+
Build ID: 1aba1955f161cc112dab80b6b3e78ec7761616fc
CPU threads: 8; OS: Linux 4.13; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on November 10th 2017