Bug 51345 - FILEOPEN: x:num attribute is not handled while importing HTML files created by Excel 2003
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.4.4 release (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords:
Duplicates: 89939
Depends on:
Blocks: HTML-Import
Reported: 2012-06-22 11:34 UTC by Marek Ozana
Modified: 2023-12-17 03:12 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Excel file with list of companies and their respective financial data (233.56 KB, application/vnd.ms-excel)
2012-06-22 11:34 UTC, Marek Ozana
screenshot (141.55 KB, image/png)
2012-06-29 23:55 UTC, Julien Nabet

Description Marek Ozana 2012-06-22 11:34:02 UTC
Created attachment 63358 [details]
Excel file with list of companies and their respective financial data

Problem description: 
When opening the Excel file "REON-Table.xls", the sheet is empty and no error message is shown. The same file shows numbers and text when opened in MS Excel.
The file is attached.

Steps to reproduce:
1. Start LibreOffice Calc
2. File->Open
3. Select REON-Table.xls

Current behavior:
A progress bar shows while the file is opening, then no data is displayed.

Expected behavior:
The data available in the file is displayed.

Platform (if different from the browser): Ubuntu 10.10
              
Browser: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0.1
Comment 1 Urmas 2012-06-23 08:18:22 UTC
In 3.5, cells with numbers are still not imported.
Comment 2 Julien Nabet 2012-06-29 23:55:43 UTC
Created attachment 63628 [details]
screenshot

On a Debian x86-64 PC, with master sources (future 3.7) updated today.

LO asked about Import Options (Select the language: Automatic or Custom; I chose Automatic) and Detect Special Numbers (I tried with the option unchecked, then checked, with the same result).
Then I got the result shown in the screenshot.

Do you get the same result?
Comment 3 Marek Ozana 2012-07-02 10:09:54 UTC
I get just an empty sheet in Ubuntu 11.10, LibreOffice 3.4.4.
Even the screenshot (attachment 63358 [details]) shows an incomplete import, since in Excel the table is filled with numbers for all columns and rows.
Comment 4 Mirosław Zalewski 2013-03-09 16:57:52 UTC
In fact, Calc behaves correctly.

This XLS file is really HTML. It contains one huge table and has empty <td> tags (table cells) where the numbers ought to be. Some moron at Microsoft decided that, instead of exporting numbers as <td> content (so any HTML-compliant app could read them), they would write them in the x:num attribute.

Of course, x:num is NOT a valid HTML attribute, and Calc ignores it, as every well-behaved user agent should.

You can check this by downloading the file attached by Marek Ozana, renaming it to "REON-Tables.html" and opening it in a web browser. The table will be mostly empty, as in Calc.
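
To illustrate the point, here is a minimal sketch using Python's standard html.parser. The SAMPLE markup is a hypothetical fragment modeled on the description above, not copied from the attachment, and the names CellDump and dump_cells are made up for this example. It shows that a parser which only looks at cell content sees an empty string, while the value sits in the x:num attribute.

```python
from html.parser import HTMLParser

# Hypothetical fragment modeled on the description in this comment;
# it is NOT copied from the attached file.
SAMPLE = '<table><tr><td>ACME</td><td x:num="1234.5"></td></tr></table>'


class CellDump(HTMLParser):
    """Record, for every <td>, its text content and its x:num attribute."""

    def __init__(self):
        super().__init__()
        self.cells = []      # one [text, x:num value or None] pair per <td>
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append(["", dict(attrs).get("x:num")])

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1][0] += data


def dump_cells(html_text):
    parser = CellDump()
    parser.feed(html_text)
    parser.close()
    return [tuple(cell) for cell in parser.cells]


print(dump_cells(SAMPLE))
```

For the sample, the second cell comes back with empty text and "1234.5" as its x:num value, which matches what a browser or Calc displays: nothing.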

So, while Calc's behavior is correct and expected, it can lead to an interoperability problem. This particular file declares that it was created by MS Office Excel 2003 (which is rather old), but:
a) who knows how many "XLS" files like this are out there in the wild
b) who knows whether newer Excel versions are saner

I am changing the title of this bug so that it describes the actual problem more accurately.
Comment 5 Akhilesh 2013-04-27 08:22:05 UTC
We are having quite a few issues because our bank refuses to upgrade their Office or use any format other than xls (we suggested csv)...

Can you suggest a workaround? Can I write a plugin that taps into the file while it is being opened, to extract the x:num attribute as a value?
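
One possible workaround outside LibreOffice, sketched under assumptions: since the file is HTML, it can be pre-converted to CSV with a small script that uses the x:num value as the cell value whenever it is non-empty and falls back to the cell text otherwise. This is not a LibreOffice plugin and uses no LibreOffice API. XNumTableParser and html_to_csv are hypothetical names, and the exact semantics of x:num are assumed from comment 4 rather than taken from any specification.

```python
import csv
import io
from html.parser import HTMLParser


class XNumTableParser(HTMLParser):
    """Collect table rows, preferring a non-empty x:num over the cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._text = None   # text fragments of the open cell, None outside cells
        self._xnum = None   # x:num value of the open cell, if any

    def _close_cell(self):
        if self._text is not None:
            text = "".join(self._text).strip()
            # Assumption: a non-empty x:num holds the number to import.
            self.rows[-1].append(self._xnum or text)
            self._text = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_cell()
            self.rows.append([])
        elif tag in ("td", "th"):
            self._close_cell()      # tolerate a missing </td>
            if not self.rows:       # tolerate a cell before any <tr>
                self.rows.append([])
            self._xnum = dict(attrs).get("x:num")
            self._text = []

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._close_cell()

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)


def html_to_csv(html_text):
    """Return the table cells of html_text as CSV text."""
    parser = XNumTableParser()
    parser.feed(html_text)
    parser.close()
    parser._close_cell()            # flush a cell left open at end of input
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()
```

To use it, read the ".xls" file as text, pass the contents to html_to_csv, and open the result in Calc. The file's text encoding has to be guessed or taken from its meta tag. The sketch ignores nested tables, colspan/rowspan and the x:fmla attribute, so it only suits simple flat tables.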
Comment 6 Julien Nabet 2013-05-01 17:41:16 UTC
Kohei/Markus/Eike: would one of you have some time to give an opinion about this?
Is it a bug or an enhancement?
Comment 7 Urmas 2013-05-02 19:14:59 UTC
There's also the "x:fmla" attribute, which isn't imported either.
Comment 8 Markus Mohrhard 2013-05-02 20:28:15 UTC
(In reply to comment #5)
> We are having quite a few issues because our bank refuses to upgrade their
> Office or use any format other than xls (we suggested csv)...
> 
> Can you suggest some work around? Can I write a plugin that taps into the
> file when it is being open, to extract the xnum attribute as a value?

You can fix it in the LibreOffice source code. If you are interested, I'll add some code pointers. But be warned that our HTML parser is one of the worst in the world, and it might be easier to write a clean new one based on the orcus interfaces. This has actually been the plan for some time now, but we just don't have enough time for all this work, so it would be awesome if someone here would help.
Comment 9 QA Administrators 2015-03-04 02:19:58 UTC Comment hidden (obsolete)
Comment 10 Buovjaga 2015-03-20 17:20:09 UTC
*** Bug 89939 has been marked as a duplicate of this bug. ***
Comment 11 tommy27 2016-04-16 07:26:39 UTC Comment hidden (obsolete)
Comment 12 QA Administrators 2017-05-22 13:22:43 UTC Comment hidden (obsolete)
Comment 13 Kohei Yoshida 2018-11-21 01:07:21 UTC
The file in question is an HTML file, not an MSO-XML2003 file.
Comment 14 QA Administrators 2019-11-22 03:41:04 UTC Comment hidden (obsolete)
Comment 15 Kevin Suo 2021-10-15 13:14:38 UTC
Well, I submitted a patch for the x:str attribute, for bug 96499:
https://gerrit.libreoffice.org/c/core/+/123620
but I then abandoned it myself because I thought it was based on a misunderstanding of x:str.

Now I see the x:num and x:fmla attributes you are discussing here. Is there any documentation showing the usage of these attributes? Are there any other attributes like this?

Should it be x:str="somestring", or just a blank attribute? What's inside x:num?

Without this kind of knowledge it is hard to implement correctly.
Comment 16 QA Administrators 2023-12-17 03:12:50 UTC
Dear Marek Ozana,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug