Bug 38361 - FILEOPEN: Excel 2003 XML file containing &nbsp; cannot be opened
Summary: FILEOPEN: Excel 2003 XML file containing &nbsp; cannot be opened
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): 3.4.0 release
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard: Confirmed:4.1.3.2:OSX10.9 Confirmed:4...
Keywords:
Duplicates: 63989
Depends on:
Blocks: MSO-XML2003
Reported: 2011-06-16 00:06 UTC by Yogurt
Modified: 2018-02-19 14:20 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
The buggy file (23.88 KB, application/xml)
2011-06-16 00:07 UTC, Yogurt
MS-Excel xml file (23.86 KB, application/xml)
2011-07-18 00:33 UTC, Jean-Baptiste Faure
Real-world MS Excel XML file that LO cannot open. This is an incoming invoice; it is not known how it was produced. (111.07 KB, application/vnd.ms-excel)
2013-07-17 12:03 UTC, Anton Derbenev

Description Yogurt 2011-06-16 00:06:13 UTC
The attached file opens fine in MS Excel 2007.

Problem #1: One gets only an empty spreadsheet in LibO Calc.
Problem #2: There is no error or warning alert.

So LibO should either open this file properly, or at least tell the user why it fails to open.

My first guess for the first problem is that LibO Calc fails because of the Google Maps links in some cells. They also work perfectly in MS Excel.


LibreOffice 3.4.0 
OOO340m1 (Build:12)
with Hungarian Pack
Windows 7 x64 English
Comment 1 Yogurt 2011-06-16 00:07:24 UTC
Created attachment 48024 [details]
The buggy file

Bugzilla failed to attach this file with the original post.
Comment 2 Rainer Bielefeld Retired 2011-07-16 00:57:39 UTC
[Reproducible] with reporter's sample and "LibreOffice 3.4.1  - WIN7  Home Premium (64bit) German UI [OOO340m1 (Build:103)]": The document opens, but no contents are shown.

CRASH when trying to open with Master "LibO-dev 3.4.5  – WIN7  Home Premium  (64bit) English UI 
[(Build ID:d337f79-a24c961-2865670-9752b71-7f8fd43
	2fdd60d-fd28b6a-fd7bf20-aa369cb-28da3fb
	6a9633a-931d089-ecd263f-c9b55e9-b31b807
	82ff335-599f7e9-bc6a545-1926fdf)]"

EXCEL 2010 opens document without problems and shows contents.

@Yogurt
To decide whether this is a bug report or an enhancement request, we need some more information about the document. LibO should support EXCEL2003 XML, but I only see "some XML" in the document. Can you supply more information?

Without that information this is a "Should show warning when unknown XML" bug
Comment 3 Jean-Baptiste Faure 2011-07-17 21:53:00 UTC
The problem was already in LibO 3.3.3 under Ubuntu 10.04 x86_64.

What is the original extension of the bugdoc? When I download the file I get a file named attachment.cgi. And if I click on the link in this page I get an xml parser error message. 

Best regards. JBF
Comment 4 Rainer Bielefeld Retired 2011-07-17 23:00:45 UTC
@Jean-Baptiste Faure:
Details Attachment view shows original name "1a.xml".
Comment 5 Jean-Baptiste Faure 2011-07-18 00:27:37 UTC
@Rainer : thank you! :-)

In fact, the bugdoc really does have several XML syntax errors: if you click on the link in Firefox, it says (translated from French):
--->
XML Parsing Error: undefined entity
Location: https://bugs.freedesktop.org/attachment.cgi?id=48024
Line Number 601, Column 50:
   <Cell ss:StyleID="n0"><Data ss:Type="String">&nbsp;</Data></Cell>
-------------------------------------------------^
<---
The problem is with the "&nbsp;".
I tried what follows:
1/ fix the xml syntax of 1a.xml by removing the &nbsp; between keywords
2/ in LibO 3.4.1 : File > Open > choose the xml file
Then LibO Calc opens it and shows 1 sheet with 3 rows. The sheet is named "Várakozási".
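
For anyone who wants to reproduce the parser error outside the browser, here is a minimal sketch using Python's standard xml.etree.ElementTree. The SNIPPET string is a hypothetical stand-in modelled on line 601 of the bugdoc, not the attachment itself, and try_parse is a helper name made up for this illustration. It only shows that a strict XML parser rejects &nbsp; when no DTD declares it; it says nothing about how LibO's import filter is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the failing row; NOT the real attachment.
SNIPPET = (
    '<Row xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">'
    '<Cell ss:StyleID="n0"><Data ss:Type="String">&nbsp;</Data></Cell>'
    '</Row>'
)


def try_parse(text):
    """Return (True, None) if text is well-formed XML, else (False, error message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as exc:
        # The message includes the line and column of the offending token.
        return False, str(exc)


# The original snippet fails because &nbsp; is not a predefined XML entity.
ok, message = try_parse(SNIPPET)
print(ok, message)

# Removing the entity, as in step 1/ above, makes the snippet well-formed.
fixed_ok, _ = try_parse(SNIPPET.replace("&nbsp;", ""))
print(fixed_ok)
```

The same check could be run on a whole file by reading it into a string first, which gives a quick way to tell whether a document that LibO opens as an empty sheet is actually malformed.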

Best regards. JBF
Comment 6 Jean-Baptiste Faure 2011-07-18 00:33:06 UTC
Created attachment 49234 [details]
MS-Excel xml file

same as 1a.xml with xml syntax errors fixed
Comment 7 Rainer Bielefeld Retired 2011-07-18 01:30:48 UTC
Yes, seems so. I used a text editor to replace all "&nbsp;" by "", then the document was opened with LibO without problems.
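
The text-editor workaround can also be scripted. The following is only a sketch under assumptions: sanitize_nbsp is a hypothetical helper name, and the sample string is invented for illustration. By default it substitutes the numeric character reference &#160; (a no-break space, valid in XML without any DTD) so the cell keeps its content; passing replacement="" mirrors the plain removal described above. It is a blind string replacement, so it would also alter any literal "&nbsp;" text inside a CDATA section.

```python
import xml.etree.ElementTree as ET


def sanitize_nbsp(xml_text, replacement="&#160;"):
    """Replace the undefined entity &nbsp; so a strict XML parser accepts the text.

    Hypothetical helper for illustration; a plain string replacement, not a parser.
    """
    return xml_text.replace("&nbsp;", replacement)


sample = "<Data>&nbsp;</Data>"  # invented sample, not taken from the bugdoc

# Default: keep a real no-break space (U+00A0) in the cell.
root = ET.fromstring(sanitize_nbsp(sample))
print(repr(root.text))

# Variant matching the manual fix: drop the entity entirely, leaving an empty element.
emptied = ET.fromstring(sanitize_nbsp(sample, replacement=""))
print(repr(emptied.text))
```

To apply it to a file, read the file as text, pass the contents through sanitize_nbsp, and write the result to a new file before opening that copy in LibO.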

So the result seems to suggest a modified subject "improve fault-tolerance concerning xml-syntax-errors (&nbsp;)"?
Comment 8 Jean-Baptiste Faure 2011-07-18 01:42:14 UTC
(In reply to comment #7)

> So the result seems to suggest a modified subject "improve fault-tolerance
> concerning xml-syntax-errors (&nbsp;)"?

I agree. 
JBF
Comment 9 Yogurt 2011-08-01 06:48:50 UTC
(In reply to comment #7)
> Yes, seems so. I used a text editor to replace all "&nbsp;" by "", then the
> document was opened with LibO without problems.
> 
> So the result seems to suggest a modified subject "improve fault-tolerance
> concerning xml-syntax-errors (&nbsp;)"?

Sorry, I went on holiday the very day you commented on my report.

You are right, &nbsp; is in fact not a valid XML entity. I got this file by loading an HTML table and saving it as Excel 2003 XML. (The HTML table had &nbsp; in its "empty" cells, which is a widespread practice.)

So my suggestion is that LibO should display an error message saying it's an invalid file, instead of just creating an empty sheet.
Comment 10 Björn Michaelsen 2011-12-23 13:23:41 UTC Comment hidden (no-value)
Comment 11 ign_christian 2013-06-17 06:47:25 UTC
Using LO 4.0.4.2 (Win7 32bit) generates error message:
General Error.
General input/output error.
Comment 12 Anton Derbenev 2013-07-17 12:03:57 UTC
Created attachment 82541 [details]
Real-world MS Excel XML file that LO cannot open. This is an incoming invoice; it is not known how it was produced.

Another real-world example. LO 4.0.4.2 cannot open the file, saying «General Error. General input/output error.». Tested on Windows 7 32-bit and Windows 8 64-bit.
Comment 13 Maxim Monastirsky 2013-09-27 10:34:26 UTC
*** Bug 63989 has been marked as a duplicate of this bug. ***
Comment 14 retired 2013-12-10 09:21:34 UTC
General Input/Output Error on OS X 10.9

LO 4.1.3.2 and
LO Version: 4.3.0.0.alpha0+
Build ID: 5e01904de993caa3d497a8f6c82a846336e70eef
TinderBox: MacOSX-x86@49-TDF, Branch:master, Time: 2013-12-06_02:05:01

Adding confirmed whiteboard status and OS > ALL.
Comment 15 Jan Kratochvíl 2014-08-13 10:19:29 UTC
Experiencing this problem in 4.3.0.4/W7

When opening an EXCEL XML file, an error occurs: General Input/Output Error

Do you need any more information?
Comment 16 retired 2014-08-13 10:27:00 UTC
Confirmed in LO 4.3.1RC1
Comment 17 MM 2016-08-06 19:47:56 UTC
Still confirmed with v5.2.0.4 under ubuntu 16.04 x64.
Comment 18 Kohei Yoshida 2018-02-19 14:20:43 UTC
The latest master branch build, which will become 6.1.0 at some point, can open this file.  The cells with &nbsp; are opened as cells containing "nbsp;", which is not exactly what Excel does, but given that the file is not valid XML, I think it's good enough.