Bug 64423 - Import of XML Excel file into Calc fails with General input/output error.
Summary: Import of XML Excel file into Calc fails with General input/output error.
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 3.5.0 release (earliest affected)
Hardware: All All
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard: Repro:5.1
Keywords: bibisectRequest, regression
Duplicates: 65180
Depends on:
Blocks:
 
Reported: 2013-05-10 10:20 UTC by Liam Smit
Modified: 2018-04-01 13:12 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Excel XML file that causes import error (12.27 KB, application/gzip)
2013-05-10 10:20 UTC, Liam Smit
Adding second Excel XML file that Calc can't import / open (86.25 KB, application/gzip)
2013-05-10 16:57 UTC, Liam Smit
XML file that won't open (26.54 KB, application/zip)
2013-05-30 19:52 UTC, Kevin
Errors written to CLI when opening .XML file (1.13 KB, application/gzip)
2013-10-05 14:45 UTC, Liam Smit
Workaround (2.61 KB, application/xml)
2018-02-12 20:46 UTC, Everton Gomes

Description Liam Smit 2013-05-10 10:20:53 UTC
Created attachment 79083
Excel XML file that causes import error

Overview:

    When attempting to import the attached .XML file (generated by MS Excel) 
    the import fails and the following message is displayed:

    General Error
    General input/output error.


Steps to Reproduce:

    1.) Uncompress / extract the attached file (I compressed it to upload it)

    2.) Attempt to open the uncompressed .xml file.


Actual Results:

The import / open fails and an error message is displayed:

    General Error
    General input/output error.

 
Expected Results: 

    It should import and open the spreadsheet in Calc.


Build Date & Platform: 

    This bug was noticed as far back as 3.5 and 3.6.


Additional Builds and Platforms:

    This problem also occurs with LibreOffice 4.0.2 on Windows 7 Professional.


Additional Information:

    It is not only this one spreadsheet file. I can provide additional example files if required.
Comment 1 Joel Madero 2013-05-10 16:51:43 UTC
Updating the version, as you've said the bug existed in 3.5.

The Version field reflects the oldest version in which we see the bug; we use comments to say "also verified on version ...". This lets us track whether a bug was introduced (i.e. is a regression) at some later date.

I can confirm this:
Bodhi Linux
LibreOffice version 4.0.2.2 release

Marking as:
New (confirmed)
Major - cannot open the file at all
Medium - the default is high, but this is one test file. It could be a bigger issue, but as of now we have a single test case.
Comment 2 Liam Smit 2013-05-10 16:57:33 UTC
Created attachment 79109
Adding second Excel XML file that Calc can't import / open

This is another XML file produced by Excel that Calc cannot open.

I could provide more but I have a very strong suspicion that there is a common cause.

If a cause is found I would be more than willing to provide additional files to confirm that it is the same problem encountered with multiple files.
Comment 3 Joel Madero 2013-05-10 17:15:14 UTC
Thanks for the additional info - likely a filters issue but going to add Kohei to see if he has any tips or wants to tackle it.

Kohei - thoughts on this one?
Comment 4 Kohei Yoshida 2013-05-10 19:55:11 UTC
This is an Excel 2003 XML format that only Excel 2003 generated. The later versions all moved to xlsx. We don't even have a real filter for this (only one based on XSLT). So, it's very unlikely that we would put serious effort into fixing this.

Check with Peter (CC'ed) to see if he is interested. He's shown interest in XSLT-based filters (and especially the Excel 2003 XML filter) in the past.
Comment 5 Kevin 2013-05-30 19:46:20 UTC
*** Bug 65180 has been marked as a duplicate of this bug. ***
Comment 6 Kevin 2013-05-30 19:52:52 UTC
Created attachment 80066
XML file that won't open
Comment 7 Liam Smit 2013-07-27 21:02:51 UTC
I'm not sure if it's useful but we parse these files to extract and work on the contents. 

I believe this is done using Perl possibly along the lines of:

http://games.greggman.com/game/excel_perl_xml/

And:

http://search.cpan.org/~jmcnamara/Spreadsheet-ParseExcel-0.59/lib/Spreadsheet/ParseExcel.pm


If it would help, I can find out more; just let me know.
Comment 8 Commit Notification 2013-07-28 00:09:21 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b46688a663b8709e0e0795f25ef8961db1f46cba

fdo#64423: Detect BIFF 2 (and 3) file format like we should.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 9 Kohei Yoshida 2013-07-28 00:15:11 UTC
Scratch that. I used the wrong Bugzilla number in that commit message.
Comment 10 Liam Smit 2013-10-05 14:45:15 UTC
Created attachment 87162
Errors written to CLI when opening .XML file

When launching the soffice binary from the command line and then attempting to open the file 7-Divisions-HUCS.xml, various errors are written to the terminal, e.g.:

runtime error: file file:///usr/lib/libreoffice/program/../share/xslt/import/spreadsheetml/spreadsheetml2ooo.xsl line 6131 element param
xsltApplyXSLTTemplate: A potential infinite template recursion was detected.
You can adjust xsltMaxDepth (--maxdepth) in order to raise the maximum number of nested template calls and variables/params (currently set to 3000).
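The error message itself points at the `--maxdepth` option of xsltproc, so one way to investigate outside LibreOffice is to run the same stylesheet through xsltproc with a higher limit. The following is only a sketch: the helper name `build_xsltproc_command` and the output file name are hypothetical, the value 20000 is an arbitrary guess, and it is an assumption that the stylesheet runs unchanged under standalone xsltproc. Raising the depth may also just move the failure rather than fix it.

```python
import shlex


def build_xsltproc_command(stylesheet, xml_file, output_file, max_depth=3000):
    """Build an argv list for xsltproc with a custom --maxdepth.

    3000 is the limit quoted in the error message above; pass a larger
    value to test whether the recursion limit is what stops the import.
    """
    if max_depth <= 0:
        raise ValueError("max_depth must be positive")
    return [
        "xsltproc",
        "--maxdepth", str(max_depth),
        "-o", output_file,
        stylesheet,
        xml_file,
    ]


if __name__ == "__main__":
    cmd = build_xsltproc_command(
        "/usr/lib/libreoffice/share/xslt/import/spreadsheetml/spreadsheetml2ooo.xsl",
        "7-Divisions-HUCS.xml",
        "out.xml",        # hypothetical output file name
        max_depth=20000,  # arbitrary value, chosen only for illustration
    )
    # Print the command for inspection; run it with
    # subprocess.run(cmd, check=True) if xsltproc is installed.
    print(shlex.join(cmd))
```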
Comment 11 Liam Smit 2013-10-05 15:04:18 UTC
I took a look at the referenced bug, 38361, but there are no &nbsp; tags in the files I supplied. There are however &amp; and &quot; tags in it although I'm not sure if this makes any difference:

$ grep "&" 7-Divisions-HUCS.xml 
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
    <Cell ss:StyleID="s81"><Data ss:Type="String">CallForwardOnNotRegisteredToVoiceMail&#13;</Data></Cell>
    <Cell ss:StyleID="s81"><Data ss:Type="String">CallForwardOnNotRegisteredToVoiceMail&#13;</Data></Cell>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
        xmlns="http://www.w3.org/TR/REC-html40">&#10;</Font></Data></Comment></Cell>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
    <Cell ss:StyleID="s67"><Data ss:Type="String">Note: Call Agent is optional when multi-tenant = &quot;true&quot; and mandatory when multi-tenant = &quot;false&quot;.</Data></Cell>
    <Cell ss:StyleID="s104"><Data ss:Type="String">EXT:100&#10;</Data></Cell>
    <Cell ss:StyleID="s104"><Data ss:Type="String">EXT:100&#10;</Data></Cell>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
    <Cell ss:StyleID="s104"><Data ss:Type="String">#&#10;</Data></Cell>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>
     x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12Page &amp;P"/>
liam@liam-544:~$
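The grep above can be generalised into a tally of every entity reference in a file, which makes it easy to check whether a particular entity occurs at all (for example &nbsp;, which XML does not predefine). This is a minimal sketch: the function name `count_entities` is hypothetical, and the regular expression only recognises well-formed named, decimal and hexadecimal references.

```python
import re
from collections import Counter

# Matches &name;, &#123; and &#x1F; style references.
ENTITY_RE = re.compile(r"&(?:#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")


def count_entities(text):
    """Return a Counter mapping each entity reference to its frequency."""
    return Counter(match.group(0) for match in ENTITY_RE.finditer(text))


if __name__ == "__main__":
    # One line taken from the grep output above.
    sample = 'x:Data="&amp;C&amp;&quot;Times New Roman,Regular&quot;&amp;12&amp;A"/>'
    for entity, n in count_entities(sample).most_common():
        print(entity, n)
```

To scan a whole file, read it as text and pass the contents to `count_entities`.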
Comment 12 Urmas 2013-10-05 20:04:22 UTC
The problem lies in some worksheets defining about 16200 columns. The current implementation supports only about 3000 columns.
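To see which worksheets might be affected, the declared column count of each sheet can be read from the file. The sketch below assumes the count is declared through the `ss:ExpandedColumnCount` attribute on each `Table` element of the Excel 2003 XML format; the affected files may declare their columns differently. The function names are hypothetical, and the limit of 3000 is simply the approximate figure quoted in this comment.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Excel 2003 XML (SpreadsheetML) format.
SS = "urn:schemas-microsoft-com:office:spreadsheet"


def declared_columns(xml_text):
    """Map each worksheet name to its declared column count (or None)."""
    root = ET.fromstring(xml_text)
    result = {}
    for sheet in root.iter(f"{{{SS}}}Worksheet"):
        name = sheet.get(f"{{{SS}}}Name", "?")
        table = sheet.find(f"{{{SS}}}Table")
        raw = None if table is None else table.get(f"{{{SS}}}ExpandedColumnCount")
        result[name] = int(raw) if raw is not None else None
    return result


def sheets_over_limit(xml_text, limit=3000):
    """List worksheets whose declared column count exceeds the limit."""
    return [
        name
        for name, count in declared_columns(xml_text).items()
        if count is not None and count > limit
    ]


# Tiny hand-written sample, not taken from the attached files.
SAMPLE = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Small"><Table ss:ExpandedColumnCount="12"/></Worksheet>
  <Worksheet ss:Name="Wide"><Table ss:ExpandedColumnCount="16200"/></Worksheet>
</Workbook>"""

if __name__ == "__main__":
    print(declared_columns(SAMPLE))
    print(sheets_over_limit(SAMPLE))
```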
Comment 13 QA Administrators 2015-04-19 03:23:30 UTC Comment hidden (obsolete)
Comment 14 Buovjaga 2015-06-19 13:57:23 UTC
Still erroring.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 437210d58f32177ef1829d704f7f4d2f1bbfbfdd
TinderBox: Win-x86@39, Branch:master, Time: 2015-06-18_07:21:56
Locale: fi-FI (fi_FI)
Comment 15 Liam Smit 2015-06-19 15:59:49 UTC
Also occurs in:
LibreOffice 4.4.3

Running on:
Ubuntu 14.04 64 bit
Comment 16 Timur 2015-07-27 12:23:21 UTC
Regression: the file opened fine with LO 3.3.4 but cannot be opened with LO 3.4.
Comment 17 Xisco Faulí 2016-09-20 16:10:34 UTC
Adding keyword 'bibisectRequest' to see whether this regression is already present in the oldest build of bibisect-43all repository or not.
In case it's already present, change 'bibisectRequest' to 'preBibisect'.
Otherwise, change 'bibisectRequest' to 'bibisected' and add a comment with the output from 'git bisect log'.
Comment 18 BugzillaNemo 2017-08-23 19:16:25 UTC
Also occurs in:
LibreOffice 5.3.5.2 and 5.4.0.2

Running on:
Arch as of 2017-08-22.
Comment 19 Everton Gomes 2018-02-12 20:46:45 UTC
Created attachment 139843
Workaround

Replace the file "/usr/lib/libreoffice/share/xslt/import/spreadsheetml/spreadsheetml2ooo.xsl" with this version, or just point Calc to it (Tools > XML Filter Settings... > MS Excel 2003 XML > Edit... > Transformation > XSLT for import: > Browse...)
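For the first option, it is probably worth keeping a copy of the stock stylesheet so the change can be undone. A minimal sketch of that step follows; the helper name `install_stylesheet`, the ".orig" backup suffix and the downloaded file name are all hypothetical, and writing under /usr/lib normally requires administrator rights.

```python
import shutil
from pathlib import Path


def install_stylesheet(replacement, target, backup_suffix=".orig"):
    """Copy `replacement` over `target`, backing up the original once.

    Returns the path of the backup file. An existing backup is never
    overwritten, so the stock file survives repeated runs.
    """
    replacement = Path(replacement)
    target = Path(target)
    if not replacement.is_file():
        raise FileNotFoundError(replacement)
    backup = target.with_name(target.name + backup_suffix)
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(replacement, target)
    return backup


if __name__ == "__main__":
    install_stylesheet(
        "workaround.xsl",  # hypothetical name for the downloaded attachment
        "/usr/lib/libreoffice/share/xslt/import/spreadsheetml/spreadsheetml2ooo.xsl",
    )
```

Pointing Calc at the file through the XML Filter Settings dialog avoids touching the installation directory at all.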
Comment 20 eisa01 2018-04-01 13:12:09 UTC
Works fine in current master

Version: 6.1.0.0.alpha0+
Build ID: a488c7ad2763b944713997911c1ddb0315d8c93f
CPU threads: 2; OS: Mac OS X 10.12.6; UI render: default; 
TinderBox: MacOSX-x86_64@49-TDF, Branch:master, Time: 2018-03-26_00:38:29
Locale: en-US (en_US.UTF-8); Calc: group