Bug 64395 - FILEOPEN XLSX file format not recognized
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: (earliest affected) 3.5.4 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Duplicates: 69230
Depends on:
Reported: 2013-05-09 14:56 UTC by gustavo
Modified: 2014-08-07 06:17 UTC

original and converted files. (24.37 KB, application/x-gzip)
2013-05-09 14:56 UTC, gustavo

Description gustavo 2013-05-09 14:56:28 UTC
Created attachment 79056 [details]
original and converted files.

We are seeing an XLSX file exported by a third-party tool that can't be opened by LibreOffice or by Google Docs. However, it can be converted with Zamzar.com and Office365.

I'm attaching to this bug an archive with both the original file and the conversions from these two services.

I'm using LibreOffice on Ubuntu 12.04.
Comment 1 Urmas 2013-05-10 05:12:22 UTC
It appears to be a ZIP format issue.

If you want to report wrong fonts, lack of borders or cell background, that needs to be done in a separate bug.
Comment 2 gustavo 2013-05-10 08:15:13 UTC
For now I was just reporting that the file can't even be opened. I don't know if there are other problems with the original file, since it can't be opened. I can only open the converted ones.

Regarding zip: I tried unzipping by hand with the Linux unzip command and it works.
Comment 3 Urmas 2013-05-11 05:04:43 UTC
And if you zip it back, does it open after that?
Comment 4 gustavo 2013-05-11 10:58:03 UTC
Yes, this makes it work:

unzip 41.xlsx
mv 41.xlsx /tmp/
zip -r 41.xlsx *

Ark can also read and extract the contents without any problem.
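
For reference, the same repacking step could be sketched in Python without extracting anything to disk. This is only an illustration and not part of the original report: the function name repack_with_forward_slashes is made up here, and the sketch assumes that the backslash separators are the only thing wrong with the archive.

```python
import zipfile


def repack_with_forward_slashes(src, dst):
    """Copy every entry of the ZIP archive src into a new archive dst,
    replacing backslashes in entry names with forward slashes.

    src and dst may be file names or file-like objects.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            # orig_filename is the name as stored in the archive;
            # on some platforms info.filename is already normalized.
            fixed_name = info.orig_filename.replace("\\", "/")
            # Entry timestamps are not preserved in this sketch.
            zout.writestr(fixed_name, zin.read(info))
```

For example, repack_with_forward_slashes("41.xlsx", "41-fixed.xlsx") would write a corrected copy, where 41-fixed.xlsx is a hypothetical output name. Unlike zip -r, this sketch does not add separate directory entries.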

The unzip command issues this warning:

warning:  41.xlsx appears to use backslashes as path separators

unzip -l returns this:

Archive:  41.xlsx
  Length     Date   Time    Name
 --------    ----   ----    ----
      303  05-09-13 14:33   _rels\.rels
      936  05-09-13 14:33   [Content_Types].xml
      320  05-09-13 14:33   xl\workbook.xml
     1982  05-09-13 14:33   xl\sharedStrings.xml
    12099  05-09-13 14:33   xl\styles.xml
     5573  05-09-13 14:33   xl\worksheets\sheet0.xml
      590  05-09-13 14:33   xl\_rels\workbook.xml.rels
 --------                   -------
    21803                   7 files

After I unzip and zip again the result of unzip -l is:

Archive:  41.xlsx
  Length     Date   Time    Name
 --------    ----   ----    ----
      936  05-09-13 14:33   [Content_Types].xml
        0  05-11-13 11:49   _rels/
      303  05-09-13 14:33   _rels/.rels
        0  05-11-13 11:49   xl/
    12099  05-09-13 14:33   xl/styles.xml
      320  05-09-13 14:33   xl/workbook.xml
     1982  05-09-13 14:33   xl/sharedStrings.xml
        0  05-11-13 11:49   xl/worksheets/
     5573  05-09-13 14:33   xl/worksheets/sheet0.xml
        0  05-11-13 11:49   xl/_rels/
      590  05-09-13 14:33   xl/_rels/workbook.xml.rels
 --------                   -------
    21803                   11 files

We see that the backslashes have become forward slashes and that each directory now has its own entry.
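
To check other files for the same symptom, something like the following Python sketch might help. It is an assumption-laden illustration, not a LibreOffice tool: the function name entries_with_backslashes is hypothetical, and it only looks at entry names, not at any other aspect of the archive.

```python
import zipfile


def entries_with_backslashes(path):
    """Return the stored names of all ZIP entries that contain a backslash.

    path may be a file name or a file-like object.
    """
    with zipfile.ZipFile(path) as zf:
        # orig_filename holds the name as stored in the archive, before
        # any platform-specific normalization done by the zipfile module.
        return [info.orig_filename
                for info in zf.infolist()
                if "\\" in info.orig_filename]
```

An empty list means no entry name uses a backslash. For the original 41.xlsx listed above, all entries except [Content_Types].xml would be expected in the result.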

This is clearly an interoperability problem. The question now is who is to blame: does LO fail to handle these files properly, or do the files not conform to the specification?

Can someone from the team double check what is written here about slashes?
http://www.pkware.com/documents/APPNOTE/APPNOTE_6.2.0.txt
Comment 5 Maxim Monastirsky 2014-08-07 06:14:38 UTC
(In reply to comment #4)
> Can someone from the team double check what is written here about slashes?
> http://www.pkware.com/documents/APPNOTE/APPNOTE_6.2.0.txt
It clearly says that all slashes should be forward slashes.
Comment 6 Maxim Monastirsky 2014-08-07 06:17:34 UTC
*** Bug 69230 has been marked as a duplicate of this bug. ***