Bug 101317 - File saved with wrong mime type
Summary: File saved with wrong mime type
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 5.2.0.4 release (earliest affected)
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-05 06:54 UTC by Dmitrijs
Modified: 2022-07-26 11:34 UTC

See Also:
Crash report or crash signature:


Attachments
Example files (17.07 KB, application/zip)
2016-08-05 06:54 UTC, Dmitrijs

Description Dmitrijs 2016-08-05 06:54:58 UTC
Created attachment 126606
Example files

I have a problem with the MIME type of xlsx files. If I save a new xlsx document with LibreOffice Calc and upload it to PHP, I cannot get the correct MIME type: PHP always reports that the xlsx file's MIME type is "application/octet-stream". If I do the same thing with Microsoft Excel, I get the correct MIME type. If I save the document as .ods, I again get the correct MIME type (application/vnd.oasis.opendocument.spreadsheet). So what is wrong with LibreOffice Calc and xlsx?
Comment 1 Buovjaga 2016-08-06 20:07:59 UTC
Are you using Firefox to upload or..?
Comment 2 Dmitrijs 2016-08-08 06:50:43 UTC
In PHP I read the MIME type this way:

$finfo = new finfo(FILEINFO_MIME);
$type = $finfo->file('LibreOffice.xlsx');
var_dump($type);

It does not matter which browser I use, because finfo reads the file itself, not the information sent by the browser. Most PHP framework file validations work this way, so it is not possible to validate LibreOffice xlsx files: the MIME type is always "application/octet-stream".
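
For anyone who needs to validate uploads in the meantime, one possible workaround is to look inside the archive instead of trusting the magic-based result. The following is only a minimal sketch in Python, not what finfo or any PHP framework does: the helper names `looks_like_xlsx` and `make_archive` are made up for illustration, and the check assumes that a regular xlsx workbook declares the spreadsheetml main content type in its `[Content_Types].xml` part (templates and macro-enabled workbooks use other content types and would be rejected).

```python
import io
import zipfile

# Content type a regular .xlsx workbook declares in [Content_Types].xml.
XLSX_MAIN_TYPE = (
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"
)


def looks_like_xlsx(source):
    """Return True if `source` (path or file object) is a zip archive whose
    [Content_Types].xml mentions the xlsx workbook content type."""
    try:
        with zipfile.ZipFile(source) as archive:
            raw = archive.read("[Content_Types].xml")
    except (zipfile.BadZipFile, KeyError):
        # Not a zip archive at all, or the OOXML content-types part is missing.
        return False
    return XLSX_MAIN_TYPE in raw.decode("utf-8", errors="replace")


def make_archive(members):
    """Hypothetical helper: build an in-memory zip from a {name: text} dict."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in members.items():
            archive.writestr(name, text)
    buffer.seek(0)
    return buffer
```

Because this reads the content-types part by name, the result does not depend on where that part sits inside the zip.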
Comment 3 Buovjaga 2016-08-08 09:47:20 UTC
Ok, I confirmed it with the PHP code.
Interestingly, this is a regression as 3.5 shows application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Win 7 Pro 64-bit Version: 5.3.0.0.alpha0+ (x64)
Build ID: f4d0818cd21f66b0d7f36f820fcf1b72e506e026
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86_64@62-TDF, Branch:MASTER, Time: 2016-08-07_09:21:35
Locale: fi-FI (fi_FI); Calc: CL

LibreOffice 3.5.0rc3 
Build ID: 7e68ba2-a744ebf-1f241b7-c506db1-7d53735
Comment 4 Aron Budea 2016-08-09 03:12:14 UTC
I can't say I thoroughly understand this, but it seems mime-type detection for these file types is not precisely defined:
http://stackoverflow.com/questions/6595183/docx-file-type-in-php-finfo-file-is-application-zip
http://es1.php.net/manual/en/function.finfo-file.php

Nevertheless, if it was fine in a previous release, then returning to that state would be a relatively straightforward fix.
Comment 5 Dmitrijs 2016-08-09 06:22:30 UTC
Yes, docx files have the same issue. Again, if I save the file from Microsoft Word, finfo returns "application/vnd.openxmlformats-officedocument.wordprocessingml.document". If I save the file from LibreOffice Writer, finfo returns "application/octet-stream". But I cannot reproduce the problem with the MIME type "application/zip". That file was probably created with some script, and then it is another story.
Comment 6 Markus Mohrhard 2016-09-20 11:41:34 UTC
This is actually not a bug and therefore not a regression.

PHP most likely relies on some specific order of the files in the zip. However, this is not a reliable approach and is not covered by the spec (in contrast to ODF).

I can agree that it would be a possible enhancement to see if it makes sense to switch to a file order that works with PHP but it is not a priority.
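
To compare the file order between a LibreOffice-written file and an Excel-written one, the entries can be listed in the order they are physically stored. This is just a sketch using Python's standard `zipfile` module; `zip_entry_order` and `build_archive` are names invented here, and which entry a given detector expects first is not something this code knows or checks.

```python
import io
import zipfile


def zip_entry_order(source):
    """Return member names sorted by their position in the archive.

    Sorting by header_offset gives the physical order of the local file
    headers, which may differ from the central directory order.
    """
    with zipfile.ZipFile(source) as archive:
        entries = sorted(archive.infolist(), key=lambda info: info.header_offset)
        return [info.filename for info in entries]


def build_archive(names):
    """Hypothetical helper: write empty members in the given order."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    buffer.seek(0)
    return buffer
```

Running `zip_entry_order("test.xlsx")` on two files saved by different applications shows whether their first entries differ.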
Comment 7 Daniel Quinn 2017-05-28 22:18:11 UTC
This is most definitely not a problem with PHP and I don't think it has anything to do with the order in which the files are compressed.

In Python:

    import magic

    magic.Magic(mime=True).from_file("test.xlsx")
    # 'application/octet-stream'


Or using `file` on the command line:

    $ file --mime-type test.xlsx 
    test.xlsx: application/octet-stream

This file was created using LibreOffice Calc 5.2.7.2
Comment 8 Andres Mosquera 2017-07-01 15:34:59 UTC
Even though I ran into this problem in PHP, "file --mime-type filename.xlsx" also returns "application/octet-stream". From my POV this is a LibreOffice problem, not something only related to PHP.
Comment 9 Roberto Braga 2018-01-23 09:29:10 UTC
I can confirm the bug in version 5.4.2.2
Comment 10 Juanchi 2018-06-08 13:58:04 UTC
I can confirm that this is still happening in version 6.0.4.2 (Build: 1:6.0.4~rc2-0ubuntu0.16.04.1) with xlsx files.

I'm using LibreOffice Calc on Linux Mint 18.1 - 64bit.

I've also had problems with a PHP MIME type validator, and I can also reproduce the wrong MIME type with the file CLI program:

file -i test-file-oo6.xlsx 
test-file-oo6.xlsx: application/octet-stream; charset=binary

And if I run the same command against a Google Drive Spreadsheet file or an MS Excel file saved as xlsx, the reported MIME type is:

application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; charset=binary
Comment 11 Stéphane Guillou (stragu) 2021-06-22 13:05:13 UTC
Reproduced using the file -i command on Ubuntu 18.04 with an XLSX file created with:

Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: e3086b58eb5427d520b86c185f9d911bb6f7a3a0
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-06-21_15:37:11
Calc: threaded

I get:

Untitled 1.xlsx: application/octet-stream; charset=binary
Comment 12 Gabor Kelemen (allotropia) 2022-07-26 11:34:17 UTC
I ran into this problem, looks like it started in 4.2.

But: looks like this issue was worked around by upstream file maintainers.
Most probably by commit https://github.com/file/file/commit/cea6359e8aad609573076b189dd58361e52bc715

So Ubuntu 18.04 with file 5.32 is affected, but Ubuntu 20.04 with file 5.38 no longer is. The solution is to upgrade the OS used.

I think this can be closed as WFM.
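
To see which side of that range a machine is on, the installed file version can be read programmatically. A rough sketch, assuming `file --version` prints a first line of the form `file-5.38`; the function names are invented here. Only the two data points above are known (5.32 affected, 5.38 not), so the exact release that picked up the workaround is not encoded in this code.

```python
import re
import subprocess


def parse_file_version(version_output):
    """Extract (major, minor) from text such as 'file-5.38', or None."""
    match = re.search(r"file-(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_file_version():
    """Ask the local `file` binary for its version; None if unavailable."""
    try:
        result = subprocess.run(
            ["file", "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # The `file` utility is not installed on this machine.
        return None
    return parse_file_version(result.stdout)
```

A result of `(5, 32)` or lower falls on the affected side reported here, and `(5, 38)` or higher on the unaffected side; versions in between would need to be tested.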