Created attachment 126606
Example files

I have a problem with the xlsx file MIME type. If I save a new xlsx document with LibreOffice Calc and upload it to PHP, I cannot get the correct MIME type. PHP always reports that the xlsx file's MIME type is "application/octet-stream". If I do the same thing with Microsoft Excel, I get the correct MIME type. If I save the document as .ods, I again get the correct MIME type (application/vnd.oasis.opendocument.spreadsheet). So what is wrong with LibreOffice Calc and xlsx?
Are you using Firefox to upload or..?
In PHP I read the MIME type this way:

$finfo = new finfo(FILEINFO_MIME);
$type = $finfo->file('LibreOffice.xlsx');
var_dump($type);

It does not matter which browser I use, because finfo reads the file itself, not the information sent by the browser. Most PHP framework file validations work this way. So it is not possible to validate LibreOffice xlsx files, because the MIME type is always "application/octet-stream".
OK, I confirmed it with the PHP code. Interestingly, this is a regression, as 3.5 shows application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Win 7 Pro 64-bit
Version: 5.3.0.0.alpha0+ (x64)
Build ID: f4d0818cd21f66b0d7f36f820fcf1b72e506e026
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default;
TinderBox: Win-x86_64@62-TDF, Branch:MASTER, Time: 2016-08-07_09:21:35
Locale: fi-FI (fi_FI); Calc: CL

LibreOffice 3.5.0rc3
Build ID: 7e68ba2-a744ebf-1f241b7-c506db1-7d53735
I can't say I thoroughly understand this, but it seems MIME type detection for these file types is not precisely defined:

http://stackoverflow.com/questions/6595183/docx-file-type-in-php-finfo-file-is-application-zip
http://es1.php.net/manual/en/function.finfo-file.php

Nevertheless, if it was fine in a previous release, then returning to that state would be a relatively straightforward fix.
Yes, docx files have the same issue. Again, if I save the file from Microsoft Word, finfo returns "application/vnd.openxmlformats-officedocument.wordprocessingml.document". If I save the file from LibreOffice Writer, finfo returns "application/octet-stream". But I cannot reproduce the problem with the MIME type "application/zip". That file was probably created with some script, and then it is another story.
This is actually not a bug and therefore not a regression. PHP most likely relies on some specific order of the files in the zip. However, that is not a reliable way to detect the type and is not covered by the spec (in contrast to ODF). I agree that it would be a possible enhancement to see whether it makes sense to switch to a file order that works with PHP, but it is not a priority.
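To check the file-order theory on a concrete file, a minimal Python sketch can list the entries of an xlsx in the order they are physically stored, so that a LibreOffice file can be compared with an Excel one. This is not from the report: the function name zip_entry_order is made up for illustration, and whether a given detector actually depends on this order is an assumption to verify, not something the sketch proves.

```python
import sys
import zipfile


def zip_entry_order(source):
    """Return the member names of a zip archive in physical storage order.

    `source` may be a path or a seekable binary file object.
    """
    with zipfile.ZipFile(source) as archive:
        # Sort by the offset of each local file header so the result reflects
        # where entries sit in the file, not only the central directory order.
        entries = sorted(archive.infolist(), key=lambda info: info.header_offset)
        return [info.filename for info in entries]


if __name__ == "__main__":
    # Print the entry order for every file given on the command line.
    for path in sys.argv[1:]:
        print(path, zip_entry_order(path))
```

Running it on one file saved by LibreOffice and one saved by Excel prints one list per file, which makes any difference in entry order easy to see.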
This is most definitely not a problem with PHP, and I don't think it has anything to do with the order in which the files are compressed.

In Python:

>>> import magic
>>> magic.Magic(mime=True).from_file("test.xlsx")
'application/octet-stream'

Or using `file` on the command line:

$ file --mime-type test.xlsx
test.xlsx: application/octet-stream

This file was created using LibreOffice Calc 5.2.7.2.
Although I first hit this problem in PHP, running "file --mime-type filename.xlsx" also returns "application/octet-stream". From my POV this is a LibreOffice problem, not one related only to PHP.
I can confirm the bug in version 5.4.2.2
I can confirm that this is still happening with xlsx files in version 6.0.4.2 - Build: 1:6.0.4~rc2-0ubuntu0.16.04.1. I'm using LibreOffice Calc on Linux Mint 18.1 - 64bit. I've also had problems with a PHP MIME type validator, and I can reproduce the wrong MIME type with the file CLI program:

$ file -i test-file-oo6.xlsx
test-file-oo6.xlsx: application/octet-stream; charset=binary

If I run the same command against a Google Drive spreadsheet or an MS Excel file saved as xlsx, the reported MIME type is:

application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; charset=binary
Reproduced using the file -i command on Ubuntu 18.04 on an XLSX file created with:

Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: e3086b58eb5427d520b86c185f9d911bb6f7a3a0
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-06-21_15:37:11
Calc: threaded

I get:

Untitled 1.xlsx: application/octet-stream; charset=binary
I ran into this problem; it looks like it started in 4.2. However, it also looks like the issue was worked around by the upstream file maintainers, most probably by this commit:

https://github.com/file/file/commit/cea6359e8aad609573076b189dd58361e52bc715

So Ubuntu 18.04 with file 5.32 is affected, but Ubuntu 20.04 with file 5.38 no longer is. The solution is to upgrade the OS used. I think this can be closed as WFM.
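For setups that cannot upgrade file/libmagic yet, one possible stopgap is to identify the document from its own content instead. The sketch below is an illustration under stated assumptions, not part of any fix discussed here. It opens the file as a zip, reads the package's [Content_Types].xml, and maps the content type declared for the main part to the usual MIME type. The function name ooxml_mime_type and the MAIN_PART_TYPES table are hypothetical, the table covers only xlsx and docx, and the XML parsing is not hardened against hostile uploads.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by [Content_Types].xml in OOXML packages.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Hypothetical lookup table: main-part content type -> document MIME type.
# Only xlsx and docx are listed; extend as needed.
MAIN_PART_TYPES = {
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml":
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml":
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}


def ooxml_mime_type(source):
    """Return the MIME type of an xlsx/docx file, or None if it is not recognised.

    `source` may be a path or a seekable binary file object.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            manifest = archive.read("[Content_Types].xml")
    except (zipfile.BadZipFile, KeyError):
        # Not a zip archive, or a zip without the OOXML manifest.
        return None
    try:
        root = ET.fromstring(manifest)
    except ET.ParseError:
        return None
    # The main document part is declared in an <Override> element.
    for override in root.iter(CT_NS + "Override"):
        mime = MAIN_PART_TYPES.get(override.get("ContentType"))
        if mime:
            return mime
    return None
```

Because this looks up entries by name, it does not depend on the order of the entries inside the zip.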