A Word document in XML format is not recognized.
Please attach an example of what you mean.
Created attachment 48767
Word DOC file in clipboard/XML format. Not sure what the correct name is. This isn't OOXML; it's something between binary DOC and OOXML (which is a ZIP file).
PS: I did attach this file in the original report but for some reason, it didn't make it into the database.
Could you rename your file to .xml instead of .doc and test again? It seems to load correctly over here when renamed.
Added screenshot of output.
Created attachment 48789
screenshot of testMultiPage, renamed as .xml
I can rename it to .xml as a workaround.
Can you please enhance the loading code to try the ".xml" loader, too, when it sees ".doc"? Maybe just look for "<?mso-application progid="Word.Document"?>" in the first 1024 bytes.
Might not be perfect but should work until someone finds another example where it breaks.
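To make the suggestion concrete, here is a minimal Python sketch of the proposed check. It is only an illustration, not LibreOffice's actual detection code; the function and constant names are hypothetical, and the 1024-byte window is taken from the comment above.

```python
# Hypothetical sketch of the proposed sniffing heuristic (not LibreOffice code).
# The marker is the processing instruction quoted in the comment above.
WORD_2003_XML_MARKER = b'<?mso-application progid="Word.Document"?>'
SNIFF_LIMIT = 1024  # only the first 1024 bytes are inspected


def looks_like_word_2003_xml(data: bytes) -> bool:
    """Return True if the marker occurs entirely within the first 1024 bytes."""
    return WORD_2003_XML_MARKER in data[:SNIFF_LIMIT]
```

As noted, this is not perfect: a file whose marker starts late enough to be cut off by the 1024-byte window would be missed.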
I think the program currently tries to load the file as a DOC first and, when that is not recognised, falls back to loading it as plain text.
What you want (correct me if I'm wrong): load as DOC. If it is not a DOC file, try to load it as XML. If it is not XML, load it as plain text.
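The fallback order described here could be sketched roughly as follows in Python. All loader names are hypothetical stand-ins, not real importer entry points; each toy loader only inspects the leading bytes (the OLE2 compound-file signature for binary DOC, an XML declaration for XML) and returns a label instead of a parsed document.

```python
from typing import Callable, Optional, Sequence

# Signature at the start of OLE2 compound files, which binary .doc files use.
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

Loader = Callable[[bytes], Optional[str]]


def try_binary_doc(data: bytes) -> Optional[str]:
    # Hypothetical stand-in for the binary DOC importer.
    return "binary-doc" if data.startswith(OLE2_MAGIC) else None


def try_xml(data: bytes) -> Optional[str]:
    # Hypothetical stand-in for the XML importer; skips a UTF-8 BOM and whitespace.
    stripped = data.lstrip(b"\xef\xbb\xbf \t\r\n")
    return "xml" if stripped.startswith(b"<?xml") else None


def try_plain_text(data: bytes) -> Optional[str]:
    # Last resort: plain text accepts anything.
    return "plain-text"


DEFAULT_LOADERS: Sequence[Loader] = (try_binary_doc, try_xml, try_plain_text)


def load_with_fallback(data: bytes, loaders: Sequence[Loader] = DEFAULT_LOADERS) -> str:
    """Try each loader in order and return the first non-None result."""
    for loader in loaders:
        result = loader(data)
        if result is not None:
            return result
    raise ValueError("no loader accepted the data")
```

The point of the ordering is that the most specific format is tried first and the catch-all plain-text loader comes last.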
It should just be a matter of telling the Office 2003 XML importer that .doc (and .xls for the Excel one) are also acceptable suffixes.
Please also add xlsx and docx (OOXML extensions).
Are there known uses of .docx as a suffix for the *2003* flat XML file format, as opposed to the ZIP-based Office Open XML format?
MS Office 2007 (no idea which exact version; can't find the About dialog anymore) can't open such a file.
So let me put it this way: when a user changes the file extension, or gets a file with the wrong extension, do they know about this distinction? If you support loading such files, how much damage can it cause?
I think in the worst case, LO will load the file while MS Office won't.
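For illustration, the distinction between the two formats can be made from content rather than from the suffix. The Python sketch below is an assumption-laden example, not how any office suite actually does it: the function name and labels are hypothetical, it assumes a ZIP-based package starts with the usual local-file-header signature, and it assumes a flat 2003 XML file carries an `mso-application` processing instruction near the start.

```python
# Signature of a ZIP local file header, found at the start of typical ZIP archives.
ZIP_MAGIC = b"PK\x03\x04"


def classify_by_content(data: bytes) -> str:
    """Classify a file by its leading bytes, ignoring its extension (hypothetical helper)."""
    if data.startswith(ZIP_MAGIC):
        return "zip-package"  # e.g. a ZIP-based OOXML file
    if b"<?mso-application" in data[:1024]:
        return "flat-xml"  # e.g. an Office 2003 flat XML file
    return "unknown"
```

With a check like this, a flat XML file saved with a .docx suffix would be reported as "flat-xml" regardless of its name.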
I believe these 2003-format XML files often appeared as .doc or .xls to trick Word and Excel into opening them without extra magic. I'd rather add only the minimum necessary bodges to LibreOffice to trick us into opening them as well. All things can have unexpected consequences :-)