Tools -> Import Formula -> leave the file filter set to "All files (*.*)" -> select a .mml file -> Open. Expected: the file should be imported. Actual result: nothing happens, formula is not imported. If the filter is explicitly set to "MathML 1.01 (*.mml)", then the export completes successfully.
Reproducible in 4.1.0 alpha version in Linux too. So, chaging to platform to All
Marcos Paulo de Souza committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=ef4872f6c6a49878ea2fc6b3a4ea3cd9ad52de33 Fixes fdo#59642 The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
It's fixed in master! Thanks for the report!
Marcos Paulo de Souza committed a patch related to this issue. It has been pushed to "libreoffice-4-1": http://cgit.freedesktop.org/libreoffice/core/commit/?id=923b05ac37355886e7a20fdcb1ad9dcb198f7570&h=libreoffice-4-1 Fixes fdo#59642 It will be available in LibreOffice 4.1. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Hello Marcos, Thank you for looking into this. Sorry to say this, but I tried the 4.1.0.1 and the problem is still there... :( Maybe the fox will be in a second RC?
Hi Mike, Is strange because the fix was applied to rc1, as described here: https://wiki.documentfoundation.org/Releases/4.1.0/RC1 But, can you test the rc2 please? (when it's available)
I looked into the commits, and it seems that they just cannot fix this issue: The fix covers yet another file marker: "<math:math ". But the problem is already present with files having "old" markers, like, say, Attachment 73352 [details] from Bug #59641. That file contains plain old "<math ".
By the way: the signatures there could be followed by any whitespace character, not only SPACE: that could also be TAB, LF or CR (see http://www.w3.org/TR/2008/REC-xml-20081126/#NT-S)
Hi Mike! How, you're right =D There is one more option to fix the auto detection. I'll fix this, thanks a lot for looking at this!
Hi guys! I looked at that mml file. This file was a little different from the others: it have the BOM marker: www.w3.org/International/questions/qa-byte-order-mark Do you know if the LO has code to avoid this marker, or maybe handle this?
Hi marcos, unfortunately, I don't know this, but LO imports it correctly if the type is explicitly selected. I would vote for local solution (checking for the two possible BOMs before searching for markers), and explaining this in comments, so that it would be possible to search for string "BOM" or "Byte Order Mark", and a person doing unification could benefit from this.
(In reply to comment #10) > Do you know if the LO has code to avoid this marker, or maybe handle this? Here is what I found: http://docs.libreoffice.org/tools/html/classSvStream.html#ab1d78e3df7058ca99859cb43d3b227a6 sal_Bool SvStream::StartReadingUnicodeText(rtl_TextEncoding eReadBomCharSet) If eReadBomCharSet==RTL_TEXTENCODING_DONTKNOW: read 16bit, if 0xfeff do nothing (UTF-16), if 0xfffe switch endian swapping (UTF-16), if 0xefbb or 0xbbef read another byte and check for UTF-8. If no UTF-* BOM was detected put all read bytes back. This means that if 2 bytes were read it was an UTF-16 BOM, if 3 bytes were read it was an UTF-8 BOM. There is no UTF-7, UTF-32 or UTF-EBCDIC BOM detection! If eReadBomCharSet!=RTL_TEXTENCODING_DONTKNOW: only read a BOM of that encoding and switch endian swapping if UTF-16 and 0xfffe. === The pStrm is SvStream*, so everything seems as easy as adding this just before line 316 of smdetect.cxx http://cgit.freedesktop.org/libreoffice/core/tree/starmath/source/smdetect.cxx#n316 : pStrm->StartReadingUnicodeText(RTL_TEXTENCODING_DONTKNOW); sal_uLong nBytesRead = pStrm->Read( aBuffer, nSize );
And by the way, the testcase I mentioned in Comment 7 (Attachment 73352 [details]) have no <?xml?> in the beginning, so this will make checks in line 319 to fail: if (0 == strncmp( "<?xml",aBuffer,nSize)) According to [1], the XML declaration is optional, so the testcase is well-formed. So I suppose, that the checks should be relaxed. However, this is questionable, because it may give false positives. What do you think? [1] http://www.w3.org/TR/2008/REC-xml-20081126/#sec-prolog-dtd
Hi Mike! Thanks a lot for the comments! I added a commit in gerrit: https://gerrit.libreoffice.org/#/c/4742/ Now it check if we have the <?xml, and if not, it's good too =D And I avoided the BOM character, so I beleive it
(In reply to comment #14) > Hi Mike! > > Thanks a lot for the comments! > > I added a commit in gerrit: https://gerrit.libreoffice.org/#/c/4742/ > > Now it check if we have the <?xml, and if not, it's good too =D > > And I avoided the BOM character, so I beleive it it's fine now :) Thanks a lot for looking at this!
Marcos Paulo de Souza committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=05530423d3cff1391769192a62ae470500978ee6 Solve one more issue of fdo#59642 The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Thank you for working on it! Will test as soon as it is available.
Verified fixed in Version: 4.2.0.0.alpha0+ Build ID: 4374e5c80525cd1a9d9ab04714ccbf2543a912ce Thank you!
Thanks for verifying Mike! If you find more MathML files that Math can't handle, please open this bug again, or send me an email!