Bug 33849 - smath does not handle accents in MathML
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Formula Editor
Version: 3.5.0 Beta2 (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Caolán McNamara
QA Contact:
Depends on:
Reported: 2011-02-02 12:08 CET by Joshua Cogliati
Modified: 2012-01-06 20:14 CET

MathML using accents. (345 bytes, application/mathml+xml)
2011-02-02 12:08 CET, Joshua Cogliati
Patch to fix most of the accent problems (1.19 KB, patch)
2011-12-30 09:11 CET, Joshua Cogliati
Mathml using accents (1.14 KB, application/mathml+xml)
2012-01-03 05:25 CET, Joshua Cogliati
Patch to fix all simple accent problems (2.08 KB, patch)
2012-01-04 20:33 CET, Joshua Cogliati

Description Joshua Cogliati 2011-02-02 12:08:09 CET
Created attachment 42865 [details]
MathML using accents.

Description of problem:
LibreOffice smath does not handle accents in MathML.  LibreOffice relies on
an annotation to be able to display and edit accents.  If this is stripped away
(as Microsoft Office 2007 does) the formulas are not properly displayed.

Version-Release number of selected component (if applicable):

How reproducible:
Every time.

Steps to Reproduce:
1. go into smath
2. create a formula "hat a vec b"
3. Save as MathML
4. Remove the <math:annotation math:encoding="StarMath 5.0">hat a vec
b</math:annotation> in the file.
5. Open with smath.
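
Step 4 can also be scripted instead of done by hand. The following is only a sketch under assumptions: stripAnnotation is a hypothetical helper name (nothing in LibreOffice), it assumes the saved file uses the math: namespace prefix shown in step 4, and it uses a regular expression where a real tool would more safely use an XML parser.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of LibreOffice: removes every
// <math:annotation ...>...</math:annotation> element from MathML text.
std::string stripAnnotation(const std::string& mathml)
{
    // [\s\S] matches any character including newlines, because the
    // annotation text may wrap across lines; *? keeps the match non-greedy
    // so that only one element is consumed per match.
    static const std::regex annotation(
        R"re(<math:annotation(\s[^>]*)?>[\s\S]*?</math:annotation>)re");
    return std::regex_replace(mathml, annotation, "");
}
```

Passing the contents of the saved file through stripAnnotation and writing the result back gives the file to open in step 5.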

Actual results:

Expected results:
the result of hat a vec b

Additional info:
This is a bug in LibreOffice, since according to the ODF spec it should be able
to read in a MathML file.  For reference, Firefox displays the attached MathML
file correctly (so LibreOffice writes out a correct MathML file).  It looks like
LibreOffice needs to recognize when there is an accent, and properly deal with it.

This is also a bug in OpenOffice: 
Comment 1 Laurent BP 2011-08-24 12:36:24 CEST
Reproduced with LibO 3.4.2 on WinXP
Comment 2 Björn Michaelsen 2011-12-23 11:48:08 CET
[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:

more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
Comment 3 Laurent BP 2011-12-26 12:54:23 CET

I can confirm bug on LibO 3.5b2 on WinXP.
Comment 4 Joshua Cogliati 2011-12-30 09:11:19 CET
Created attachment 54976 [details]
Patch to fix most of the accent problems

Adds the needed cases to CreateTextFromNode to fix accent related problems with importing from MathML such as vectors and hat accents.
Comment 5 Joshua Cogliati 2012-01-03 05:25:43 CET
Created attachment 55079 [details]
Mathml using accents

Attachment that uses all the regular sized accents.
Comment 6 Joshua Cogliati 2012-01-03 05:30:12 CET
Right now, the patch adds support for all LibreOffice regular sized accents except for dddot, which LibreOffice outputs as 0x20DB, but expects as 0xE09B.  0xE09B is in the private use area of Unicode.  I think this could be fixed by changing the patch to add something like:
            case 0xE09B:
+           case 0x20DB:
                APPEND(rText,"dddot ");
Comment 7 Joshua Cogliati 2012-01-04 20:33:14 CET
Created attachment 55147 [details]
Patch to fix all simple accent problems

Adds the needed cases to CreateTextFromNode to fix accent related problems with
importing from mathml such as vectors and hat accents.

Compared to the previous patch, this fixes dddot.  This patch also adds comments that tell which Unicode character is being matched.

Now all the following can be read from a MathML file:
acute a grave b check c breve d circle e vec f tilde g hat h bar i dot j ddot k 
dddot l 

I believe this patch is done, and I request that it be considered for inclusion in 3.5.
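
For illustration only, the kind of mapping involved can be sketched as below. This is not the patch itself: accentCommand is a hypothetical function, and except for 0x20DB and 0xE09B (taken from comment 6) the code points are an assumption, namely the standard Unicode combining marks for each accent. The attached patch may match different characters.

```cpp
#include <string>

// Hypothetical helper, not the LibreOffice source: returns the StarMath
// accent command for a MathML accent character, or "" if unrecognized.
// Code points other than 0x20DB/0xE09B are assumed, not taken from the patch.
std::string accentCommand(char32_t c)
{
    switch (c)
    {
    case 0x0301: return "acute ";  // COMBINING ACUTE ACCENT
    case 0x0300: return "grave ";  // COMBINING GRAVE ACCENT
    case 0x030C: return "check ";  // COMBINING CARON
    case 0x0306: return "breve ";  // COMBINING BREVE
    case 0x030A: return "circle "; // COMBINING RING ABOVE
    case 0x20D7: return "vec ";    // COMBINING RIGHT ARROW ABOVE
    case 0x0303: return "tilde ";  // COMBINING TILDE
    case 0x0302: return "hat ";    // COMBINING CIRCUMFLEX ACCENT
    case 0x0304: return "bar ";    // COMBINING MACRON
    case 0x0307: return "dot ";    // COMBINING DOT ABOVE
    case 0x0308: return "ddot ";   // COMBINING DIAERESIS
    case 0xE09B:                   // private use value expected on import
    case 0x20DB: return "dddot ";  // COMBINING THREE DOTS ABOVE
    default:     return "";        // not an accent this sketch knows about
    }
}
```

Accepting both 0xE09B and 0x20DB in one case group mirrors the dddot change suggested in comment 6.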
Comment 8 Caolán McNamara 2012-01-06 03:01:16 CET
looks sane, committed as http://cgit.freedesktop.org/libreoffice/core/commit/?id=b90ac7d682fd65f75eff4225d871130c0ae9f185

caolanm->Joshua: can you add yourself to http://wiki.documentfoundation.org/Development/Developers and send to the list (like the examples there) a statement that the patch is under our preferred LGPLv3+/MPLv1.1

caolanm->llunak: you're closest to the import mathml in ooxml stuff, does this change help that out ? and/or worth cherry-picking for 3-5 ?
Comment 9 Lubos Lunak 2012-01-06 03:38:44 CET
This fix is irrelevant for ooxml, I made it do the necessary conversions already on import. I guess this makes sense, also for 3-5.

Comparing it to my import code, I see there are widehat and widevec missing in this list, and there are symbolic constants for the hex values. I'll change this and push to 3-5 too.
Comment 10 Lubos Lunak 2012-01-06 03:46:04 CET
Ignore the second paragraph, I got it backwards. I'll just backport as it is and use the symbolic constants in master.
Comment 11 Joshua Cogliati 2012-01-06 20:14:48 CET
Sounds good.  FYI the patch I wrote can be used with LGPLv3+/MPLv1.1.