Bug 66276 - MathML export: avoid using combining characters for accents and diacritical marks
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Formula Editor
Version: 4.2.0.0.alpha0+ Master (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Frédéric Wang
URL:
Whiteboard: target:4.2.0
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-27 21:08 UTC by Frédéric Wang
Modified: 2013-07-02 09:03 UTC

See Also:
Crash report or crash signature:


Attachments
Sample output (6.61 KB, text/html)
2013-06-30 10:55 UTC, Frédéric Wang

Description Frédéric Wang 2013-06-27 21:08:57 UTC
As indicated in the MathML spec:

"In the UCS there are many combining characters that are intended to be used for the many accents of numerous different natural languages. Some of them may seem to provide markup needed for mathematical accents. They should not be used in mathematical markup. Superscript, subscript, underscript, and overscript constructions as just discussed above should be used for this purpose. Of course, combining characters may be used in multi-character identifiers as they are needed, or in text contexts."

LibreOffice should try to use the non-combining versions when possible. Some of the work is already done in bug 66024. BTW, "U+20D7 COMBINING RIGHT ARROW ABOVE" looks really ugly in Firefox so I wonder if it should be replaced by "U+2192 RIGHTWARDS ARROW", at least for MathML export.
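To illustrate the kind of substitution meant here, a minimal Python sketch follows. It is not the LibreOffice code, and the table is an assumption: it pairs a few combining characters with their usual spacing counterparts from the Unicode charts, and the U+20D7 to U+2192 entry is only the replacement suggested above, not a confirmed choice. The names SPACING_EQUIVALENT, to_non_combining and mover are hypothetical.

```python
import unicodedata

# Hypothetical table: combining accent -> spacing (non-combining) counterpart.
# Illustrative only; this is not the table used by the LibreOffice exporter.
SPACING_EQUIVALENT = {
    "\u0300": "\u0060",  # COMBINING GRAVE ACCENT -> GRAVE ACCENT
    "\u0301": "\u00b4",  # COMBINING ACUTE ACCENT -> ACUTE ACCENT
    "\u0302": "\u005e",  # COMBINING CIRCUMFLEX ACCENT -> CIRCUMFLEX ACCENT
    "\u0303": "\u007e",  # COMBINING TILDE -> TILDE
    "\u0304": "\u00af",  # COMBINING MACRON -> MACRON
    "\u0306": "\u02d8",  # COMBINING BREVE -> BREVE
    "\u0307": "\u02d9",  # COMBINING DOT ABOVE -> DOT ABOVE
    "\u0308": "\u00a8",  # COMBINING DIAERESIS -> DIAERESIS
    "\u030a": "\u02da",  # COMBINING RING ABOVE -> RING ABOVE
    "\u030c": "\u02c7",  # COMBINING CARON -> CARON
    "\u20d7": "\u2192",  # COMBINING RIGHT ARROW ABOVE -> RIGHTWARDS ARROW (suggested above)
}


def to_non_combining(ch: str) -> str:
    """Return the spacing counterpart of ch, or ch itself if none is listed."""
    return SPACING_EQUIVALENT.get(ch, ch)


def mover(base: str, accent: str) -> str:
    """Build an overscript construction whose <mo> holds a non-combining accent."""
    accent = to_non_combining(accent)
    return f'<mover accent="true"><mi>{base}</mi><mo>{accent}</mo></mover>'


if __name__ == "__main__":
    # A vector arrow over "v", written with U+2192 rather than U+20D7.
    print(mover("v", "\u20d7"))
```

The accent is placed by the mover element itself, as the spec passage recommends, so the character inside mo does not need to combine with anything.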
Comment 1 Frédéric Wang 2013-06-27 21:55:32 UTC
Mass changes to assign bugs to myself.
Comment 2 Frédéric Wang 2013-06-30 10:55:23 UTC
Created attachment 81733
Sample output

I've submitted a patch for review:

https://gerrit.libreoffice.org/#/c/4630/

I attach a testcase comparing the old and new output. The rendered difference is subtle (in a text editor such as gedit you can see that the accents no longer combine with the preceding ">" character). In Firefox, some diacritical marks are now better centered.
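For anyone who wants to check exported MathML without relying on how an editor renders it, here is a rough Python sketch. It lists the mo elements whose text still contains a combining character. The function name combining_operators and the two sample strings are made up for illustration and are not taken from the attachment.

```python
import unicodedata
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def combining_operators(mathml: str) -> list[str]:
    """Return the text of every <mo> that contains a combining character."""
    root = ET.fromstring(mathml)
    found = []
    for mo in root.iter(f"{{{MATHML_NS}}}mo"):
        text = (mo.text or "").strip()
        # unicodedata.combining() is non-zero for combining marks.
        if any(unicodedata.combining(c) for c in text):
            found.append(text)
    return found


# Made-up samples: the same accent written with a combining and a spacing character.
OLD = ('<math xmlns="http://www.w3.org/1998/Math/MathML">'
       '<mover accent="true"><mi>a</mi><mo>\u0301</mo></mover></math>')
NEW = ('<math xmlns="http://www.w3.org/1998/Math/MathML">'
       '<mover accent="true"><mi>a</mi><mo>\u00b4</mo></mover></math>')

if __name__ == "__main__":
    print(len(combining_operators(OLD)), len(combining_operators(NEW)))
```

The check relies on the Unicode canonical combining class, so it flags U+20D7 as well as the accents in the U+0300 block.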
Comment 3 Commit Notification 2013-07-02 07:46:22 UTC
Frederic Wang committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=fbc9c18875d1e86c9b3d7d5c13e1db13af23e3f0

 fdo#66276 - MathML export: avoid using combining characters.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 4 Frédéric Wang 2013-07-02 09:03:14 UTC
I'm closing this, although "U+20D7 COMBINING RIGHT ARROW ABOVE" is still used instead of "U+2192 RIGHTWARDS ARROW" and I'm still not sure what would be the best way to deal with that. Another bug can be opened later if necessary.