Bug 66081 - Improve grouping of binary operators
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Formula Editor
Version (earliest affected): 4.2.0.0.alpha0+ Master
Hardware: All All
Importance: medium normal
Assignee: Frédéric Wang
Whiteboard: target:4.2.0
Reported: 2013-06-23 13:57 UTC by Frédéric Wang
Modified: 2013-06-28 10:55 UTC

Description Frédéric Wang 2013-06-23 13:57:15 UTC
This has been reported on the MathJax mailing list:
https://groups.google.com/forum/#!msg/mathjax-users/_naUsEP0Rxs/OH33AWIc5akJ

If you type e.g.

"a + 2 b"

the grouping in the parsed tree will be

"{a + 2} b"

rather than 

"a + {2 b}"

Note that "a + 2 * b" will produce the correct "a + {2 * b}".

Also, binary operations like

a*b*c + d*e*f + g*h*i

are grouped like this

{ { {{a*b}*c} + {{d*e}*f} } + {{g*h}*i} }

while operators of the same priority could just be grouped together like this:

{ {a*b*c} + {d*e*f} + {g*h*i} }

This is not really visible in the editor, but the MathML export contains many nested <mrow> elements.

I think the solution to the first problem is to interpret "2 b" as a Product (using the terminology of the grammar in http://cgit.freedesktop.org/libreoffice/core/tree/starmath/source/parse.cxx#n1170), so that it has higher priority than Sum. If we fix bug 55853 without breaking https://issues.apache.org/ooo/show_bug.cgi?id=11752, then "2b" (without space) should have higher priority than SubSup. I guess this can be done by grouping variable and number tokens in one Term here: http://cgit.freedesktop.org/libreoffice/core/tree/starmath/source/parse.cxx#n1420.

The second problem is less serious (the operator priorities are respected), so it could just be fixed in mathmlexport.cxx, where binary operators of the same priority would be grouped in one <mrow>. We could use the priority from the MathML operator dictionary (http://www.w3.org/TR/MathML3/appendixc.html#oper-dict.entries-table), but the simplest approach is to use the nGroup property of the token associated with the binary op node.
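
To illustrate the flattening idea for the second problem, here is a minimal Python sketch. It is not the starmath code: BinOp, GROUP, flatten, to_rows and render are hypothetical names, and the GROUP table only stands in for the nGroup property of the operator token. Each nested list in the result plays the role of one <mrow>.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical priority groups, standing in for the token's nGroup property.
GROUP = {"+": "sum", "-": "sum", "*": "product", "/": "product"}


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[str, BinOp]


def flatten(node, group):
    """Collect the operands and operators of a left-nested chain in one group."""
    if isinstance(node, BinOp) and GROUP[node.op] == group:
        # Only the left child is merged into the current row; a right child
        # keeps its own row so that "a - {b - c}" is not changed in meaning.
        return flatten(node.left, group) + [node.op, to_rows(node.right)]
    return [to_rows(node)]


def to_rows(node):
    """Return nested lists in which every list stands for one <mrow>."""
    if isinstance(node, str):
        return node
    return flatten(node, GROUP[node.op])


def render(rows):
    """Render the nested lists with the brace notation used in this report."""
    if isinstance(rows, str):
        return rows
    return "{" + " ".join(render(r) for r in rows) + "}"


def mul3(a, b, c):
    """Build the left-nested tree {{a*b}*c}, as the parser currently does."""
    return BinOp("*", BinOp("*", a, b), c)


tree = BinOp("+", BinOp("+", mul3("a", "b", "c"), mul3("d", "e", "f")),
             mul3("g", "h", "i"))
print(render(to_rows(tree)))  # {{a * b * c} + {d * e * f} + {g * h * i}}
```

Only left-nested chains are merged here, which keeps the sketch safe for non-associative operators; whether the real exporter needs the same restriction is not something this sketch settles.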
Comment 1 Jorendc 2013-06-24 23:00:12 UTC
Makes sense to me. And since you appear to be an expert in this domain -> NEW right away.

Kind regards,
Joren
Comment 2 Frédéric Wang 2013-06-25 16:52:29 UTC
It turns out that the first problem is a bit more complicated than I thought. At the moment the grammar is basically:

An Expression is a list of Relations
A Relation is a list of Sums separated by relation operators
A Sum is a list of Products separated by sum-like operators
A Product is a list of Powers separated by product-like operators
A Power is a base and various sub/superscript Terms attached with _, ^ etc.
A Term is a basic token (identifier, number...) or one of various other commands.

First, you want to consider implicit products. So for example one would also like

1 + 3 x^2

to be interpreted as 1 + {3 x^2} rather than {1 + 3} x^2 (as is currently the case). So a natural method could be to redefine

Sum as a list of ImplicitProducts separated by sum-like operators
ImplicitProduct as a list of Products

I tried that, but Binom is defined as a pair of two Sums, so for example "binom a b" will be interpreted as "binom {a b}" with the new syntax. I tried changing Binom to a pair of two Products, but that caused other test failures.

Probably, allowing implicit products will lead to an ambiguous grammar (if it is not already) and a more clever parser would be necessary...
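
For readers who want to experiment with the two grammars, here is a rough Python sketch of a recursive-descent parser. It is not the starmath parser: the Parser class, the parse function and the implicit_products flag are made up for illustration, relations and binom are left out, and a Power is kept as a single string such as x^2 to keep the output short.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[A-Za-z]+|[-+*^])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def group(parts):
    """A single part needs no braces; several parts form one group."""
    return parts[0] if len(parts) == 1 else parts


def render(node, top=True):
    if isinstance(node, str):
        return node
    inner = " ".join(render(child, top=False) for child in node)
    return inner if top else "{" + inner + "}"


class Parser:
    def __init__(self, text, implicit_products):
        self.toks = tokenize(text)
        self.pos = 0
        self.implicit = implicit_products

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def starts_term(self):
        tok = self.peek()
        return tok is not None and tok[0].isalnum()

    def expression(self):
        # Expression: a list of Sums (relation operators are omitted here).
        items = [self.sum()]
        while self.starts_term():
            items.append(self.sum())
        if self.peek() is not None:
            raise ValueError(f"unexpected token {self.peek()!r}")
        return group(items)

    def sum(self):
        # Sum: operands separated by sum-like operators. The operand is an
        # ImplicitProduct in the proposed grammar, a Product in the current one.
        operand = self.implicit_product if self.implicit else self.product
        parts = [operand()]
        while self.peek() in ("+", "-"):
            parts += [self.next(), operand()]
        return group(parts)

    def implicit_product(self):
        # ImplicitProduct: a list of juxtaposed Products.
        parts = [self.product()]
        while self.starts_term():
            parts.append(self.product())
        return group(parts)

    def product(self):
        parts = [self.power()]
        while self.peek() == "*":
            parts += [self.next(), self.power()]
        return group(parts)

    def power(self):
        base = self.term()
        if self.peek() == "^":
            self.next()
            return base + "^" + self.term()
        return base

    def term(self):
        tok = self.next()
        if tok is None or not tok[0].isalnum():
            raise ValueError(f"expected a number or identifier, got {tok!r}")
        return tok


def parse(text, implicit_products):
    return render(Parser(text, implicit_products).expression())


print(parse("a + 2 b", implicit_products=False))   # {a + 2} b
print(parse("a + 2 b", implicit_products=True))    # a + {2 b}
print(parse("1 + 3 x^2", implicit_products=True))  # 1 + {3 x^2}
```

The sketch works only because nothing in this tiny grammar takes two juxtaposed arguments. As soon as a command like binom consumes a pair of operands, the juxtaposition in "binom a b" competes with the implicit product, which is the conflict described above.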
Comment 3 Frédéric Wang 2013-06-25 20:38:23 UTC
I've submitted a patch for the second problem:
https://gerrit.libreoffice.org/#/c/4520
Comment 4 Frédéric Wang 2013-06-26 09:30:37 UTC
I've opened bug 66200 for the first problem. I gave up fixing it in the short term as I suspect it would require too much work for only small benefits (the bad parsing of products does not really seem to affect the rendering or navigation in Math but is visible in the MathML output).
Comment 5 Frédéric Wang 2013-06-27 21:57:04 UTC
Marking these bugs assigned since I've already taken them.
Comment 6 Commit Notification 2013-06-28 09:58:46 UTC
Frederic Wang committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4f294a90877d2f91bb88c7d6cd5b74e8e546a025

 fdo#66081 - reduce the number of nested <mrow>'s in MathML



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 7 Frédéric Wang 2013-06-28 10:29:06 UTC
I guess this can be closed since the second problem should be fixed and I've opened bug 66200 for the first problem.

Testcase: try exporting

"a*b*c + d*e*f + g*h*i = 
a*b*c + d*e*f + g*h*i = 
a*b*c + d*e*f + g*h*i"

to MathML (.mml) to see the difference.
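
One possible way to compare exports is to count the <mrow> elements and their maximum nesting depth. The Python sketch below assumes nothing about the exporter; mrow_stats is a hypothetical helper, and the two MathML samples are hand-written to show the shape of the difference, not actual LibreOffice output.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the "{namespace}" prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def mrow_stats(mathml):
    """Return (number of <mrow> elements, deepest <mrow> nesting level)."""

    def walk(elem, depth):
        count = 0
        if local_name(elem.tag) == "mrow":
            count = 1
            depth += 1
        deepest = depth
        for child in elem:
            child_count, child_deepest = walk(child, depth)
            count += child_count
            deepest = max(deepest, child_deepest)
        return count, deepest

    return walk(ET.fromstring(mathml), 0)


# Hand-written samples for a*b*c, nested as {{a*b}*c} and flat as {a*b*c}.
NESTED = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow>'
    "<mrow><mi>a</mi><mo>*</mo><mi>b</mi></mrow><mo>*</mo><mi>c</mi>"
    "</mrow></math>"
)
FLAT = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow>'
    "<mi>a</mi><mo>*</mo><mi>b</mi><mo>*</mo><mi>c</mi>"
    "</mrow></math>"
)

print(mrow_stats(NESTED))  # (2, 2)
print(mrow_stats(FLAT))    # (1, 1)
```

For an exported file, the contents can be read as bytes and passed in directly, for example mrow_stats(open("formula.mml", "rb").read()), where formula.mml is a placeholder file name.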
Comment 8 Jorendc 2013-06-28 10:55:49 UTC
(In reply to comment #7)
> I guess this can be closed since the second problem should be fixed and I've
> opened bug 66200 for the first problem.

Okay, wasn't sure about this one :). -> RESOLVED FIXED

_o_ thank you :)