Bug 75930 - IMPORT MathML: some characters are missing
Status: ASSIGNED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Formula Editor
Version (earliest affected): 4.2.0.0.beta1
Hardware: Other All
Importance: medium normal
Assignee: dante19031999
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: MathML
Reported: 2014-03-09 01:06 UTC by Mike Kaganski
Modified: 2020-11-13 06:25 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
ZIP with problematic MML and screenshot (46.60 KB, application/zip)
2014-03-09 01:06 UTC, Mike Kaganski
Actual encoding (LO 4.2.2.1) (1.76 KB, application/mathml+xml)
2014-03-10 11:07 UTC, Jacques Guilleron

Description Mike Kaganski 2014-03-09 01:06:54 UTC
Created attachment 95380
ZIP with problematic MML and screenshot

Importing the MML file from the attachment skips some characters.
The skipped characters seem to be those that immediately precede numeric character references (&#xXXXX;).
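
To make the suspected pattern easier to check against other files, here is a minimal sketch that lists every literal character directly followed by a numeric character reference. The function name, the regular expression, and the sample string are my own assumptions for illustration; the sample is not taken from the attached file.

```python
import re

# A literal character (not whitespace, markup, or the end of another
# reference) directly followed by a hex or decimal character reference.
NCR_AFTER_CHAR = re.compile(r"([^\s<>&;])(&#(?:[xX][0-9A-Fa-f]+|[0-9]+);)")


def chars_before_ncr(mathml_source: str):
    """Return (literal_char, reference) pairs found in the raw source."""
    return [(m.group(1), m.group(2)) for m in NCR_AFTER_CHAR.finditer(mathml_source)]


# Hypothetical sample, not the content of the attachment.
sample = "<mi>OM&#x20D7;</mi><mo>&#x2212;</mo><mn>2&#960;</mn>"
print(chars_before_ncr(sample))
```

Each reported pair is a place where, if the observation above is right, the literal character would be expected to go missing on import. A reference that stands alone in its element, such as the `<mo>` in the sample, is not reported.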

The attachment also contains a screenshot showing the current import result and the expected result, with the problematic places marked.

Tested with 4.2.0.0.beta1-4.2.2.1 under Win7x64, and 4.2.1.1 under Ubuntu 13.10 x64. Previous versions couldn't handle this file at all.

Marcos, adding you to CC as you suggested in bug 59642, because you are the expert in this area. Please excuse me if it's a wrong thing to do.
Comment 1 Jacques Guilleron 2014-03-10 11:05:09 UTC
Hi Mike,

Reproduced with LO 4.2.2.1 and LO 4.3.0.0.alpha0+
Build ID: 7122ef19847b26529ed1d5bad40df869e91a8495
TinderBox: Win-x86@39, Branch:master, Time: 2014-03-06_00:38:21
on Windows 7 Home Premium.
I also confirm that this file cannot be opened with LO 3.6.6.2.
For comparison, I am adding the actual encoding (LO 4.2.2.1) that displays correctly.

Set status to NEW.

Kind regards,

Jacques
Comment 2 Jacques Guilleron 2014-03-10 11:07:02 UTC
Created attachment 95498
Actual encoding (LO 4.2.2.1)
Comment 3 QA Administrators 2016-02-21 08:35:22 UTC Comment hidden (obsolete)
Comment 4 Mike Kaganski 2016-02-22 08:57:10 UTC
Still reproducible with 5.1.0.3
Comment 5 QA Administrators 2017-03-06 15:13:58 UTC Comment hidden (obsolete)
Comment 6 Mike Kaganski 2017-03-06 20:31:58 UTC
Still reproducible with 5.3.1.1.
Comment 7 QA Administrators 2018-03-07 03:41:16 UTC Comment hidden (obsolete)
Comment 8 Mike Kaganski 2018-03-07 04:22:43 UTC
Still present in Version: 6.0.2.1 (x64)
Build ID: f7f06a8f319e4b62f9bc5095aa112a65d2f3ac89
CPU threads: 4; OS: Windows 10.0; UI render: GL; 
Locale: ru-RU (ru_RU); Calc: CL
Comment 9 QA Administrators 2019-07-15 02:48:29 UTC Comment hidden (obsolete)
Comment 10 Julien Nabet 2020-08-01 14:37:28 UTC
On a Debian x86-64 PC with master sources updated today, I could open the second attachment (fa.mml).

As for the first one, which contains f.mml, it displays
OT instead of italic OM, but I don't know if that's OK.

Any update here with LO 6.4.5?
Comment 11 dante19031999 2020-11-12 22:52:17 UTC
The bug is real. LO has no &HEX; support for MathML. It only accepts the first Unicode character of the string.
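
For reference, a minimal sketch of what a conforming XML parser yields for a token that mixes a literal character with a numeric character reference. It uses Python's standard `xml.etree.ElementTree`, not LO's import code; the one-token document and the helper name are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-token document; not taken from the attachment.
SOURCE = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi>M&#x20D7;</mi>"
    "</math>"
)


def token_texts(mathml_source: str):
    """Return the decoded text of every element that has text content."""
    root = ET.fromstring(mathml_source)
    return [el.text for el in root.iter() if el.text]


texts = token_texts(SOURCE)
# Show each token as a list of code points.
print([[f"U+{ord(c):04X}" for c in t] for t in texts])
```

The literal M and the referenced U+20D7 both end up in the same token text, which is what the import would be expected to keep.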
Comment 12 Julien Nabet 2020-11-13 06:25:42 UTC
Dante: let's put this one to ASSIGNED since you assigned yourself.