Bug 57886 - formula mangled when FILEOPEN particular .RTF
Summary: formula mangled when FILEOPEN particular .RTF
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.0.0.0.alpha0+ Master (earliest affected)
Hardware: Other Windows (All)
Importance: medium major
Assignee: Miklos Vajna
URL:
Whiteboard: target:4.0.0.0.beta2 target:4.1.0
Keywords: filter:rtf, regression
Depends on:
Blocks:
 
Reported: 2012-12-04 17:14 UTC by s-joyemusequna
Modified: 2015-12-17 12:08 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
binary chopped down document... (67.70 KB, application/rtf)
2012-12-06 20:02 UTC, Michael Meeks
an even more minimal document (61.76 KB, application/rtf)
2012-12-06 20:37 UTC, Michael Meeks

Description s-joyemusequna 2012-12-04 17:14:58 UTC
Problem description: 

LOdev 4.0 hangs when loading the file rtf_spec.rtf (from Bug 44736, comment 2, attachment 56109).

Tested with Version 4.0.0.0.alpha1+ (Build ID: ac4d26e3fc2728ee80f33a485540d50b48927dd), 2012.12.03 under Windows XP and Vista 64. 

It works with LibO 3.6.3.

Steps to reproduce:
1. Open the file.
2. LibO hangs.

Current behavior: LibO hangs.

Expected behavior: LibO opens the file.
Comment 1 Rainer Bielefeld Retired 2012-12-05 06:24:45 UTC
More or less [Reproducible] with a parallel installation of "LOdev 4.0.0.0.alpha1 - ENGLISH UI / German Locale [Build ID: dec8fe)]" {tinderbox: @6, pull time 2012-11-13 06:07:28} on German WIN7 Home Premium (64bit) with a separate /4 User Profile for the Master Branch.

I tried to open the document from the LibO File dialog; at 80% on the progress bar the mouse cursor changed to "busy", and there was no further progress for 10 minutes.

Opening the document with "LibreOffice 3.6.4.3 rc" German UI/ German Locale [Build-ID: 2ef5aff] {pull date 2012-11-28} on German WIN7 Home Premium (64bit) was also hard work (it took 2 minutes or so), but the document did open.

4.0 has no problem opening an .odt created with 3.6.4.3 from the .rtf document.


@Miklós:
Please set Status to ASSIGNED and add yourself to "Assigned To" if you accept this Bug, or forward the Bug if it's not your turf.
Comment 2 Julien Nabet 2012-12-05 20:10:26 UTC
Just for info, I reproduced this problem (I didn't wait for 10 minutes, however) and noticed log messages of this kind:
warn:legacy.osl:22781:1:/home/julien/compile-libreoffice/libo/sal/rtl/source/ustring.cxx:590: rtl_string2UString_status() - Wrong TextEncoding
warn:writerfilter:22781:1:/home/julien/compile-libreoffice/libo/writerfilter/source/dmapper/GraphicImport.cxx:1550: failed. Message :GraphicCrop
warn:legacy.osl:22781:1:/home/julien/compile-libreoffice/libo/writerfilter/source/dmapper/DomainMapper_Impl.cxx:3149: Exception in CloseFieldCommand()
warn:writerfilter:22781:1:/home/julien/compile-libreoffice/libo/writerfilter/source/rtftok/rtfdocumentimpl.cxx:144: trying to set property when no type is defined

(pc Debian x86-64 with master sources updated today, commit 9f417544f83fb5645abd7b74382bede2246c73b8)
Comment 3 Michael Meeks 2012-12-06 16:28:25 UTC
Great - the way to debug a hang like this is to manually abort it when it is hung; get a backtrace in gdb - and then run 'finish' until the method fails to finish; this method (somehow) is the culprit ;-)

That looks a bit like this:

(gdb) bt
#0  0xad49dc71 in size (this=0xa1554d8) at /usr/include/c++/4.6/bits/stl_vector.h:571
#1  oox::formulaimport::XmlStream::currentToken (this=0xa1554d8) at /data/opt/libreoffice/master/oox/source/mathml/importutils.cxx:205
#2  0xab77ae46 in SmOoxmlImport::readOMathArg (this=0xbfffca74) at /data/opt/libreoffice/master/starmath/source/ooxmlimport.cxx:93
#3  0xab77d601 in SmOoxmlImport::handleStream (this=0xbfffca74) at /data/opt/libreoffice/master/starmath/source/ooxmlimport.cxx:72
#4  0xab77d6d7 in SmOoxmlImport::ConvertToStarMath (this=0xbfffca74) at /data/opt/libreoffice/master/starmath/source/ooxmlimport.cxx:57
#5  0xab756f1a in SmDocShell::readFormulaOoxml (this=0xadb67a0, stream=...) at /data/opt/libreoffice/master/starmath/source/document.cxx:1005
#6  0xab78a7ff in SmModel::readFormulaOoxml (this=0xaccb760, stream=...) at /data/opt/libreoffice/master/starmath/source/unomodel.cxx:1139
#7  0xad890c25 in writerfilter::rtftok::RTFDocumentImpl::popState (this=0xa155010)
    at /data/opt/libreoffice/master/writerfilter/source/rtftok/rtfdocumentimpl.cxx:3887
#8  0xad8a0d4e in writerfilter::rtftok::RTFTokenizer::resolveParse (this=0xa1743a0)
    at /data/opt/libreoffice/master/writerfilter/source/rtftok/rtftokenizer.cxx:124
#9  0xad87984f in writerfilter::rtftok::RTFDocumentImpl::resolve (this=0xa155010, rMapper=...)
    at /data/opt/libreoffice/master/writerfilter/source/rtftok/rtfdocumentimpl.cxx:603
#10 0xad91e5c0 in RtfFilter::filter (this=0xa12d348, aDescriptor=uno::Sequence of length 12 = {...})
    at /data/opt/libreoffice/master/writerfilter/source/filter/RtfFilter.cxx:115
#11 0xb77b7ff4 in SfxObjectShell::ImportFrom (this=0xa0b08b0, rMedium=..., bInsert=false)
    at /data/opt/libreoffice/master/sfx2/source/doc/objstor.cxx:2223

(gdb) finish
Run till exit from #0  0xad49dc71 in size (this=0xa1554d8) at /usr/include/c++/4.6/bits/stl_vector.h:571
205	    if( pos >= tags.size())
(gdb) finish
Run till exit from #0  oox::formulaimport::XmlStream::currentToken (this=0xa1554d8)
    at /data/opt/libreoffice/master/oox/source/mathml/importutils.cxx:205
0xab77ae46 in SmOoxmlImport::readOMathArg (this=0xbfffca74) at /data/opt/libreoffice/master/starmath/source/ooxmlimport.cxx:93
93	    while( !stream.atEnd() && stream.currentToken() != CLOSING( stream.currentToken()))
Value returned is $2 = 1074924649
(gdb) finish
Run till exit from #0  0xab77ae46 in SmOoxmlImport::readOMathArg (this=0xbfffca74)
    at /data/opt/libreoffice/master/starmath/source/ooxmlimport.cxx:93
SmOoxmlImport::handleStream (this=0xbfffca74) at /data/opt/libreoffice/master/starmath/source/ooxmlimport.cxx:73
73	        if( item.isEmpty())
Value returned is $3 = ""
(gdb) finish
Run till exit from #0  SmOoxmlImport::handleStream (this=0xbfffca74) at /data/opt/libreoffice/master/starmath/source/ooxmlimport.cxx:73
...

This never completes: so I assume we simply don't make progress along the stream inside this loop: investigating.
Comment 4 Michael Meeks 2012-12-06 19:26:42 UTC
Looks like the token it hangs on is:

    // 40120C69 -> c69 == 3177
    // const sal_Int32 XML_mPr = 3177;

    while( !stream.atEnd() && stream.currentToken() != CLOSING( stream.currentToken()))

Poking at tokens.hxx: so some \mm or \mmPr must be behaving oddly, I suppose.
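
As a rough illustration of the arithmetic in the comment above: the value gdb printed for currentToken() (1074924649) is 0x40120C69 in hex. The sketch below splits it under an assumed bit layout, with a closing flag at 0x40000000 and the token id in the low 16 bits. The constant and function names are hypothetical, and the layout is inferred from this one value rather than copied from the oox headers.

```cpp
#include <cstdint>

// ASSUMPTION: bit layout inferred from the single value seen in gdb,
// not taken from the real oox headers. All names here are hypothetical.
constexpr std::int32_t kClosingFlag = 0x40000000; // assumed "closing tag" bit
constexpr std::int32_t kTokenMask   = 0x0000FFFF; // assumed token-id bits

// True if the assumed closing bit is set in the raw value.
constexpr bool isClosing(std::int32_t raw)
{
    return (raw & kClosingFlag) != 0;
}

// Extracts the low 16 bits, which the comment above reads as the token id.
constexpr std::int32_t tokenId(std::int32_t raw)
{
    return raw & kTokenMask;
}

// 1074924649 == 0x40120C69; the low 16 bits are 0xC69 == 3177 (XML_mPr).
static_assert(tokenId(1074924649) == 3177, "low 16 bits should be 3177");
static_assert(isClosing(1074924649), "assumed closing bit should be set");
```

Under that assumption the stream is sitting on a closing mPr tag, which matches the loop condition refusing to advance past it.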
Comment 5 Michael Meeks 2012-12-06 20:02:39 UTC
Created attachment 71095
binary chopped down document...

In such cases it really helps to have a much much smaller document to work with: here is just such a document - it contains a single formula and some text which causes the hang.
Comment 6 Michael Meeks 2012-12-06 20:05:14 UTC
--- a/starmath/source/ooxmlimport.cxx
+++ b/starmath/source/ooxmlimport.cxx
@@ -90,11 +90,22 @@ OUString SmOoxmlImport::handleStream()
 OUString SmOoxmlImport::readOMathArg()
 {
     OUString ret;
+    // 40120C69 -> c69 == 3177
+    // const sal_Int32 XML_mPr = 3177;
+
+    if( !stream.atEnd() && stream.currentToken() == CLOSING( stream.currentToken()))
+    {
+        fprintf (stderr, "BUG BUG BUG - WBUG !\n");
+        if (getenv ("WORKAROUND"))
+            stream.handleUnexpectedTag();
+    }
+
     while( !stream.atEnd() && stream.currentToken() != CLOSING( stream.currentToken()))
     {
         if( !ret.isEmpty())

With this - we at least continue to make progress through the document when we hit an unexpected CLOSING that is not a close oMath from handleStream :-)
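
To make the failure mode concrete, here is a toy model of the hang, not the real oox/starmath code. It assumes an inner reader that stops at any closing tag without consuming it, and an outer loop that only exits on one specific closing tag. ToyStream, Tag, readArg and handleStream are hypothetical names, and the iteration limit exists only so the sketch terminates instead of hanging.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified token model; NOT the oox::formulaimport API.
enum class Kind { Open, Close };
struct Tag { Kind kind; int id; };

class ToyStream {
public:
    explicit ToyStream(std::vector<Tag> tags) : tags_(std::move(tags)) {}
    bool atEnd() const { return pos_ >= tags_.size(); }
    const Tag& current() const { assert(!atEnd()); return tags_[pos_]; }
    void moveToNext() { if (!atEnd()) ++pos_; }
private:
    std::vector<Tag> tags_;
    std::size_t pos_ = 0;
};

// Inner loop: consumes tags until it meets a closing tag, which it leaves
// in place for the caller.
void readArg(ToyStream& s)
{
    while (!s.atEnd() && s.current().kind != Kind::Close)
        s.moveToNext();
}

// Outer loop: exits only at the closing tag with id `endId`.
// Returns the number of iterations, or -1 once `limit` is exceeded,
// which stands in for "hangs forever".
int handleStream(ToyStream& s, int endId, bool skipUnexpectedClose,
                 int limit = 1000)
{
    int iterations = 0;
    while (!s.atEnd()
           && !(s.current().kind == Kind::Close && s.current().id == endId))
    {
        if (++iterations > limit)
            return -1; // no progress is being made
        if (skipUnexpectedClose && s.current().kind == Kind::Close)
        {
            s.moveToNext(); // workaround: step over the stray closing tag
            continue;
        }
        readArg(s);
    }
    return iterations;
}
```

With the tags Open 1, Open 2, Close 3, Close 1 and endId 1, the version without the workaround gets stuck on the stray Close 3, because readArg returns immediately each time; with skipUnexpectedClose set, the outer loop steps over it and reaches Close 1.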
Comment 7 Michael Meeks 2012-12-06 20:37:13 UTC
Created attachment 71096
an even more minimal document

With an in-line formula, with a single simple integral - ho hum.
Oddly it imports ~perfectly as .docx
Comment 8 Michael Meeks 2012-12-13 14:31:23 UTC
Miklos reports that we're generating OOXML tokens - so (in theory) it'd be possible to bust the parser like this anyway. The hang is in the oox formula parser.

Any chance of a quick look Lubos ? :-)
Comment 9 Not Assigned 2012-12-18 14:51:15 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c237369bc2f4a1931e241c8f6efd7c2854ee657b&g=libreoffice-4-0

avoid infinite loop when parsing malformed ooxml math (fdo#57886)


It will be available in LibreOffice 4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 10 Not Assigned 2012-12-18 14:51:35 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8e5fe8dc9229043d18f581f2c56d58daff2bf87a

avoid infinite loop when parsing malformed ooxml math (fdo#57886)
Comment 11 Luboš Luňák 2012-12-18 14:52:52 UTC
I've avoided the infinite loop, but it appears that the rtf import converts the math stuff incorrectly for the ooxml parser, so reassigning to Miklos for checking that.
Comment 12 Michael Meeks 2012-12-20 19:14:15 UTC
re-titling, no longer a crash (thanks Lubos) :-)
Comment 13 Not Assigned 2012-12-22 19:01:03 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=71061656d459abecfe55e8725900d699174325df

fdo#57886 fix import of RTF_MLIMLOC
Comment 14 Not Assigned 2012-12-22 19:22:08 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8313aae33b689486fde276af8ab065f557dea74d&g=libreoffice-4-0

fdo#57886 fix import of RTF_MLIMLOC


It will be available in LibreOffice 4.0.
Comment 15 Miklos Vajna 2012-12-22 19:31:23 UTC
Fixed in master and -4-0, marking as resolved.
Comment 16 Robinson Tryon (qubit) 2015-12-17 12:08:55 UTC
Migrating Whiteboard tags to Keywords: (filter:rtf)
Replace rtf_filter -> filter:rtf.
[NinjaEdit]