Bug 38057 - FILESAVE - FILEOPEN - saving a large doc file in rtf-format results in a very big file which LibO can't even open
Status: CLOSED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.4.0 release (earliest affected)
Hardware: All All
Importance: medium major
Assignee: Miklos Vajna
URL:
Whiteboard:
Keywords:
Duplicates: 44157
Depends on:
Blocks:
 
Reported: 2011-06-07 19:01 UTC by Marius
Modified: 2012-03-19 05:36 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
test file packed with rar (657.27 KB, application/x-rar-compressed)
2011-06-07 19:14 UTC, Marius

Description Marius 2011-06-07 19:01:30 UTC
If you save the 3.4 MB file as rtf you obtain a 27 MB file (!!!) that cannot be opened. Furthermore, if you modify the doc file and try to save it, you'll obtain a doc file which is double in size.

The same operation in LibO 3.3.2 shows a 3.7 MB rtf file and a 3.4 MB doc file.
Comment 1 Marius 2011-06-07 19:14:24 UTC
Created attachment 47700
test file packed with rar
Comment 2 Jean-Baptiste Faure 2011-12-17 17:13:44 UTC
LO 3.5.0 beta-1 saves the bugdoc in RTF without problem. The file is of the same magnitude as the .doc (~3.6 MiB when the .doc is 3.2 MiB).
The problem is that LO has great difficulty opening the RTF: it needs a lot of RAM and CPU, whereas AbiWord opens this file without problem.

I guess there is a memory leak in the RTF filter.

Miklos: perhaps you should have a look at this particular file. Feel free to reassign if you can't handle this bug.

Best regards. JBF
Comment 3 Miklos Vajna 2012-01-10 07:36:28 UTC
So the tokenizer itself (aka "rtf import filter") spends 12993 ms on importing the rtf doc (as JBF says, the size is 3.6M here as well on master - but that is supposed to be similar in earlier versions as well). I think that is far for such a document of 500 pages for a text-based format (of course a binary format like .doc will be faster, that's not news).

What can be improved here is adding a progress bar, like the doc and odt importers already have; I will look into that.
Comment 4 Miklos Vajna 2012-01-10 16:59:05 UTC
(In previous comment: s/far/fair/.)

Progressbar is implemented in master: http://cgit.freedesktop.org/libreoffice/core/commit/?id=92c7b6733e55a6ab62bc231ecf0ffd5c0da7c8d2
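
For readers unfamiliar with the idea, here is a minimal sketch of how an importer for a text-based format can drive a progress bar from the number of bytes consumed. It is not the code from the commit above: the function name `import_with_progress`, the `report` callback, and the brace-counting stand-in for tokenizing are all assumptions made for illustration.

```python
import io


def import_with_progress(stream, total_size, report, chunk_size=64 * 1024):
    """Read `stream` in chunks and call `report(percent)` as input is consumed.

    Hypothetical helper for illustration only; a real import filter would
    tokenize each chunk instead of counting braces.
    """
    consumed = 0
    last_percent = -1
    groups = 0  # stand-in for real work: count RTF group openers
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        groups += chunk.count(b"{")
        consumed += len(chunk)
        # Integer percentage of the input consumed so far.
        percent = consumed * 100 // total_size if total_size else 100
        if percent != last_percent:
            # Only notify the UI when the displayed value actually changes.
            report(percent)
            last_percent = percent
    return groups


# Usage: a synthetic RTF-like document and a list acting as the progress bar.
data = b"{\\rtf1 " + b"{\\b text}" * 1000 + b"}"
updates = []
groups = import_with_progress(io.BytesIO(data), len(data), updates.append, chunk_size=1024)
```

Reporting only when the integer percentage changes keeps the number of UI updates bounded at about 100, however large the file is.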
Comment 5 Jean-Baptiste Faure 2012-01-11 11:35:06 UTC
Hi Miklos,

Does this progress bar solve bug 44157 too?

Best regards. JBF
Comment 6 Miklos Vajna 2012-01-11 12:19:47 UTC
*** Bug 44157 has been marked as a duplicate of this bug. ***
Comment 7 Miklos Vajna 2012-01-11 12:21:03 UTC
Hi JBF,

I think so - unless the reporter attaches a document which is special in some way. Thanks for the hint, I closed that bug for now as duplicate of this one.

Miklos
Comment 8 Jean-Baptiste Faure 2012-03-19 05:36:59 UTC
Fix confirmed with LO 3.5.1. The size of the RTF file produced by LO 3.5.1 is 3.7 MB.
Closing. Thank you.