Bug 60419 - Big documents, containing a big content.xml consumes lots of memory and CPU and might even freeze
Status: RESOLVED DUPLICATE of bug 60418
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.6.5.2 release (earliest affected)
Hardware: All All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-02-07 12:18 UTC by GiorgioMigliaccio
Modified: 2013-02-07 15:12 UTC (History)
0 users

See Also:
Crash report or crash signature:


Attachments
Example of a big document (480.35 KB, application/vnd.oasis.opendocument.text)
2013-02-07 12:18 UTC, GiorgioMigliaccio

Description GiorgioMigliaccio 2013-02-07 12:18:10 UTC
Created attachment 74337
Example of a big document

We have been using LibreOffice (from v2.x up to 4.0 beta) inside our product LetterSketch, which was started 7 years ago.
LetterSketch is a document authoring tool with which authors can create documents and extend them with conditions, loops, variables, sub-documents and other ornaments.
The template is then compiled to an internal format, and finally it can be generated on a server in high volume by feeding it data in XML format.
We now have our biggest implementation to date, at a major European bank, and they need to create pretty complex document structures containing multiple sub-documents.
Here we reached the limits of the OpenDocument format, or at least of the LibreOffice/OpenOffice internal object representation.

At the customer we created a template/document of 113 pages, built up from plain text, various objects such as comments and frames, and dozens of sub-documents. It takes some 10-20 minutes just to open the template.

We finally pinpointed the problem. 
When we take just the top-level odt file, without resolving any of the sub-documents, LibreOffice needs 1 minute at 100% CPU (Intel i5) and 500 MB of memory (!!!) for this document alone.
When we also start resolving the sub-documents, LibreOffice goes up to 1.8 GB of memory and then crashes, disappears or freezes; there seems to be some kind of invisible ceiling.
Closing the document takes another 1-2 minutes and needs some additional memory!
When we save the main document in the Word (.doc) format and open it in Word, MS Word needs only some 30 MB of memory to display this SAME document.
Please find the (main) document in question attached.

Can you give us any suggestion for overcoming this problem/limitation? It is going to be a major showstopper for our project.
Thank you very much for your suggestions.
Comment 1 Jorendc 2013-02-07 13:05:17 UTC
Hi,

Thanks for reporting.

I CAN'T reproduce the wait of several minutes to open the document: I opened it on an i5 @2.8GHz within 10 seconds. I also can't see any memory leak when this document is opened.
But I CAN reproduce the high memory load (+500 MB), and scrolling through the document sometimes results in a short freeze (1 second). It also needs 5 to 6 seconds to close ... I think that's not that bad.

Tested using Linux Mint 14 x64 and LibreOffice 4.0.0.3 rc3.

According to [1] this is not a blocker. Because it results in tediously slow behavior, I mark this as 'Major High'.

Kind regards,
Joren 

[1] https://wiki.documentfoundation.org/images/0/06/Prioritizing_Bugs_Flowchart.jpg
Comment 2 David 2013-02-07 14:27:10 UTC
My experience since version 3.5 with LO being slow and crashing seems to be when headers & footers are enabled.  You may want to try disabling them and see if that has any effect on the document.  If so it may be a duplicate of bug 59714.
Comment 3 GiorgioMigliaccio 2013-02-07 15:11:11 UTC
(In reply to comment #2)
> My experience since version 3.5 with LO being slow and crashing seems to be
> when headers & footers are enabled.  You may want to try disabling them and
> see if that has any effect on the document.  If so it may be a duplicate of
> bug 59714.

At our customer we've created literally thousands of documents. Most of them contain a header and/or a footer. But every document is a 'fragment', meaning it is or can be a sub-document of another document, and is inserted into that document as a section.
So we can have sub-documents nested some 5 to 10 levels deep, and one top-level document can easily contain dozens or more of such sub-documents/sections.
We gradually build more and more top-level documents containing more content (documents). Now we've reached a certain threshold where the main document contains so many objects that it has become too big to be processed well by LO. We've tried several versions of LibreOffice: we started with 3.5.4, went to 3.6.x, I tried 4.0 beta, and eventually 3.4.5.
This last version (3.4.5) was chosen because I read in a LO bug-report that since 3.5.x there were memory problems with having many images inside a document. 
Switching to 3.4.5 did indeed help a bit. A document that could not be opened with other versions of LO finally did open (using 1.8 GB of memory), but it was unworkable. Right-clicking to perform any action froze the UI.
So I think it mainly has to do with the size of the document. For example, loading the 16 MB content.xml file into Liquid XML Studio needs 1.1 GB of memory... and opening it in Notepad++ didn't even succeed; it just froze.
OK, these are other applications, but I think it mainly comes down to the same problem: too much XML for an XML processor.
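
To check how much of the load comes from content.xml itself, the file can be read as a stream instead of being loaded into memory in full. The following is a minimal Python sketch, not anything LibreOffice does internally: it assumes the .odt is a regular ZIP package with a content.xml entry, and count_elements is a hypothetical helper name. It tallies element names so the dominant object types in a big document become visible.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(odt_file):
    """Stream content.xml out of an ODT package and tally element names.

    odt_file may be a path or a binary file-like object (assumption:
    the package is an ordinary ZIP archive containing content.xml).
    """
    counts = Counter()
    with zipfile.ZipFile(odt_file) as package:
        with package.open("content.xml") as stream:
            # iterparse yields each element once its closing tag is read,
            # so the whole tree never has to be held at full size.
            for _event, elem in ET.iterparse(stream, events=("end",)):
                # Drop the "{namespace}" prefix and keep the local name.
                counts[elem.tag.rsplit("}", 1)[-1]] += 1
                # Discard the element's children, text and attributes.
                elem.clear()
    return counts
```

For example, count_elements("template.odt").most_common(10) would list the ten most frequent element names (template.odt is a placeholder file name). Cleared elements still leave empty stubs attached to their parents, so memory use is reduced but not constant.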
Comment 4 GiorgioMigliaccio 2013-02-07 15:12:54 UTC

*** This bug has been marked as a duplicate of bug 60418 ***