Bug 148218 - FILEOPEN RTF Writer uses utterly excessive amounts of RAM with a document having about 2500 pages
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.4.0.3 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:26.8.0
Keywords: filter:rtf, haveBacktrace, perf
Depends on:
Blocks: Performance
Reported: 2022-03-27 12:20 UTC by robert
Modified: 2026-02-13 21:54 UTC

See Also:
Crash report or crash signature:


Attachments
ZIP'ed RTF file that fills the available RAM (745.79 KB, application/zip)
2022-03-27 12:22 UTC, robert
Screenprint of RAM used by Word 2002 and Writer (291.67 KB, image/png)
2022-03-27 12:23 UTC, robert
Flamegraph (177.79 KB, application/x-bzip)
2022-03-27 19:12 UTC, Julien Nabet

Description robert 2022-03-27 12:20:43 UTC
Description:
Loaded the attached zipped RTF (file sizes: l.zip 746 KB, l.rtf 6,729 KB) and then looked at the RAM used, as reported by MS Process Explorer, for both

Writer (Version: 7.3.1.3 (x64) / LibreOffice Community
Build ID: a69ca51ded25f3eefd52d7bf9a5fad8c90b87951
CPU threads: 8; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: en-GB (en_GB); UI: en-US
Calc: CL) 

as well as Word (Word 2002) 

Look at the screen print, and cry. It's just utterly inexcusable that editing a 6 MB file requires more than 800 MB. It's actually even worse if the RTF file is first saved (just as RTF) without making any changes to it. In that case the amount of RAM required balloons to MORE THAN ONE GIGABYTE!

Steps to Reproduce:
1. Unzip l.zip
2. Open it in Writer and Word
3. Notice the differences in RAM used

Actual Results:
Utterly ridiculous amounts of RAM used, more than 100 times the size of the input file!

Expected Results:
Way less


Reproducible: Always


User Profile Reset: No



Additional Info:
When are the LO developers finally going to do something about issues like this? If you want people with less capable hardware to use the product, this is not the way to go!
Comment 1 robert 2022-03-27 12:22:07 UTC
Created attachment 179139
ZIP'ed RTF file that fills the available RAM

Just a simple RTF file of about 1500 pages, with just some text bolded, nothing more, nothing less!
Comment 2 robert 2022-03-27 12:23:43 UTC
Created attachment 179140
Screenprint of RAM used by Word 2002 and Writer

Screen-print png from MS Process Explorer showing the vast difference in RAM used between Word and Writer
Comment 3 Julien Nabet 2022-03-27 17:49:03 UTC
On a Debian x86-64 PC with master sources updated today, it took several seconds just to open the file, and I saw this log line repeating:
warn:legacy.osl:17580:17580:writerfilter/source/dmapper/DomainMapper_Impl.cxx:1704: no style sheet found

Miklos: thought you might be interested in this one since it concerns rtf.
Comment 4 Julien Nabet 2022-03-27 19:12:33 UTC
Created attachment 179150
Flamegraph

Here's a flamegraph captured on a Debian x86-64 PC with master sources updated today, gen rendering, and without enable-dbgutil.

A flamegraph is meant for performance problems rather than RAM use, but opening the file also took some time with my other local build (with enable-dbgutil), so it seemed worth attaching.
Comment 5 Buovjaga 2024-09-27 19:19:37 UTC
The memory use is around 400MB with version 7.3 and current master for me on Windows. In 4.3 it is below 100MB and the CPU use is much less and takes a shorter time. It gradually increased and the current level started around version 6.4, but there is inconsistency in the behaviour between bibisect repositories, so I don't see how I could succeed in pinpointing the cause. Others may try, in case they have better luck.
Comment 6 Commit Notification 2026-01-27 16:16:55 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/fba849c048ce41a3e726778b7a088a47faf84b29

tdf#148218 reduce OUString allocations

It will be available in 26.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Buovjaga 2026-01-28 07:18:15 UTC
Looking at htop on Linux, 25.8 takes 13.7G virtual memory after opening the .rtf while a build with Noel's commit takes only 2.9G. I am using zram.
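
As a rough check on the two figures above, here is a minimal sketch in Python. The variable and function names are illustrative only, and the values are simply the ones quoted in this comment. Note that htop's virtual size counts reserved address space, not only resident RAM, so the result is indicative rather than a measure of physical memory saved.

```python
# Virtual memory figures quoted in the comment above, in htop's "G" units.
before_g = 13.7  # 25.8 build
after_g = 2.9    # build with Noel's commit


def reduction(before: float, after: float) -> tuple[float, float]:
    """Return (shrink factor, percent saved) for a before/after measurement."""
    factor = before / after
    percent_saved = (1 - after / before) * 100
    return factor, percent_saved


factor, percent_saved = reduction(before_g, after_g)
print(f"{factor:.1f}x smaller, {percent_saved:.0f}% less virtual memory")
```

With these inputs the script reports a reduction of roughly 4.7x, or about 79% less virtual memory.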
Comment 8 Commit Notification 2026-01-29 07:45:06 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/26d80567e7759e4a6358d88a173e603e16ca4517

tdf#148218 reduce OUString allocations in SwScanner::NextWord

It will be available in 26.8.0.

Comment 9 Commit Notification 2026-01-29 09:23:31 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ab1a88da2dd50b2cb067a53fbfae21a6564d6a94

tdf#148218 reduce OString alloc in BreakIterator_Unicode::loadICUBreakIterator

It will be available in 26.8.0.

Comment 10 Commit Notification 2026-01-29 11:11:44 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/b2da15234473c8bda598813c707efb7038c12840

tdf#148218 reduce OString alloc in dictionary searching

It will be available in 26.8.0.

Comment 11 robert 2026-01-29 21:29:19 UTC Comment hidden (no-value)
Comment 12 Julien Nabet 2026-01-30 08:41:01 UTC
(In reply to robert from comment #11)
> (In reply to Buovjaga from comment #7)
> > Looking at htop on Linux, 25.8 takes 13.7G virtual memory after opening the
> > .rtf while a build with Noel's commit takes only 2.9G. I am using zram.
> 
> Only 2.9Gb? When M$ word uses 26Mb? Are you kidding?
I agree with you, it's not sufficient but 2.9 compared to 13.7 is far better. 
Don't forget that RTF is a format from Microsoft which is a very big company.
Implementing it is not so easy (take a look at https://officeprotocoldoc.z19.web.core.windows.net/files/Archive_References/%5BMSFT-RTF%5D.pdf).

> 
> May I suggest that de developers of LO read (again) this:
> http://www.ncdm.com/bloat/bloat.htm and especially the final conclusion:
> 
> Bloat is not a technical issue, but verily a way of thinking, a "state of
> mind". Its cure is a simple refusal to accept, and a well directed,
> resounding "clean up your act and clean up your code!"
If you know how to code, you can help; after all, it's open source. If not, avoid this kind of remark.

> 
> And no I won't test it, as the developers of LO have zapped support for W7 -
> maybe they can explain why a simple office program needs all the new bloat
> of M$' advertising platforms pretending to be OS'es, when a program like
> Irfanview still runs happily on XP, despite the appearance of numerous new
> image formats, and IBM's z/OS can still run programs written in the 1970ies
> and 1980ies that have never been recompiled. (But that will still compile,
> once the source has been found again)
Win7 was first released in 2009 and the last updates from MS date from 2020 (ref https://en.wikipedia.org/wiki/Windows_7), so why should we keep on maintaining Win7 compatibility?

And are you really comparing a certainly great but small piece of software like Irfanview, which runs only on Windows and has far fewer features, with an Office suite which runs on Linux, macOS and Windows?

You're talking about z/OS, which is a closed OS on closed hardware, supported by one of the biggest companies in computing (https://en.wikipedia.org/wiki/Z/OS), so completely irrelevant.

Did you notice that Noel kept on providing patches after Buovjaga's comment? Did you at least read the commit descriptions and the optimizations done?

Again, if you know how to code, you're welcome to provide some help.
If you've got some money, you can donate to TDF or companies working on LO.
If you have some time, you can contribute to translate or test LO.

If you don't want to or can't do any of these, just don't complain and stop the despising.
Comment 13 robert 2026-02-13 18:34:05 UTC
(In reply to Julien Nabet from comment #12)
> (In reply to robert from comment #11)
> > (In reply to Buovjaga from comment #7)
> > > Looking at htop on Linux, 25.8 takes 13.7G virtual memory after opening the
> > > .rtf while a build with Noel's commit takes only 2.9G. I am using zram.
> > 
> > Only 2.9Gb? When M$ word uses 26Mb? Are you kidding?
> I agree with you, it's not sufficient but 2.9 compared to 13.7 is far
> better. 
> Don't forget that RTF is a format from Microsoft which is a very big company.
> Implementing it is not so easy (take a look at
> https://officeprotocoldoc.z19.web.core.windows.net/files/Archive_References/
> %5BMSFT-RTF%5D.pdf).

And how big is the spec of ODx?
 
> > May I suggest that de developers of LO read (again) this:
> > http://www.ncdm.com/bloat/bloat.htm and especially the final conclusion:
> > 
> > Bloat is not a technical issue, but verily a way of thinking, a "state of
> > mind". Its cure is a simple refusal to accept, and a well directed,
> > resounding "clean up your act and clean up your code!"
>
> If you know how to code, you can give some help, after all it's open source,
> if not avoid this kind of remark.

I comment my code, 'nuff said.

And I don't do any flavour-of-the-hour programming languages, I like safe(r) programming languages, like Pascal, PL/I, REXX and assembler. 
 
> > 
> > And no I won't test it, as the developers of LO have zapped support for W7 -
> > maybe they can explain why a simple office program needs all the new bloat
> > of M$' advertising platforms pretending to be OS'es, when a program like
> > Irfanview still runs happily on XP, despite the appearance of numerous new
> > image formats, and IBM's z/OS can still run programs written in the 1970ies
> > and 1980ies that have never been recompiled. (But that will still compile,
> > once the source has been found again)
> First release of Win7 has been in 2009 and last updates from Ms are from
> 2020 (ref https://en.wikipedia.org/wiki/Windows_7), why should we keep on
> maintaining Win7 compatibility?
> 
> Then you really compare a certainly great but small software like Irfanview
> which runs only on Windows and has far less features than an Office suite
> which runs on Linux, macOS and Windows?

Why can VLC still support XP SP3? 

From their website:

"... The major maintenance effort of this release to strengthen VLC's overall stability as well as ***the compatibility with old releases of Windows and macOS was made possible by a generous sponsorship of the Sovereign Tech Fund (https://www.sovereign.tech/) by Germany's Federal Ministry for Digital Transformation and Government Modernisation.*** ..."

Why didn't LO ask for this, with the German public services one of the larger users of it? Why rip out the old code, making it impossible for others to maintain a version for older OSes?
 
> You're talking about z/OS which is closed OS and on closed hardware
> supported by one of the biggest company in computing
> (https://en.wikipedia.org/wiki/Z/OS) so completely irrelevant
> 
> Did you noticed that Noel kept on providing some patches after Buovjaga's
> comment? Did you at least read the commits descriptions and the
> optimizations done?

The commits don't make any programmatical sense for people who have no knowledge of the internals of LO.
 
> Again, if you know how to code, you're welcome to provide some help.

I can code, but I have never had any interest in C, C++, the abomination that is Python, or Java. I like the readability of Pascal, PL/I and even assembler, where I know exactly how every instruction affects the data I'm processing. 

> If you've got some money, you can donate to TDF or companies working on LO.

My Dutch and Brexitanian pensions add up to about eur 600 per month, and now that LO has abandoned W7, I'd rather spend it on more useful things, like taking the grandchildren out.

> If you have some time, you can contribute to translate or test LO.

I cannot, I use W7, and next to that, I'm one of the 90% of people who only use 10% of its features.

> If you don't want/can't to do any of these, just don't complain and stop the
> despising.
Comment 14 Julien Nabet 2026-02-13 21:54:38 UTC
(In reply to robert from comment #13)
> (In reply to Julien Nabet from comment #12)
> > (In reply to robert from comment #11)
> > > (In reply to Buovjaga from comment #7)
> > > > Looking at htop on Linux, 25.8 takes 13.7G virtual memory after opening the
> > > > .rtf while a build with Noel's commit takes only 2.9G. I am using zram.
> > > 
> > > Only 2.9Gb? When M$ word uses 26Mb? Are you kidding?
> > I agree with you, it's not sufficient but 2.9 compared to 13.7 is far
> > better. 
> > Don't forget that RTF is a format from Microsoft which is a very big company.
> > Implementing it is not so easy (take a look at
> > https://officeprotocoldoc.z19.web.core.windows.net/files/Archive_References/
> > %5BMSFT-RTF%5D.pdf).
> 
> And how big is the spec of ODx?
Yes, but ODF is the main format of LO, so it's to be expected that the devs try to implement it well. ODF doesn't depend on Microsoft to evolve, so time is better spent supporting ODF than RTF.

>  
> > > May I suggest that de developers of LO read (again) this:
> > > http://www.ncdm.com/bloat/bloat.htm and especially the final conclusion:
> > > 
> > > Bloat is not a technical issue, but verily a way of thinking, a "state of
> > > mind". Its cure is a simple refusal to accept, and a well directed,
> > > resounding "clean up your act and clean up your code!"
> >
> > If you know how to code, you can give some help, after all it's open source,
> > if not avoid this kind of remark.
> 
> I comment my code, 'nuff said.
> 
> And I don't do any flavour-of-the-hour programming languages, I like safe(r)
> programming languages, like Pascal, PL/I, REXX and assembler.
LO is mainly (about 95%) written in C++, which is not what I'd call a flavour-of-the-hour language like Rust (which already has a great reputation for safety).
Assembly is close to machine level, so it's not safer at all, just faster.
The little I've read about REXX and PL/I suggests they're quite niche and not well suited to a big project with graphical interfaces. As for Pascal, I don't know enough about it; I just know people use it to teach algorithms.



>  
> > > 
> > > And no I won't test it, as the developers of LO have zapped support for W7 -
> > > maybe they can explain why a simple office program needs all the new bloat
> > > of M$' advertising platforms pretending to be OS'es, when a program like
> > > Irfanview still runs happily on XP, despite the appearance of numerous new
> > > image formats, and IBM's z/OS can still run programs written in the 1970ies
> > > and 1980ies that have never been recompiled. (But that will still compile,
> > > once the source has been found again)
> > First release of Win7 has been in 2009 and last updates from Ms are from
> > 2020 (ref https://en.wikipedia.org/wiki/Windows_7), why should we keep on
> > maintaining Win7 compatibility?
> > 
> > Then you really compare a certainly great but small software like Irfanview
> > which runs only on Windows and has far less features than an Office suite
> > which runs on Linux, macOS and Windows?
> 
> Why can VLC still support XP SP3?
VLC is about reading, streaming and converting videos, and also applying some filters to them. There are of course a lot of formats and it's quite a lot of work, but the "range" of features is smaller than that of an Office suite.

 
> 
> From their website:
> 
> "... The major maintenance effort of this release to strengthen VLC's
> overall stability as well as ***the compatibility with old releases of
> Windows and macOS was made possible by a generous sponsorship of the
> Sovereign Tech Fund (https://www.sovereign.tech/) by Germany's Federal
> Ministry for Digital Transformation and Government Modernisation.*** ..."
> 
> Why didn't LO ask for this, with the German public services on of the larger
> users of it? Why rip out the old code, making it impossible for others to
> maintain a version of older OS'es?
Perhaps TDF already did it and had no response or a negative one.


>  
> > You're talking about z/OS which is closed OS and on closed hardware
> > supported by one of the biggest company in computing
> > (https://en.wikipedia.org/wiki/Z/OS) so completely irrelevant
> > 
> > Did you noticed that Noel kept on providing some patches after Buovjaga's
> > comment? Did you at least read the commits descriptions and the
> > optimizations done?
> 
> The commits don't make any programmatical sense for people who have now
> knowledge of the internals of LO.
But fortunately, since you know some safer languages, I'm quite sure you know that "Shaves 17% of the temporary allocations" in a comment is a nice improvement.


>  
> > Again, if you know how to code, you're welcome to provide some help.
> 
> I can code, but I have never had any interest in C, C++, the abomination
> that is Python, or Java. I like the readability of Pascal, PL/I and even
> assembler, where I know exactly how every instruction affects the data I'm
> processing. 
> 
I don't know about C++, but there are YT videos which show that with C, some people can "visualize" the generated assembly (e.g. Linus Torvalds).
Readability depends on the language AND on the person who writes the code.
Of course it also depends on your knowledge of the language.


> > If you've got some money, you can donate to TDF or companies working on LO.
> 
> My Dutch and Brexitanian pensions add up to about eur 600 per month, and now
> that LO has abandoned W7, I'd rather spend it on more useful things, like
> taking the grandchildren out.
Nothing to say here, taking care of one's children or grandchildren is great (I mean it, there's no irony or something like that).
> 
> > If you have some time, you can contribute to translate or test LO.
> 
> I cannot, I use W7, and next to that, I'm one 90% of the people who only
> uses 10% of its features.
You can still test the 10% you know.
You seem to care about safer languages, but you use a version of an OS for which there are no more security fixes. Weird...

> 
> > If you don't want/can't to do any of these, just don't complain and stop the
> > despising.

In brief, you know how to program, even in some hard languages like assembly, but don't want to try to review some C++ code; you don't want to donate (I'm OK with this, I don't donate either); you don't want to contribute, but you still complain and despise the devs' work.
I suppose you know you can just get Microsoft Office, OnlyOffice, WPSOffice or whatever; nobody forces you to stick to LO.