Bug 80659 - Non-accelerated / non-cached image scaling ...
Summary: Non-accelerated / non-cached image scaling ...
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
(earliest affected) release
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
Whiteboard: NoRepro:
Keywords: perf
Duplicates: 78529 86798 88302 114617
Depends on:
Blocks: Image-Caching Writer-Images Scrolling-PageUpDown Regressions-GraphicPrimitive2D
Reported: 2014-06-29 02:06 UTC by Yousuf Philips (jay) (retired)
Modified: 2019-02-05 07:32 UTC (History)


callgrind_annotate.txt (566.81 KB, text/plain)
2014-07-21 16:21 UTC, Kevin Suo
Using A4 image 300ppi (313.96 KB, application/vnd.oasis.opendocument.text)
2014-08-25 13:40 UTC, Garri
Using A4 image 150ppi (146.56 KB, application/vnd.oasis.opendocument.text)
2014-08-25 13:47 UTC, Garri
watermarked document created in MSO (309.14 KB, application/vnd.oasis.opendocument.text)
2014-09-07 13:35 UTC, Garri
callgrind (9.72 MB, application/zip)
2016-09-21 14:36 UTC, Yousuf Philips (jay) (retired)
A picture of kcachegrind ... (736.42 KB, image/png)
2016-09-21 15:15 UTC, Michael Meeks

Description Yousuf Philips (jay) (retired) 2014-06-29 02:06:22 UTC
While working on bug 80552, I noticed that whenever a picture appeared in attachment 101791 [details], scrolling became slow. In 4.2.5 and 4.3.0, the slowness has increased compared to previous versions (3.3, 4.0, 4.1). In prior versions, there was some slowness when the picture on page 2 appeared, but it was bearable. The worst scrolling can be seen on the last 2 pages, which is noticeable in all versions.
Comment 1 Yousuf Philips (jay) (retired) 2014-06-29 23:58:24 UTC
The description was done on Linux Mint, but i also tested master on Windows 7 and it was sluggish as well.
Comment 2 Michael Stahl (allotropia) 2014-06-30 15:01:34 UTC
commit 2e5167528f7566dd9b000e50fc1610b7bf99132a

*** This bug has been marked as a duplicate of bug 78529 ***
Comment 3 Yousuf Philips (jay) (retired) 2014-06-30 16:22:34 UTC
Hi Michael,

This bug is not related to bug 78529: there isn't any slowness when editing the document, it's slow when scrolling the document. Also, that bug is only found in 4.2 and above, whereas this one has been present since 3.3.0.
Comment 4 Michael Stahl (allotropia) 2014-06-30 18:28:35 UTC Comment hidden (obsolete)
Comment 5 Michael Stahl (allotropia) 2014-06-30 18:31:07 UTC Comment hidden (obsolete)
Comment 6 Yousuf Philips (jay) (retired) 2014-07-01 00:18:37 UTC Comment hidden (obsolete)
Comment 7 Michael Stahl (allotropia) 2014-07-01 11:50:02 UTC
sorry for being a bit grumpy yesterday...

ok, the 4.2 regression part is tracked by the other bug (which is
probably really "painting images is slow").

for older versions, on Fedora 20, i can't reproduce slow scrolling
in anything between OOo 3.3 and LO 4.1.6 except in LO 3.6.7
the last 2 pages were a little (but not annoyingly) slow.

perhaps my machine is too fast to reproduce this properly,
and it's more visible on a netbook / on other OS / with
other graphics drivers?
Comment 8 Yousuf Philips (jay) (retired) 2014-07-01 13:11:53 UTC
(In reply to comment #7)
> sorry for being a bit grumpy yesterday...

No problem. Sometimes killing so many bugs will do that to you. Check out bug 64549 or bug 79746 if you want a good laugh.

> for older versions, on Fedora 20, i can't reproduce slow scrolling
> in anything between OOo 3.3 and LO 4.1.6 except in LO 3.6.7
> the last 2 pages were a little (but not annoyingly) slow.

Yes the last 2 pages are an easy sign to show it on any version i tested.

> perhaps my machine is too fast to reproduce this properly,
> and it's more visible on a netbook / on other OS / with
> other graphics drivers?

Here are the specs of both of the pcs i tested on, both with intel graphics
Desktop: Linux Mint 32-bit, Dual Core Intel Pentium 4 @ 3.20GHz, 4GB RAM
Laptop: Windows 7 64-bit, Intel Core 2 CPU @ 1.83GHz, 2.5GB RAM
Comment 9 Kevin Suo 2014-07-07 10:18:38 UTC
Hi Jay, I do not reproduce the slow scrolling with version, Win XP SP3. All pages can be scrolled very fast without delay.

I remember this issue was there in previous versions, but I have not seen it in the newer versions recently.

Comment 10 Kevin Suo 2014-07-21 15:42:33 UTC
Confirmed the slow down when images appear,

Ubuntu 14.04,
Build ID: 08ebe52789a201dd7d38ef653ef7a48925e7f9f7

Build ID: 9451e52b449210a83502b337c6fcc0c240daa576
TinderBox: Linux-rpm_deb-x86@45-TDF, Branch:libreoffice-4-3, Time: 2014-07-18_15:46:12
Comment 11 Kevin Suo 2014-07-21 16:21:20 UTC
Created attachment 103200 [details]

callgrind_annotate results attached.
If the 6.4MB callgrind.out.13121.tar.bz2 is needed, please let me know, I can send by email.
Comment 12 Garri 2014-08-24 17:07:01 UTC
I can confirm the problem. Tested in 4.2.6 version on multiple systems (Gentoo, Ubuntu). To reproduce the problem, you can generate a JPEG A4 300ppi image in GIMP (~300KB) and insert it into a Writer page or into a Calc sheet.

I can't reproduce the problem using PNG A4 300ppi image (~1MB) as well as using JPEG A4 150ppi image (~120KB).

Also, I experienced slow editing on the Writer page into which the JPEG A4 300ppi image was imported, so bug 78529 may be related to this issue.

While scrolling, CPU usage by soffice.bin is 100%. Installed CPU is Core i5 2.53 Mobile.

I can provide the JPEG image or complete LO file, if needed.
Comment 13 Yousuf Philips (jay) (retired) 2014-08-24 22:20:56 UTC
@Garri: please do provide your sample file as another example of this problem, so I can test whether it's the same as this bug. Thanks. :)
Comment 14 Garri 2014-08-25 13:40:50 UTC
Created attachment 105237 [details]
Using A4 image 300ppi

ODT file with inserted A4 300ppi JPEG image
Comment 15 Garri 2014-08-25 13:47:03 UTC
Created attachment 105238 [details]
Using A4 image 150ppi

There is no problem using the 150ppi image in this file.

I can confirm the same problem when the 300ppi image is opened in LO Draw.
Comment 16 Garri 2014-08-29 06:39:32 UTC
I tried to reproduce the issue using LibreOffice 3.5.7. There was no slow scrolling in 3.5.7.
Comment 17 Garri 2014-09-02 06:19:06 UTC
@Jay Philips: Were you able to reproduce the issue described in comment 12?

All three related bugs are still not in confirmed status. :(
What information can we provide to get the problem solved faster and more efficiently? Are we looking for the exact commit that introduced the regression?

Frankly, the problem is very important. It forces us to modify many existing production documents. Migration from MSO has also become more difficult, as many users have begun to complain that MSO performs better on the same hardware. :(
Comment 18 Garri 2014-09-04 05:23:54 UTC
I've built LO (Version:, Build ID: 57c54792781cc9befe3a65a97b37fa85f89da0ae). The revision used just before the commit 2e5167528f7566dd9b000e50fc1610b7bf99132a (mentioned in comment 2). Same problem in built version.

Can anyone suggest a 4.x version that is not affected? I'll try to find the commit that introduced the regression. Thanks.
Comment 19 Yousuf Philips (jay) (retired) 2014-09-04 11:31:34 UTC
Hi Garri,

Sorry for the delay. I checked, and your file is related to bug 78529 rather than this one, as it's a 4.2.x regression. :)
Comment 20 Garri 2014-09-05 10:56:53 UTC
Hello Jay!

Here are my new findings:

When I open sample_300ppi.odt in LO versions (Build ID: 57c54792781cc9befe3a65a97b37fa85f89da0ae) and, I experience problems described in bug 78529 and bug 80659:

1. Document scrolling becomes very slow when a picture appears (80659).
2. There is a delay before characters appear on the screen after the user input (78529).

When I open sample_300ppi.odt in LO version
(Build ID: 40ff705089295be5be0aae9b15123f687c05b0a), I experience neither problem, neither 78529 nor 80659.

So, based on my results, the regression commit lies between 40ff705089295be5be0aae9b15123f687c05b0a and 57c54792781cc9befe3a65a97b37fa85f89da0ae.
Comment 21 Garri 2014-09-06 01:25:06 UTC Comment hidden (obsolete)
Comment 22 Garri 2014-09-06 01:26:39 UTC
(In reply to comment #21)
> The search area narrowed, not it is between
> 8b716072410bcfd252739fb953d5ac198e27a895..
> 57c54792781cc9befe3a65a97b37fa85f89da0ae.

Sorry, I mean now it is between 8b716072410bcfd252739fb953d5ac198e27a895..57c54792781cc9befe3a65a97b37fa85f89da0ae.
Comment 23 Garri 2014-09-06 05:29:05 UTC
Bisecting: 1894 revisions left to test after this (roughly 11 steps)
[d63a69a087c9c7641e28e2002d7ad56076d08ca1] fix string

I'll try to complete bisect on the weekends. :)
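
For reference, the step estimate that git prints follows from halving the candidate range on every test. A rough check, assuming an ideal linear history (real histories with merges can deviate slightly):

```latex
\text{steps} \approx \lceil \log_2 1894 \rceil = \lceil 10.89 \rceil = 11
```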
Comment 24 Yousuf Philips (jay) (retired) 2014-09-06 10:48:08 UTC
Hi Garri,

You should submit all of your findings to bug 78529, as they are unrelated to this bug.
Comment 25 Garri 2014-09-06 12:35:16 UTC
(In reply to comment #24)
> Hi Garri,
> You should submit all of your findings to bug 78529, as they are unrelated
> to this bug.

Hi Jay,

What am I doing wrong? I experience the problem described in this bug report, slow scrolling when images appear, which I find only in 4.2.x. And I am trying to find the regression in 4.2.x.

Jay, feel free to correct me, if I'm wrong. Thank you.
Comment 26 Garri 2014-09-06 12:47:18 UTC
Jay, I understood. :)

You mean, this bug describes slow scrolling in versions since 3.3.0 (general scrolling slowness).

Ok, I'll post the regression commit in the related report. Thank you!
Comment 27 Garri 2014-09-07 13:35:42 UTC
Created attachment 105859 [details]
watermarked document created in MSO

Hi Jay, can you check the attachment? It is a watermarked document created using MSO. I experience slow scrolling in the document using the LO build 57c54792781cc9befe3a65a97b37fa85f89da0ae (which is not affected by bug 78529). Thank you!
Comment 28 QA Administrators 2016-09-20 10:25:37 UTC Comment hidden (obsolete)
Comment 29 Garri 2016-09-20 20:53:59 UTC
The problem still exists in Usage of my "Intel(R) Core(TM) i5 CPU M 460  @ 2.53GHz" is 100% for all 4 cores while I'm scrolling the attachment 105859 [details].
Comment 30 Michael Meeks 2016-09-20 21:13:06 UTC
Hmm - so ... interesting 4.2x is before any OpenGL work; I am encouraged that instead of being 1 CPU at 100% that (apparently) our/my multi-threaded image scaling is now maxing out four of your CPUs ;-) which should make things around 4x faster.

I imagine this is yet-another artifact of the drawing-layer's love of re-interpolating huge images per frame or part of frame rendered - but hard to know without a profile.

Please remove NEEDINFO when there is a callgrind profile taken for a build with debuginfo installed - that is >= LibreOffice 5.1.0.

Thanks !
Comment 31 Kevin Suo 2016-09-21 02:23:25 UTC
(In reply to Michael Meeks from comment #30)

I do not reproduce with the following version:
Build ID: 31dd62db80d4e60af04904455ec9c9219178d620
CPU Threads: 4; OS Version: Linux 4.4; UI Render: default; 
Locale: zh-CN (zh_CN.UTF-8); Calc: group
Ubuntu 16.04 LTS X64.

@Jay: Could you please confirm?
Comment 32 Yousuf Philips (jay) (retired) 2016-09-21 04:35:22 UTC
@Kevin: Yes i still see it on my Core 2 Duo (used to have a Pentium 4) on the last 2 pages, and can max out both CPU cores to ~100% scrolling up and down between the two pages.

Gonna try doing a callgrind.

Build ID: 3287bc2f91438085b7604773d5e0346fc3c3f452
CPU Threads: 2; OS Version: Linux 3.19; UI Render: default; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2016-09-18_06:17:20
Locale: en-US (en_US.UTF-8); Calc: group
Comment 33 Yousuf Philips (jay) (retired) 2016-09-21 14:36:01 UTC
Created attachment 127516 [details]
Comment 34 Yousuf Philips (jay) (retired) 2016-09-21 14:40:24 UTC
@meeks: Ball in your corner now. :D
Comment 35 Michael Meeks 2016-09-21 15:15:21 UTC
Thanks for the trace - as expected this looks like it is down to the drawing layer re-scaling images very regularly; and I imagine that we re-scale the entire image no matter how little of it we want to actually render too ;-) [ a better VCL rendering API / Impl. might improve that ? ] Either way - Armin is the expert here.

I see 39bn pcycles in your trace - of which:

18.7bn are from scaleNonPaletteGeneral2 - on GetPixelForN32BitTcBgra - which we could write an accelerated C++ version for I guess.

This is from the threaded method (why you have 4 CPUs at 100% ;-)

And another 8.9bn pcycles from the main thread - doing some of that work too.
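
To illustrate what a direct C++ path over packed 32-bit pixels could look like, here is a minimal sketch of a 2x2 box-filter downscale. It is not the algorithm used in scaleNonPaletteGeneral2; the function name halveBgra and the assumption of a tightly packed, row-major buffer without row padding are hypothetical and only serve the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: halve a packed 32-bit image (e.g. BGRA) by averaging
// each 2x2 block per 8-bit channel. Assumes src holds w*h pixels, row-major,
// with no padding between rows. Odd trailing rows/columns are dropped.
std::vector<std::uint32_t> halveBgra(const std::vector<std::uint32_t>& src,
                                     int w, int h)
{
    const int dw = w / 2;
    const int dh = h / 2;
    std::vector<std::uint32_t> dst(static_cast<std::size_t>(dw) * dh);

    for (int y = 0; y < dh; ++y)
    {
        // Two source rows feed one destination row.
        const std::uint32_t* r0 = &src[static_cast<std::size_t>(2 * y) * w];
        const std::uint32_t* r1 = r0 + w;

        for (int x = 0; x < dw; ++x)
        {
            std::uint32_t out = 0;
            // Process the four 8-bit channels independently.
            for (int shift = 0; shift < 32; shift += 8)
            {
                const std::uint32_t sum =
                    ((r0[2 * x] >> shift) & 0xFF) + ((r0[2 * x + 1] >> shift) & 0xFF) +
                    ((r1[2 * x] >> shift) & 0xFF) + ((r1[2 * x + 1] >> shift) & 0xFF);
                out |= ((sum + 2) / 4) << shift; // rounded average
            }
            dst[static_cast<std::size_t>(y) * dw + x] = out;
        }
    }
    return dst;
}
```

The point of the sketch is only that the inner loop reads raw memory instead of going through a per-pixel accessor call; whether that helps in the real code path would need a fresh profile.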

I attach a pretty picture of where that comes from. It is possible that by intersecting with the VCL clipping region we could do something more optimal and quick here I guess wrt. reducing the amount of image data we operate on.
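
The clipping idea can be sketched roughly as follows: intersect the image's destination rectangle with the clip rectangle, then map the visible part back into source pixel coordinates so that only that region needs scaling. The Rect type and both function names are hypothetical stand-ins, not VCL classes, and the sketch assumes an axis-aligned, unrotated image with a non-empty destination rectangle.

```cpp
#include <algorithm>

// Hypothetical plain rectangle; not a VCL or basegfx type.
struct Rect
{
    double left, top, right, bottom;
};

// Intersect a and b. Returns false when they do not overlap.
bool intersect(const Rect& a, const Rect& b, Rect& out)
{
    out.left = std::max(a.left, b.left);
    out.top = std::max(a.top, b.top);
    out.right = std::min(a.right, b.right);
    out.bottom = std::min(a.bottom, b.bottom);
    return out.left < out.right && out.top < out.bottom;
}

// Given where the image lands on the device (dest), the clip region (clip)
// and the source size in pixels, compute the part of the source image that
// is actually visible. Returns false when nothing is visible.
bool visibleSourceRect(const Rect& dest, const Rect& clip,
                       int srcW, int srcH, Rect& out)
{
    Rect vis;
    if (!intersect(dest, clip, vis))
        return false;

    // Source pixels per device unit along each axis.
    const double sx = srcW / (dest.right - dest.left);
    const double sy = srcH / (dest.bottom - dest.top);

    out.left = (vis.left - dest.left) * sx;
    out.top = (vis.top - dest.top) * sy;
    out.right = (vis.right - dest.left) * sx;
    out.bottom = (vis.bottom - dest.top) * sy;
    return true;
}
```

For a 1000x2000 pixel image drawn into a 100x200 area of which only the top-right quarter is inside the clip region, this yields a 500x1000 pixel source region, a quarter of the data.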

Comment 36 Michael Meeks 2016-09-21 15:15:47 UTC
Created attachment 127522 [details]
A picture of kcachegrind ...
Comment 37 Armin Le Grand 2016-09-22 08:28:20 UTC
We could poke around vcl here, all platforms except Linux already use HW-Scaling of bitmaps - this should simply be available nowadays, even on Linux. At least the orig pix data is now handed over so that the quality/scaling decision is possible inside the low-level part at all. Before, the bitmap was SW scaled somewhere (with bad quality) and that result was used - with no control at the lower level.
If it is still not possible to use HW-scaling on Linux I would possibly try to add a '#ifdef UNX'ed buffered even-more-low-level bitmap primitive like 'AlreadyScaledBitmapOne' that is buffered/created automatically in the low-level BitmapPrimitive2D (which is minimal in holding the orig pixel data and the transformation)
Comment 38 Armin Le Grand 2016-09-22 08:33:11 UTC
It may also be worth using Cairo - I already measured that it's faster for FatLines. I also checked that it is slower for colored PolyPolygons. It might be faster with Bitmap painting - we should try that instead of again and again trying to optimize our own bitmap scaling - this should simply not be done by hand in a graphics stack anymore, at least not for painting an EditView
Comment 39 Armin Le Grand 2016-09-29 15:15:57 UTC
Probably a duplicate of Bug 78529
Comment 40 Michael Meeks 2016-09-30 07:53:10 UTC
*** Bug 88302 has been marked as a duplicate of this bug. ***
Comment 41 Michael Meeks 2016-09-30 07:58:31 UTC
Removing the Linux specific piece here - see bug#88302 - which appears to show the same result on Windows ( I guess in the GDI fallback paths ).

Also - AFAICS we really should consider caching the scaled images; most images are rendered only once, at one size - and at one zoom level - and relying on hardware to continuously do high quality interpolations is not great - even with GL acceleration - once we get to a largeish scale-factor, we have to do multi-pass scaling which is poorer quality and performance; IMHO we should be caching.

I snip some hearsay / design bits from a friend:

The best would be to cache the scaled down version somewhere around SwNoTextFrame::PaintPicture

paintGraphicUsingPrimitivesHelper begins the chain of the slow path; and yet we are highlevel enough to have some good insight into the management of the bitmaps etc.

Like, already here we know what outputdevice is targeted in the end, so we can do the scaling here (so that it fits the target outdev), and remember the scaled bitmap somewhere - probably directly in the pGrfNd (together with the information about the target outputdevice).

And use the scaled-down version if the outputdevice has the same settings (like resolution and stuff) instead of calling drawinglayer with the verbatim pGrfNd

Would love feedback on that.
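
A minimal sketch of the idea above, assuming a single remembered scaled copy per image that is reused while the target settings stay the same. Bitmap, TargetKey, ScaledImageCache and nearestNeighbourScale are hypothetical names for illustration, not Writer or VCL classes, and the nearest-neighbour scaler only stands in for whatever scaling the real code would use.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bitmap: packed 32-bit pixels, row-major.
struct Bitmap
{
    int width = 0;
    int height = 0;
    std::vector<std::uint32_t> pixels;
};

// Hypothetical description of the target output: pixel size and resolution.
struct TargetKey
{
    int width, height, dpi;
    bool operator==(const TargetKey& o) const
    {
        return width == o.width && height == o.height && dpi == o.dpi;
    }
};

// Placeholder scaler (nearest neighbour); assumes w > 0 and h > 0.
Bitmap nearestNeighbourScale(const Bitmap& src, int w, int h)
{
    Bitmap dst;
    dst.width = w;
    dst.height = h;
    dst.pixels.resize(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y)
    {
        const int sy = y * src.height / h;
        for (int x = 0; x < w; ++x)
        {
            const int sx = x * src.width / w;
            dst.pixels[static_cast<std::size_t>(y) * w + x] =
                src.pixels[static_cast<std::size_t>(sy) * src.width + sx];
        }
    }
    return dst;
}

// Remembers the last scaled copy and rescales only when the target changes.
class ScaledImageCache
{
public:
    using Scaler = std::function<Bitmap(const Bitmap&, int, int)>;

    explicit ScaledImageCache(Scaler scaler) : scaler_(std::move(scaler)) {}

    const Bitmap& get(const Bitmap& original, const TargetKey& key)
    {
        if (!valid_ || !(key == key_))
        {
            cached_ = scaler_(original, key.width, key.height);
            key_ = key;
            valid_ = true;
            ++scaleCount_;
        }
        return cached_;
    }

    // Number of times the expensive scaling step actually ran.
    int scaleCount() const { return scaleCount_; }

private:
    Scaler scaler_;
    Bitmap cached_;
    TargetKey key_{0, 0, 0};
    bool valid_ = false;
    int scaleCount_ = 0;
};
```

Repainting with an unchanged TargetKey returns the cached copy without rescaling; a zoom or resolution change produces a new key and triggers one rescale. The sketch does not handle the original image being modified, which a real cache would have to invalidate on.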
Comment 42 Michael Meeks 2016-09-30 08:00:35 UTC
*** Bug 78529 has been marked as a duplicate of this bug. ***
Comment 43 Armin Le Grand 2016-09-30 08:57:30 UTC
Yes, it should be cached, but please not in the app (Writer) as it was in the old days. The orig bm data has to be available in the render stack, so that hw scaling can use it if available.
Thus there are two places to cache this:
(a) In the primitive stack
(b) in the sys-dependent part of GDI
I would (of course) opt for (a) since I know that would work (sys-independent). In principle it is about:
- Add a decomposition (the scaled bitmap or part of it) to the bitmap primitive
- Use an even-more-low-level Scaled/Buffered/Cached/BitmapPrimitive to hold this
- do intelligent caching (only needed stuff, re-use as long as certain zoom did not change too much, flush when using too much mem, ...)
- In a primitive renderer, use the bitmap primitive if HW scale is available, the (cached) decomposition else (sys-dependent flag?)
- Make use of mechanisms supporting this in the primitive stack
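
The "intelligent caching" bullet could be sketched like this: reuse an entry while the requested scale stays within a relative tolerance of the cached one, and evict the least recently used entries once a memory budget is exceeded. The class name BudgetedScaleCache, the integer image id, and the tolerance and budget parameters are all hypothetical; the sketch tracks only bookkeeping, not pixel data.

```cpp
#include <cmath>
#include <cstddef>
#include <list>

// Hypothetical bookkeeping record for one cached scaled bitmap.
struct Entry
{
    int imageId;       // which original image this was scaled from
    double scale;      // scale factor the cached copy was produced at
    std::size_t bytes; // memory used by the cached copy
};

class BudgetedScaleCache
{
public:
    BudgetedScaleCache(std::size_t budgetBytes, double tolerance)
        : budget_(budgetBytes), tolerance_(tolerance)
    {
    }

    // True if a cached copy of imageId exists whose scale is within the
    // relative tolerance of the requested scale. A hit marks the entry as
    // most recently used.
    bool lookup(int imageId, double scale)
    {
        for (auto it = entries_.begin(); it != entries_.end(); ++it)
        {
            if (it->imageId == imageId &&
                std::fabs(it->scale - scale) <= tolerance_ * it->scale)
            {
                entries_.splice(entries_.begin(), entries_, it);
                return true;
            }
        }
        return false;
    }

    // Record a freshly scaled copy, then evict least recently used entries
    // until the budget is met (the newest entry is always kept).
    void insert(int imageId, double scale, std::size_t bytes)
    {
        entries_.push_front(Entry{imageId, scale, bytes});
        used_ += bytes;
        while (used_ > budget_ && entries_.size() > 1)
        {
            used_ -= entries_.back().bytes;
            entries_.pop_back();
        }
    }

    std::size_t usedBytes() const { return used_; }
    std::size_t size() const { return entries_.size(); }

private:
    std::size_t budget_;
    double tolerance_;
    std::size_t used_ = 0;
    std::list<Entry> entries_; // front = most recently used
};
```

With a 5% tolerance, a copy cached at scale 0.50 would serve a request at 0.51 but not one at 0.60. Where in the primitive stack such a structure would live is left open here.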

BTW: I checked and indeed even on Win still some paths go to sw-scaling - sigh - but that can be corrected. This is an argument to think about sys- and task-specific renderers, e.g. a *direct* GdiPlus-one for Win, *not* taking the path over VCL at all. Think about a big primitive renderer factory which you give info what you want (win and screen render) -> gives you the primitive renderer for GdiPlus and Win. Same for Linux, same for online, same for headless, same for PDF export (ah dreaming...)
Comment 44 Telesto 2016-11-11 18:01:53 UTC
For what it's worth: MacOS is also affected.

Build ID: 6984fd5a756f1e01e94da14f01df5a0e20791630
CPU Threads: 4; OS Version: Mac OS X 10.12.1; UI Render: default; Layout Engine: new; 
TinderBox: MacOSX-x86_64@49-TDF, Branch:master, Time: 2016-11-05_02:01:59
Locale: en-US (en_US.UTF-8); Calc: group
Comment 45 Jean-Baptiste Faure 2017-12-23 17:47:12 UTC
*** Bug 114617 has been marked as a duplicate of this bug. ***
Comment 46 Telesto 2018-02-14 14:49:12 UTC
*** Bug 104296 has been marked as a duplicate of this bug. ***
Comment 47 Telesto 2018-02-14 14:55:09 UTC
*** Bug 86798 has been marked as a duplicate of this bug. ***
Comment 48 Paul Menzel 2018-02-14 15:38:57 UTC
Reproduced with LibreOffice on GNU/Linux with GTK 3.22.26.
Comment 49 Telesto 2018-02-14 15:45:42 UTC
*** Bug 113038 has been marked as a duplicate of this bug. ***
Comment 50 Roland Baudin 2018-02-15 08:23:14 UTC
I see the same slowness when scrolling pictures of moderate size (10 cm x 10 cm, 200 dpi, PNG).

Things can be improved a bit by using the maximum CPU frequency all the time.

My system:

Ubuntu 16.04.3 LTS
Comment 51 Xisco Faulí 2018-07-17 19:52:58 UTC
It seems pretty good in

Build ID: 4d18cd6aad0daaefaca792e8eac173bea07f3750
CPU threads: 4; OS: Linux 4.13; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group threaded

Could anyone else confirm ?
Comment 52 Stéphane Aulery 2018-10-29 15:58:05 UTC
It seems also pretty good in

Version: (x64)
Build ID: 0c292870b25a325b5ed35f6b45599d2ea4458e77
Threads CPU : 2; OS : Windows 6.1; UI Render : par défaut; 
Locale : fr-FR (fr_FR); Calc: group
Comment 53 Buovjaga 2019-02-05 07:32:23 UTC
Make it three people who have said there is no slowness while scrolling past images in attachment 101791 [details]. Let's close.

Arch Linux 64-bit
Build ID: 8fbad2f600cd3ab81e7c1da0e4a2a71ebcac0553
CPU threads: 8; OS: Linux 4.20; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 31 January 2019