Bug 129675 - Rendering EMF image embedded into a presentation slide uses too much memory
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: graphics stack
Version: 6.0.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, filter:emf, regression
Depends on:
Blocks: emf-testbed EMF-WMF
Reported: 2019-12-28 20:57 UTC by Cesar Eduardo Barros
Modified: 2020-09-24 12:41 UTC

See Also:
Crash report or crash signature:


Attachments
Sample PPTX (2.67 MB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2019-12-28 22:06 UTC, Timur
Sample PPTX slides 18-20 (789.38 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2019-12-28 22:45 UTC, Timur
image5.emf - file that is causing the grief (1.74 MB, image/emf)
2019-12-29 15:54 UTC, Chris Sherlock
Sample PPTX slide 19 compared (341.24 KB, image/png)
2020-09-22 08:23 UTC, Timur
Sample PPTX slide 20 compared (303.71 KB, image/png)
2020-09-22 08:23 UTC, Timur

Description Cesar Eduardo Barros 2019-12-28 20:57:53 UTC
Description:
Going past slide 19 on this presentation, either in the full-screen presentation view or even just scrolling past its thumbnail in the sidebar, consumes more than 16GB of RSS, going heavily into swap (thrashing). Every time, I managed to kill it before it reached the OOM killer, but the window manager was already slowed down by the thrashing (taking many seconds to react to every action).

Steps to Reproduce:
1. Get the file "pmo janeiro 2020_NEWAVE.pptx" from http://www.ons.org.br/AcervoDigitalDocumentosEPublicacoes/APRESENTACOES_PMO_202001.zip
2. Open the file
3. Try to go past slide 19 (the one with a map of Brazil), either in full-screen presentation mode, or scrolling with the mouse wheel on the left-hand slide list

Actual Results:
LibreOffice locks up allocating huge amounts of memory, soon exhausting the 16 GB of RAM on this machine, and then the whole computer locks up due to heavy thrashing. LibreOffice can still be force-killed through the window manager or the terminal to bring the system back to normal.

Expected Results:
It should allocate a normal amount of memory.


Reproducible: Always


User Profile Reset: No


OpenGL enabled: Yes

Additional Info:
The exact version is libreoffice-impress-6.3.3.2-7.fc31.x86_64 (Fedora 31 package).
Comment 1 Timur 2019-12-28 22:06:49 UTC
Created attachment 156821 [details]
Sample PPTX

The reporter should not have taken the lazy approach of linking to an archive with many presentations instead of attaching a single one. I am attaching it now.
Also reproducible on Windows with 6.5+.
Comment 2 Cesar Eduardo Barros 2019-12-28 22:15:05 UTC
(In reply to Timur from comment #1)
> Created attachment 156821 [details]
> Sample PPTX
> 
> The reporter should not have taken the lazy approach of linking to an
> archive with many presentations instead of attaching a single one. I am
> attaching it now.
> Also reproducible on Windows with 6.5+.

I didn't attach the document because the bug reporting form said "If you do not possess or cannot create suitable test documents that may be released under our licensing terms, please make a note in your bug report to this effect without attaching the file", and I don't know the license that presentation is under, so according to that sentence I shouldn't attach it. Instead, I gave the original link I got that file from.
Comment 3 Timur 2019-12-28 22:45:19 UTC
Created attachment 156822 [details]
Sample PPTX slides 18-20

Just the 3 slides with which I can reproduce the issue.
Comment 4 V Stuart Foote 2019-12-29 02:12:13 UTC
Hey Chris, *

Opening the embedded EMF from attachment 156822 [details] really drags. Paint, IrfanView, etc. have no issue--so it seems like we are getting stuck in a recursive rendering loop of some flavor. ImageMagick shows it as an sRGB EMF with alpha; nothing seems out of sorts.

It just sucks memory as it loads, either in Draw for the EMF extracted from archive, or if opening the presentation. 

Converting the PPTX to ODP does nothing for it.

One for your testbed?
Comment 5 V Stuart Foote 2019-12-29 02:18:31 UTC
On Windows 10 Home 64-bit en-US (1909)
Hangs up consuming lots of memory in
Version: 6.4.0.1 (x64)
Build ID: 1b6477b31f0334bd8620a96f0aeeb449b587be9f
CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: GL; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded

The slide with the EMF opens cleanly, with no delay, in a recent master...
Version: 6.5.0.0.alpha0+ (x64)
Build ID: 0640bbac3c0b9e51e659c1d2b86d9a79a6dfa225
CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: GL; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded
Comment 6 Chris Sherlock 2019-12-29 15:51:02 UTC
I have extracted image5.emf and it's crashing my master build (albeit one with a few extra logging statements...).
Comment 7 Chris Sherlock 2019-12-29 15:52:44 UTC
soffice.bin: /home/chris/repos/libreoffice-latest/basegfx/source/polygon/b2dlinegeometry.cxx:701: basegfx::B2DPolygon basegfx::(anonymous namespace)::createAreaGeometryForJoin(const basegfx::B2DVector &, const basegfx::B2DVector &, const basegfx::B2DVector &, const basegfx::B2DVector &, const basegfx::B2DPoint &, double, basegfx::B2DLineJoin, double, basegfx::triangulator::B2DTriangleVector *): Assertion `(fHalfLineWidth > 0.0) && "createAreaGeometryForJoin: LineWidth too small (!)"' failed.
Comment 8 Chris Sherlock 2019-12-29 15:54:27 UTC
Created attachment 156837 [details]
image5.emf - file that is causing the grief
Comment 9 Chris Sherlock 2019-12-29 16:08:34 UTC
OK, seems unrelated and will only happen in debug mode. There appear to be some 0 width lines in that file. 

I'm not seeing memory issues when I check this file on master. It might be nice to find out why it has lines of width 0, but it seems to be stable.
Comment 10 Chris Sherlock 2019-12-29 16:11:46 UTC Comment hidden (me-too)
Comment 11 Chris Sherlock 2019-12-29 16:20:31 UTC
Sorry, mind was on the logs I was reviewing.... that last comment should have read:

"I should note that last comment was on opening the EMF file I extracted. It does take a little longer than I would have expected for it to load the pptx file, but once loaded I can scroll around fine."
Comment 12 V Stuart Foote 2019-12-29 17:23:29 UTC
The extracted .EMF was handled with no delay when inserted in 5.3.7.2.

But the EMF is mishandled in 6.3 and 6.4 including current 6.4.0.1 rc1 release. Didn't check 6.0, 6.1, or 6.2 (don't have them to check against).

But it is now again handled well in current (after 2019-12-21) master/6.5.0

@Chris, so looks like your recent EMF work on master/6.5.0 has fixed whatever the issue is with this EMF. 

Somewhere in the range (TB77 builds 2019-12-14 --> 2019-12-21):

https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=113444f59dc7690850919155b9b164b1a686bbe7..209fc9fd7fa433947af0bf86e210d73fa7f5a045

@Xisco, 'reverse' bibisect? And your call on the 'regression' tag.
Comment 13 Chris Sherlock 2019-12-30 11:50:07 UTC
I'm doing a bisect now to see if I can work out what was causing the issue.
Comment 14 Chris Sherlock 2019-12-30 12:32:38 UTC
I did a reverse bisect, looks like I did inadvertently fix this. 

git bisect start
# good: [113444f59dc7690850919155b9b164b1a686bbe7] sc: rowcol: tdf#50916 create ScSheetLimits to hold by rtl::Reference
git bisect good 113444f59dc7690850919155b9b164b1a686bbe7
# bad: [209fc9fd7fa433947af0bf86e210d73fa7f5a045] Add case table for Deseret and Osage
git bisect bad 209fc9fd7fa433947af0bf86e210d73fa7f5a045
# good: [189d498f5c0c633f8cd87b3f1b6d57020371a952] tdf#128671: Rely on unwind.h, declare what's missing from cxxabi.h
git bisect good 189d498f5c0c633f8cd87b3f1b6d57020371a952
# good: [e3a002c53a544de02e5119c5b0a2fd4f972156a8] get native gtk widgets in sidebars working
git bisect good e3a002c53a544de02e5119c5b0a2fd4f972156a8
# bad: [5f69e451a01e92ff37bc26805b0bbf3663f60575] loplugin:duplicate-defines
git bisect bad 5f69e451a01e92ff37bc26805b0bbf3663f60575
# bad: [e59bbb72b1145e4865742c5f03d9372a177b9df9] Resolves: tdf#129484 just install decimal key handler for spinbuttons
git bisect bad e59bbb72b1145e4865742c5f03d9372a177b9df9
# good: [fdb5ce011cb043475869d0b607ea25b8f32b4314] tdf#108458 Show tooltips in CuiConfigFunctionListBox
git bisect good fdb5ce011cb043475869d0b607ea25b8f32b4314
# bad: [011dda766ec850d99783c0bd90a5c535c5d113c4] destroy SystemChildWindow at the right stage
git bisect bad 011dda766ec850d99783c0bd90a5c535c5d113c4
# good: [aa59b0983061d344224986aa044b6ebd3ca218af] drawinglayer: improve more EMF+ logging
git bisect good aa59b0983061d344224986aa044b6ebd3ca218af
# bad: [2d46f14fa0ef555069795bd4e889b6871e7ce943] drawinglayer: better logging for brushes in EmfPlusRecordTypeDrawString
git bisect bad 2d46f14fa0ef555069795bd4e889b6871e7ce943
# bad: [1bd303a4c38a1bc04c3cf7bf0e7a44ac0fdb209d] drawinglayer: improve pen object logging
git bisect bad 1bd303a4c38a1bc04c3cf7bf0e7a44ac0fdb209d
# first bad commit: [1bd303a4c38a1bc04c3cf7bf0e7a44ac0fdb209d] drawinglayer: improve pen object logging

Commit is:

commit 1bd303a4c38a1bc04c3cf7bf0e7a44ac0fdb209d (HEAD, refs/bisect/bad)
Author: Chris Sherlock <chris.sherlock79@gmail.com>
Date:   Tue Dec 10 03:04:18 2019 +1100

    drawinglayer: improve pen object logging
    
    Change-Id: Iaae081ddee8097346000b7c2d987a2321d5e98cd
    Reviewed-on: https://gerrit.libreoffice.org/84833
    Tested-by: Jenkins
    Reviewed-by: Bartosz Kosiorek <gang65@poczta.onet.pl>


I'm going to be quite frank here - I can't see how. This was merely me updating logging, and defining some constants to make the code easier to work with!

But we can mark this as resolved.
Comment 15 Cesar Eduardo Barros 2019-12-30 15:05:01 UTC
(In reply to Chris Sherlock from comment #14)
> I did a reverse bisect, looks like I did inadvertently fix this. 
> 
> Commit is:
> 
> commit 1bd303a4c38a1bc04c3cf7bf0e7a44ac0fdb209d (HEAD, refs/bisect/bad)
> Author: Chris Sherlock <chris.sherlock79@gmail.com>
> Date:   Tue Dec 10 03:04:18 2019 +1100
> 
>     drawinglayer: improve pen object logging
>     
>     Change-Id: Iaae081ddee8097346000b7c2d987a2321d5e98cd
>     Reviewed-on: https://gerrit.libreoffice.org/84833
>     Tested-by: Jenkins
>     Reviewed-by: Bartosz Kosiorek <gang65@poczta.onet.pl>
> 
> 
> I'm going to be quite frank here - I can't see how. This was merely me
> updating logging, and defining some constants to make the code easier to
> work with!
> 
> But we can mark this as resolved.

Looking very carefully at that commit, it does have a functional change. The "pen has a custom dash line" case was changed from EmfPlusPenDataDashedLine (0x00000100) to EmfPlusPenDataMiterLimit (0x00000010).

So my guess is that the bug was in that "pen has a custom dash line" case, and this change made this particular EMF no longer reach that case. I know nothing about EMF, so I have no idea whether this change was correct or not (and if it was incorrect, fixing it might expose this high memory usage bug again).
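
To make the point about the two masks concrete, here is a minimal sketch. The two constant names and values are the ones quoted in this comment; the helper entersDashBranch and the sample flag value dashedOnlyPen are hypothetical and exist only for illustration, they are not taken from the LibreOffice sources.

```cpp
#include <cassert>
#include <cstdint>

// Mask values as quoted in this comment.
constexpr std::uint32_t EmfPlusPenDataMiterLimit = 0x00000010;
constexpr std::uint32_t EmfPlusPenDataDashedLine = 0x00000100;

// Hypothetical helper with the same shape as `penDataFlags & mask`.
constexpr bool entersDashBranch(std::uint32_t penDataFlags, std::uint32_t mask)
{
    return (penDataFlags & mask) != 0;
}

// Hypothetical pen whose only set flag is the dashed-line bit.
constexpr std::uint32_t dashedOnlyPen = EmfPlusPenDataDashedLine;

// The two masks are different single bits, so they never overlap.
static_assert((EmfPlusPenDataDashedLine & EmfPlusPenDataMiterLimit) == 0,
              "masks share no bits");
// Tested against the dashed-line mask, the pen takes the branch...
static_assert(entersDashBranch(dashedOnlyPen, EmfPlusPenDataDashedLine),
              "dashed pen matches the dashed-line mask");
// ...tested against the miter-limit mask, the same pen skips it.
static_assert(!entersDashBranch(dashedOnlyPen, EmfPlusPenDataMiterLimit),
              "dashed pen does not match the miter-limit mask");
```

If this reading is right, a pen that only carries the dashed-line flag would silently stop reaching the dash-handling code after the change, which would hide the memory problem rather than fix it.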
Comment 16 Timur 2020-01-01 17:15:37 UTC
Please explain how this can be fixed by a commit from Dec 10 when I still see a "Not Responding" LO and system instability with a build from Dec 20?
Comment 17 V Stuart Foote 2020-01-01 17:38:15 UTC
@Timur, no, check the commit date, not the author date--it went in on the 20th.

Was OK with a build of master/6.5.0 from the 21st, and onward.

But, as Cesar notes it could still be an issue if we are just not parsing a dashed line.  Looking at a stacktrace dump while hung up in rendering--it was happening while parsing a dashed line.

Assume Chris will have a look at Cesar's note.
Comment 18 Timur 2020-01-03 13:40:14 UTC
Thank you, this is an improvement: no "Not Responding" anymore.
Fine without OpenGL.

I still can't get past slide 19 (that is slide 2 in my 3-slide sample) and see slide 20 if OpenGL is enabled (regardless of hardware acceleration); the presentation just ends.
Comment 19 Timur 2020-01-03 14:21:27 UTC
(In reply to Timur from comment #18)
> I still can't get past slide 19 (that is slide 2 in my 3-slide sample) and
> see slide 20 if OpenGL is enabled (regardless of hardware acceleration);
> the presentation just ends.

Actually, if I wait long enough and do not click again to advance the slide, I may see slide 20.
That wrong behavior with OpenGL seems to have started in 6.0; it was good in 5.4.7.
Looks like a separate bug.
Comment 20 Xisco Faulí 2020-01-07 10:44:07 UTC
(In reply to Chris Sherlock from comment #14)
> 
> Commit is:
> 
> commit 1bd303a4c38a1bc04c3cf7bf0e7a44ac0fdb209d (HEAD, refs/bisect/bad)
> Author: Chris Sherlock <chris.sherlock79@gmail.com>
> Date:   Tue Dec 10 03:04:18 2019 +1100
> 
>     drawinglayer: improve pen object logging
>     
>     Change-Id: Iaae081ddee8097346000b7c2d987a2321d5e98cd
>     Reviewed-on: https://gerrit.libreoffice.org/84833
>     Tested-by: Jenkins
>     Reviewed-by: Bartosz Kosiorek <gang65@poczta.onet.pl>
> 
> 
> I'm going to be quite frank here - I can't see how. This was merely me
> updating logging, and defining some constants to make the code easier to
> work with!
> 
> But we can mark this as resolved.

I do confirm the issue is fixed by the mentioned commit.
However, I'm a bit concerned about Cesar's point in comment 15.
The condition was changed from

else if (pen->penDataFlags & 0x00000100) // pen has a custom dash line

to 

const sal_uInt32 EmfPlusPenDataMiterLimit = 0x00000010;
else if (pen->penDataFlags & EmfPlusPenDataMiterLimit) // pen has a custom dash line

which seems wrong to me. I believe it should be

else if (pen->penDataFlags & EmfPlusPenDataDashedLine) // pen has a custom dash line

@Chris ?
Comment 21 V Stuart Foote 2020-01-12 16:52:41 UTC
Careful here...

The bibisect for bug 129816 shows the issue likely came in at 6.0 with
https://cgit.freedesktop.org/libreoffice/core/commit/?id=ebc11ae0b132eefd3b1b1a837a8d0ad3ba73b460 

Careful, as it may not actually be 'fixed' and may just be masked (a class of dashed lines not being rendered) in current masters > 2019-12-20 by
the commit that Chris's reverse bibisect identified: https://cgit.freedesktop.org/libreoffice/core/commit/?id=1bd303a4c38a1bc04c3cf7bf0e7a44ac0fdb209d


A WinDbg trace (master at 6.4 branch not OpenGL rendered) just opening the EMF+ shows it chews memory parsing a dashed line:

0a 00000071`0eb8c8e0 00007fff`7bc6ea07 ucrtbase!_malloc_base+0x36
0b 00000071`0eb8c910 00007fff`795da256 mergedlo!xstor_component_getFactory+0x1af287
0c 00000071`0eb8c940 00007fff`795d874e mergedlo!basegfx::B2DPolygon::operator!=+0x616
0d 00000071`0eb8c970 00007fff`795df63f mergedlo!basegfx::utils::createAreaGeometryForLineStartEnd+0x379e
0e 00000071`0eb8c9b0 00007fff`795db619 mergedlo!basegfx::B2DPolygon::setNextControlPoint+0x19f
0f 00000071`0eb8ca00 00007fff`795eaece mergedlo!basegfx::B2DPolygon::appendBezierSegment+0x1a9
10 00000071`0eb8ca90 00007fff`79a07c64 mergedlo!basegfx::utils::applyLineDashing+0x36e
11 00000071`0eb8cc70 00007fff`799ebfb6 mergedlo!drawinglayer::primitive2d::PolygonStrokePrimitive2D::create2DDecomposition+0xe4
12 00000071`0eb8cd80 00007fff`79a3d7db mergedlo!drawinglayer::primitive2d::BufferedDecompositionPrimitive2D::get2DDecomposition+0xa6
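
The trace above points at dash generation as the place where memory is being allocated. As a rough illustration only (this is not the basegfx implementation, and nothing in this report establishes the actual root cause), the sketch below assumes that the number of pieces produced by dashing grows with the line length divided by the length of one dash-pattern period. The function name estimateDashPieces and all sample numbers are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical estimate of how many dash/gap pieces a stroke of the given
// length is cut into. Returns 0 for an empty or degenerate pattern
// (non-positive period) or a non-positive length.
std::size_t estimateDashPieces(double lineLength, const std::vector<double>& dashPattern)
{
    const double period = std::accumulate(dashPattern.begin(), dashPattern.end(), 0.0);
    if (dashPattern.empty() || !(period > 0.0) || !(lineLength > 0.0))
        return 0;

    // Each full or partial period contributes up to one piece per entry.
    const double periods = std::ceil(lineLength / period);

    // Saturate instead of overflowing when the pattern is extremely small.
    const double limit = static_cast<double>(std::numeric_limits<std::size_t>::max())
                         / static_cast<double>(dashPattern.size());
    if (periods >= limit)
        return std::numeric_limits<std::size_t>::max();

    return static_cast<std::size_t>(periods) * dashPattern.size();
}
```

Under that assumption, a 1000-unit stroke with a {3, 1} pattern gives 500 pieces, while shrinking the pattern to {0.25, 0.25} gives 4000. A very small pattern on a long path would therefore produce a very large number of polygons, which would be one plausible way to end up with the allocation pattern seen in the trace.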
Comment 22 Commit Notification 2020-02-17 19:31:55 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2cdab992c6faeb98a1cbef7615f4cc7ce0d3f04d

tdf#129675: This should be EmfPlusPenDataDashedLine

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 23 V Stuart Foote 2020-02-17 20:39:05 UTC
Kind of expect that Armin's work [1] on bug 130478 will end up the real fix for this dashed line mishandling in EMF+. 

Will check when a build with both rolls...

=-ref-=
[1] https://gerrit.libreoffice.org/c/core/+/88463
Comment 24 V Stuart Foote 2020-03-07 08:36:26 UTC
Got a chance to test this with tb77 back up posting Windows builds. 

No hang with current master, but the dashed arcs are not being drawn dashed. So not quite correct yet.

See attachment 158461 [details]

=-testing-=

Testing Windows 10 64-bit en-US (1909) nVidia GTX 750ti
Version: 7.0.0.0.alpha0+ (x64)
Build ID: 10e20a77ce302a0475a661ad1886f2ca83c55f3f
CPU threads: 8; OS: Windows 10.0 Build 18363; UI render: Skia/Raster (or Vulkan); VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: CL
Comment 25 V Stuart Foote 2020-03-10 13:33:56 UTC
(In reply to V Stuart Foote from comment #24)

> No hang with current master, but the dashed arcs are not being drawn dashed.
> So not quite correct yet.
> 

On bug 130478 Armin noted rendering differs between vcl backends. 

Paid closer attention and opened attachment 156837 [details] in each of the rendering modes.

So while it does not hang nor produce dashed lines with default GDI (CPU or Hardware Accelerated), or with skia (Vulkan or Raster) -- it is *again choking* with still available OpenGL rendering. 

I guess that is expected? Dashed line handling for bug 130478 has not been switched in and OpenGL rendering is using the prior mishandled (recursive) dashed line rendering?
Comment 26 Timur 2020-09-22 08:23:26 UTC
Created attachment 165756 [details]
Sample PPTX slide 19 compared

After bug 136836 I was retesting this and I can't say what exactly is the problem. 
I attach slide 19 compared.
Please explain. 
And for reporter, please retest with master from https://dev-builds.libreoffice.org/daily/master/current.html.
Comment 27 Timur 2020-09-22 08:23:54 UTC
Created attachment 165757 [details]
Sample PPTX slide 20 compared
Comment 28 Cesar Eduardo Barros 2020-09-24 12:41:16 UTC
(In reply to Timur from comment #26)
> Created attachment 165756 [details]
> Sample PPTX slide 19 compared
> 
> After bug 136836 I was retesting this and I can't say what exactly is the
> problem. 
> I attach slide 19 compared.
> Please explain. 

This is a performance (memory use) issue, so a visual comparison should not show any difference.