Bug 160590 - Impress crashes with skia Metal enabled, skia raster software rendering works (MacOS Monterey (12.7.4) w/Intel HD Graphics 6000)
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: graphics stack
Version: 24.2.1.2 release (earliest affected)
Hardware: x86-64 (AMD64) macOS (All)
Importance: medium normal
Assignee: Patrick (volunteer)
URL:
Whiteboard: target:24.8.0 target:24.2.3.2
Keywords:
Depends on:
Blocks: Skia-macOS
 
Reported: 2024-04-08 20:07 UTC by Giuseppe S.
Modified: 2024-06-17 11:31 UTC

See Also:
Crash report or crash signature:


Attachments
Apple crash report (57.75 KB, text/plain)
2024-04-08 20:07 UTC, Giuseppe S.
Apple Activity Monitor sampling 2024-04-13 LibreOfficeDev, internal LCD (505.52 KB, text/plain)
2024-04-13 19:46 UTC, Giuseppe S.

Description Giuseppe S. 2024-04-08 20:07:52 UTC
Created attachment 193575 [details]
Apple crash report

Hello,
I am trying all the LibreOffice options in order to get more performance on my MacBook Air. Every time I enable Skia and turn off software rendering, the application crashes.

These are my hardware details:
Model: MacBookAir7,2, BootROM 489.0.0.0.0, 2 processors, Dual-Core Intel Core i5, 1,8 GHz, 8 GB, SMC 2.27f2
Graphics: Intel HD Graphics 6000, Intel HD Graphics 6000, Built-In
Display: Color LCD, 1440 x 900 (Widescreen eXtended Graphics Array Plus), Main, MirrorOff, Online


The Apple crash report (attached to this bug) displays this stack trace:

thread 0::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	    0x7ff80f99993a mach_msg_trap + 10
1   libsystem_kernel.dylib        	    0x7ff80f999ca8 mach_msg + 56
2   IOKit                         	    0x7ff812389a7c io_connect_method + 387
3   IOKit                         	    0x7ff812389889 IOConnectCallMethod + 186
4   IOAccelerator                 	    0x7ff81837898f ioAccelResourceFinalize + 165
5   CoreFoundation                	    0x7ff80fb6a046 _CFRelease + 244
6   Metal                         	    0x7ff8183a7e62 -[MTLIOAccelResource dealloc] + 278
7   Metal                         	    0x7ff8183a7d3f -[MTLIOAccelBuffer dealloc] + 333
8   libskialo.dylib               	       0x10cae954b GrMtlBuffer::onRelease() + 43
9   libskialo.dylib               	       0x10c9e8a3f GrGpuResource::release() + 15
10  libskialo.dylib               	       0x10c9fb647 GrResourceCache::notifyARefCntReachedZero(GrGpuResource*, GrIORef<GrGpuResource>::LastRemovedRef) + 567
[...]

Thank you,
Giuseppe
Comment 1 V Stuart Foote 2024-04-08 21:48:28 UTC
Meaning probably that Skia Metal accelerated graphics are not supported on your GPU. Actually rather common and why the raster framed "software" rendering is fallback/default.

IMHO if the Skia vector GPU rendering fails, NOT OUR BUG. Hardware, Driver or a mix of the two.
Comment 2 Patrick (volunteer) 2024-04-08 23:07:12 UTC
(In reply to V Stuart Foote from comment #1)
> Meaning probably that Skia Metal accelerated graphics are not supported on
> your GPU. Actually rather common and why the raster framed "software"
> rendering is fallback/default.

So far, we only disable Skia/Metal (i.e. force Skia/Raster) for the following 2 GPU families:

AMD Radeon Pro 5300M
AMD Radeon Pro 5500M

Reading the crash log in attachment #193575 [details], it is LibreOffice's "Skia hang detector" code that is forcing LibreOffice to crash. That code monitors batches of calls into Skia and, if a batch doesn't finish within a few seconds or so, that code forces LibreOffice to crash.

So, I assume that Skia/Metal has one or more very, very slow or hanging operations on the Intel HD Graphics 6000 GPU. Should I add that to the above "disable Skia/Metal" list?
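
As an illustration of the hang detection described above, here is a minimal C++ sketch of a watchdog around a batch of calls. It is not LibreOffice's actual implementation: the function name `runWithHangDetector`, its parameters, and the `onHang` callback are hypothetical, and a real detector would abort the process where this sketch invokes the callback.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical watchdog: runs `batch` on the calling thread while a helper
// thread waits. If the batch has not finished within `timeout`, the helper
// calls `onHang` (a real detector would force a crash at that point).
// Returns true if the batch finished in time.
bool runWithHangDetector(const std::function<void()>& batch,
                         std::chrono::milliseconds timeout,
                         const std::function<void()>& onHang)
{
    std::mutex mutex;
    std::condition_variable finished;
    bool done = false;
    bool hung = false;

    std::thread watchdog([&] {
        std::unique_lock<std::mutex> lock(mutex);
        // wait_for returns false if the timeout expired while `done` is still false.
        if (!finished.wait_for(lock, timeout, [&] { return done; }))
        {
            hung = true;
            lock.unlock();
            onHang();
        }
    });

    batch();

    {
        std::lock_guard<std::mutex> lock(mutex);
        done = true;
    }
    finished.notify_one();
    watchdog.join();
    return !hung;
}
```

With a design like this, a single very slow GPU operation is enough to trip the timeout, which would be consistent with the crash log showing the main thread still inside a low-level macOS call.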
Comment 3 Giuseppe S. 2024-04-09 06:36:25 UTC Comment hidden (off-topic)
Comment 4 V Stuart Foote 2024-04-09 11:25:49 UTC
(In reply to Patrick Luby (volunteer) from comment #2)
> 
> So far, we only disable Skia/Metal (i.e. force Skia/Raster) for the following 2 GPU families:
> 
> AMD Radeon Pro 5300M
> AMD Radeon Pro 5500M
> 
> Reading the crash log in attachment #193575 [details], it is LibreOffice's
> "Skia hang detector" code that is forcing LibreOffice to crash. That code
> monitors batches of calls into Skia and, if a batch doesn't finish within a
> few seconds or so, that code forces LibreOffice to crash.
> 
> So, I assume that Skia/Metal has one or more very, very slow or hanging
> operations on the Intel HD Graphics 6000 GPU. Should I add that to the above
> "disable Skia/Metal" list?

Yes, it seems appropriate to add an Intel HD Graphics 6000 stanza to the force-Skia/Raster test in vcl/quartz/cgutils.mm (i.e. to deny Skia/Metal).

But is there any reason not to use the OpenGL-era denylist [1] for macOS Skia/Metal as well?

On macOS, do we not test Skia Vulkan/Metal for the needed driver and device signatures and record them in skia.log in the LibreOffice user profile?

=-ref-=
[1] https://opengrok.libreoffice.org/xref/core/vcl/skia/skia_denylist_vulkan.xml
Comment 5 V Stuart Foote 2024-04-09 11:27:32 UTC Comment hidden (off-topic)
Comment 6 Julien Nabet 2024-04-09 11:32:59 UTC
In vcl/skia/skia_denylist_vulkan.xml, line 12 indicates:
os - "all", "7", "8", "8_1", "10", "windows", "linux", "osx_10_5", "osx_10_6", "osx_10_7", "osx_10_8", "osx"

so macOS should be taken into account.

But indeed, we need more info from skia.log to disable precisely these cards (see https://wiki.documentfoundation.org/QA/FirstSteps#Graphics-related_issues_(_Skia_))
Comment 7 V Stuart Foote 2024-04-09 11:57:32 UTC
(In reply to Julien Nabet from comment #6)

Dev's choice as always, so I assume Patrick had a reason for handling it in the quartz VCL source [1]. I just wanted to bring up the possibility that the common deny list handling might be better for mixed sets of macOS GPU devices and CPUs. As can be seen, the os listing is a holdover from the OpenGL era that Tor and Lubos had adapted.

Could be the macOS builds don't even extract the Skia Vulkan/Metal device and driver strings in a usable format to test via the common deny list.

=-ref-=
[1] https://opengrok.libreoffice.org/xref/core/vcl/quartz/cgutils.mm?r=fe3fa169#109
Comment 8 Patrick (volunteer) 2024-04-09 13:01:37 UTC
(In reply to V Stuart Foote from comment #7)
> dev's choice as always, so assume Patrick had a reason for handling it in
> the quartz VCL source [1]. Just wanted to bring up possibility that the
> common deny list handling might be better for mixed sets of macOS GPU
> devices and CPUs. As can be seen the os listing is a hold over from OpenGL
> era that Tor and Lubos had adapted.

I used a simple list because on macOS, Apple controls the hardware configuration (i.e. a walled garden), so there aren't that many types of GPUs in Macs and each GPU is matched to a specific type of CPU (Intel or Apple Silicon). For example, Apple only shipped the Radeon GPUs currently in the list in certain models of Intel MacBook Pros. Likewise, Apple only shipped the Intel HD Graphics 6000 GPU (which was integrated into the CPU) in 2015 and 2017 models of Intel MacBook Airs. No issues have been reported so far with any Apple Silicon Mac GPUs.

If we add the Intel HD Graphics 6000 GPU, we are up to a total of 3 GPUs that will be checked with one macOS Metal call.
Comment 9 V Stuart Foote 2024-04-09 13:12:51 UTC
(In reply to V Stuart Foote from comment #7)

> Could be the macOS builds don't even extract the Skia Vulkan/Metal device
> and driver strings in a usable format to test via the common deny list.

In fact the log file for SK_METAL [1] is just "RenderMethod metal" 

And IIUC none of the skia VkPhysicalDeviceProperties 'props' for macOS skia Metal are being captured.  

The SK_METAL would need new assignments similar to the SK_VULKAN [2] to use the OpenGL era common denylist.

=-ref-=
[1] https://opengrok.libreoffice.org/xref/core/vcl/skia/SkiaHelper.cxx?a=true&r=3c8af232&h=201
[2] https://opengrok.libreoffice.org/xref/core/vcl/skia/SkiaHelper.cxx?a=true&r=3c8af232&h=130
Comment 10 V Stuart Foote 2024-04-09 13:42:34 UTC
Looks like the Metal MTLArchitecture [1] class provides something similar to Vulkan's VkPhysicalDeviceProperties [2], but I am not sure it can be shoehorned into a common deny list.

METAL API ≠ Vulkan API and the available details for Metal don't really line up to use the Vulkan centric strings. 

Any inclination toward cross platform maintenance to try to force it by using the Vulkan SDK? Or maybe best to keep to generic Metal on macOS to control our skia rendering.

=-ref-=

[1] https://developer.apple.com/documentation/metal/mtlarchitecture
[2] https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkPhysicalDeviceProperties.html
Comment 11 Patrick (volunteer) 2024-04-09 15:36:51 UTC
I don't know where this bug went off the rails. Seems like it got hijacked about adding logging.

So, to set expectations, I am a volunteer and this is just my hobby. I am interested in stopping the crashing so when I get some time, I'll add the Intel HD Graphics 6000 GPU to the existing macOS deny list code. But I have zero time or interest in implementing Skia logging to match Windows and Linux.

@Giuseppe S. Can you do the following steps? These steps will print out the exact name that Metal uses for your GPU:

1. Download and install the latest nightly master build from the following URL:

https://dev-builds.libreoffice.org/daily/master/current.html

2. The nightly master builds install in /Applications/LibreOfficeDev.app. These builds are not codesigned like regular LibreOffice releases so you will need to execute the following Terminal command after installation but before you launch /Applications/LibreOfficeDev:

xattr -d com.apple.quarantine /Applications/LibreOfficeDev.app

3. Instead of launching from the Finder, run LibreOfficeDev.app from the Terminal by copying the following line into the Terminal and pressing the Return key:

/Applications/LibreOfficeDev.app/Contents/MacOS/soffice

4. In the Terminal, a message similar to the following line will be printed. 

warn:vcl.skia:1225:20083:vcl/quartz/cgutils.mm:105: Default MTLDevice is "Apple M1 Pro"

5. If you don't see the above line, you are using Skia/Raster so change to Skia/Metal and restart. Just before the crash, you should see the above line.

6. Paste the message into this bug.
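
For anyone collecting these messages from several machines, the device name can be pulled out of the log line mechanically. The following C++ sketch assumes the line format shown in step 4; the helper name `extractMetalDeviceName` is hypothetical and not part of LibreOffice.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of LibreOffice): extracts the quoted device
// name from a line such as
//   warn:vcl.skia:...:vcl/quartz/cgutils.mm:105: Default MTLDevice is "Apple M1 Pro"
// Returns std::nullopt if the marker text is missing or the name is not
// closed by a double quote.
std::optional<std::string> extractMetalDeviceName(std::string_view line)
{
    constexpr std::string_view marker = "Default MTLDevice is \"";
    const auto start = line.find(marker);
    if (start == std::string_view::npos)
        return std::nullopt;
    const auto nameBegin = start + marker.size();
    const auto nameEnd = line.find('"', nameBegin);
    if (nameEnd == std::string_view::npos)
        return std::nullopt;
    return std::string(line.substr(nameBegin, nameEnd - nameBegin));
}
```

The sketch works on a single unwrapped line, so a message that the Terminal wrapped across two lines would need to be rejoined before parsing.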
Comment 12 V Stuart Foote 2024-04-09 19:23:57 UTC
@Patrick, sorry if I wasn't clear. Comment 10 was for you, and really just thinking out loud.

Not tasking in any sense. And apologize if it seems a hijacking. You've moved things along for all Skia implementations, not just macOS Metal and we do appreciate it, thanks!

Julien and I have played "whack-a-mole" with Skia rendering issues and the deny listing process to keep users with marginally supported hardware/drivers off Vulkan rendering.

So his comment was fair, but neither of us had looked at the macOS side--I did and posted the links with details.

Absent some refactoring, I don't think we can use the same common deny listing as Vulkan rendering.
Comment 13 Giuseppe S. 2024-04-10 06:06:59 UTC
Here it is:

warn:vcl.skia:53286:1498386:vcl/quartz/cgutils.mm:105: Default MTLDevice is "Intel(R) Iris(TM) Graphics 6000"

Bye,
Giuseppe

P.S. There is a blank space where the line wrapped.
Comment 14 Patrick (volunteer) 2024-04-10 12:12:09 UTC
(In reply to Giuseppe S. from comment #13)
> Here it is:
> 
> warn:vcl.skia:53286:1498386:vcl/quartz/cgutils.mm:105: Default MTLDevice is
> "Intel(R) Iris(TM) Graphics 6000"
> 
> Bye,
> Giuseppe
> 
> P.S. There is a blank space where the line wrapped.

Thank you for posting the MTLDevice. I have submitted a fix that should cause LibreOffice to automatically switch to Skia/Raster mode if you enable Skia/Metal and have this GPU.

I will post again when I know which nightly master build will include the fix so you can test my fix. If my fix passes all automated tests, it should be in tomorrow's (11 April 2024) nightly build.
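
The kind of name-based check discussed in this bug can be sketched in plain C++ as follows. This is an assumption-laden illustration, not the committed code: the helper `isMetalDenied` and the list `kDeniedMetalDevices` are hypothetical, the real check lives in vcl/quartz/cgutils.mm and queries Metal for the device name, and the exact substrings it matches are not shown in this thread.

```cpp
#include <array>
#include <string_view>

// Illustrative list only: the first two names come from comment 2 and the
// third from the MTLDevice name reported in comment 13. The substrings that
// the real check matches are an assumption here.
constexpr std::array<std::string_view, 3> kDeniedMetalDevices = {
    "Radeon Pro 5300M",
    "Radeon Pro 5500M",
    "Iris(TM) Graphics 6000",
};

// Hypothetical helper: returns true if Metal should be avoided for this
// device name, in which case the caller would fall back to Skia/Raster.
bool isMetalDenied(std::string_view deviceName)
{
    for (std::string_view denied : kDeniedMetalDevices)
        if (deviceName.find(denied) != std::string_view::npos)
            return true;
    return false;
}
```

Matching on a substring rather than the full name is a design choice in this sketch; it tolerates vendor prefixes such as "AMD" or "Intel(R)" at the cost of possibly matching more devices than intended.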
Comment 15 V Stuart Foote 2024-04-10 12:57:56 UTC
See https://gerrit.libreoffice.org/c/core/+/165927

Thanks!
Comment 16 Commit Notification 2024-04-10 15:59:44 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/fe3a4bdf48f7b2d4f6da31b4392ac5979653cf9c

tdf#160590 Disable Metal with Intel HD Graphics 6000

It will be available in 24.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 17 Patrick (volunteer) 2024-04-11 14:58:22 UTC
I have committed a fix for this bug. The fix should be in today's (11 April 2024) nightly master builds:

https://dev-builds.libreoffice.org/daily/master/current.html

Note for macOS testers: the nightly master builds install in /Applications/LibreOfficeDev.app. These builds are not codesigned like regular LibreOffice releases so you will need to execute the following Terminal command after installation but before you launch /Applications/LibreOfficeDev:

xattr -d com.apple.quarantine /Applications/LibreOfficeDev.app
Comment 18 Patrick (volunteer) 2024-04-12 00:07:28 UTC
Hi Giuseppe S. Can you confirm that the latest nightly master build no longer crashes? If it works for you, I will submit it for inclusion in the next LibreOffice 24.2 release.

Install the latest nightly master build using the steps in comment #17 and launch LibreOfficeDev.app from the Finder. Check in the LibreOffice Options dialog that you have set the LibreOffice settings to Skia/Metal (first checked, second unchecked). Restart if necessary.

If my fix is working, the LibreOffice About dialog should display Skia/Raster and the Options dialog should still be set to Skia/Metal.
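
The expected behaviour (the Options dialog keeps the user's Skia/Metal choice while the About dialog reports Skia/Raster) amounts to separating the stored preference from the method actually used. A minimal C++ sketch of that separation, using hypothetical names (`RenderMethod`, `effectiveRenderMethod`, `toLogName`) that are not LibreOffice's API:

```cpp
#include <string_view>

// Hypothetical types for illustration; not LibreOffice's actual API.
enum class RenderMethod { Raster, Metal };

// The stored preference is left untouched; only the method actually used is
// downgraded when the default Metal device is on the deny list.
RenderMethod effectiveRenderMethod(bool forceRasterOption, bool metalDeviceDenied)
{
    if (forceRasterOption || metalDeviceDenied)
        return RenderMethod::Raster;
    return RenderMethod::Metal;
}

// Name as it might appear in a "RenderMethod" line of the Skia log.
constexpr std::string_view toLogName(RenderMethod method)
{
    return method == RenderMethod::Raster ? "raster" : "metal";
}
```

Under these assumptions, a user with "Force Skia software rendering" unchecked and a denied GPU would still end up with the raster method.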
Comment 19 Giuseppe S. 2024-04-13 09:53:00 UTC
Hello,
I just tried nightly build 2024-04-13 (24.8.0.0.alpha0+) and it did
not crash, albeit without any hardware acceleration, Impress is almost
unusable on this MacBook Air of 2017.

Now with option "Use Skia for all rendering" and without option "Force
Skia software rendering", the Skia log contains:

RenderMethod: raster
Compiler: Clang

Bye,
Giuseppe
Comment 20 Patrick (volunteer) 2024-04-13 14:06:07 UTC
(In reply to Giuseppe S. from comment #19)
> I just tried nightly build 2024-04-13 (24.8.0.0.alpha0+) and it did
> not crash, albeit without any hardware acceleration, Impress is almost
> unusable on this MacBook Air of 2017.

I wonder if there is some limited resource (CPU, memory, etc.) that LibreOffice is competing for. Your original crash log indicated that the macOS Metal code was blocked in a low-level macOS function.

Are you familiar with the /Applications/Utilities/Activity Monitor application? If yes, can you attach a sample of the LibreOfficeDev application while using Impress? That might give me an idea of what LibreOffice code is running when you see the slowness.
Comment 21 Commit Notification 2024-04-13 19:13:52 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "libreoffice-24-2":

https://git.libreoffice.org/core/commit/acb6430800fccd120765110a3822c422fbc9a19d

tdf#160590 Disable Metal with Intel HD Graphics 6000

It will be available in 24.2.4.

Comment 22 Giuseppe S. 2024-04-13 19:46:21 UTC
Created attachment 193665 [details]
Apple Activity Monitor sampling 2024-04-13 LibreOfficeDev, internal LCD
Comment 23 Giuseppe S. 2024-04-13 19:51:44 UTC
Hello,
it seems I found different behaviours when using the internal LCD (smaller) or the external (larger) monitor.
Using the external one, the slides preview fits on the screen, while on the internal one it is clipped and has to be scrolled. Using the external monitor, working in Impress never blocks; using the internal one, it blocks very often with the "loading spinning" pointer (I don't know the correct name of this pointer).

The sample I attached was taken using the latest nightly build on the internal LCD.

Thank you,
Giuseppe
Comment 24 Patrick (volunteer) 2024-04-13 20:47:11 UTC
(In reply to Giuseppe S. from comment #23)
> it seems I found different behaviours when using the internal LCD (smaller)
> or the external (larger) monitor.
> Using the external one, the slides preview fits on the screen, while on the
> internal one, it is clipped and should be scrolled. Using the external
> monitor, working on Impress never blocks; using the internal one, it blocks
> very often with the "loading spinning" pointer (I don't know the correct name
> of this pointer).

Thank you for the sample. Your sample does not show any blocking that I can see. But it shows a very large number of drawing operations. I can't see a pattern, but it seems that LibreOffice is drawing all sorts of different parts of your document and then copying the drawn bitmaps to the screen.

Is it possible to obtain another sample when the spinning pointer appears? It would be best if you are able to obtain a sample when the spinning pointer appears for at least a few seconds after you press the Sample button in Activity Monitor.

One more question: does the spinning pointer occur with Skia/Raster and LibreOffice 24.2? I am wondering if a recent change in the LibreOffice code is the cause of what you see with your internal display.
Comment 25 Commit Notification 2024-04-26 12:10:31 UTC
Patrick Luby committed a patch related to this issue.
It has been pushed to "libreoffice-24-2-3":

https://git.libreoffice.org/core/commit/f7579474089bfc67f7d83680d562efa17783e4b4

tdf#160590 Disable Metal with Intel HD Graphics 6000

It will be available in 24.2.3.
