Bug 158129 - Bluescreen on Windows 11 when LibreOffice is running (with or without open document) OpenCL with iGPU (Intel) and dGPU (nVidia)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 7.6.2.1 release (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: OpenCL
Reported: 2023-11-09 10:33 UTC by Helmut Steiner
Modified: 2024-03-03 15:56 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
skia.log (37 bytes, text/plain)
2023-11-10 13:12 UTC, Helmut Steiner
opencl_devices.log (5.22 KB, text/plain)
2023-11-15 13:16 UTC, Helmut Steiner

Description Helmut Steiner 2023-11-09 10:33:51 UTC
Description:
LibreOffice regularly produces a Bluescreen on Dell XPS 15 9530
Windows 11 Pro Version 10.0.22631 Build 22631

I get the bluescreen several times a day, but only while LibreOffice is open. No particular steps are needed to reproduce the issue. It happens only on this laptop, not on other laptops running Windows 11 Pro Version 10.0.22631 Build 22631.

The bluescreen appears with or without opened documents even in the base LibreOffice application.

Steps to Reproduce:
1. Open LibreOffice and work as usual
2. LibreOffice will freeze for a minute or more
3. Other programs freeze as well
4. BlueScreen appears and Laptop restarts

Actual Results:
Crashes Windows

Expected Results:
Runs without crashing Windows


Reproducible: Sometimes


User Profile Reset: No

Additional Info:
Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 56f7684011345957bbf33a7ee678afaf4d2ba333
CPU threads: 20; OS: Windows 10.0 Build 22631; UI render: Skia/Raster; VCL: win
Locale: de-AT (de_AT); UI: en-US
Calc: CL threaded
Comment 1 V Stuart Foote 2023-11-09 13:29:12 UTC
Sorry for your issues.

Notice you've fallen back to raster mode, so this could be a graphics driver issue. Is the system stable if you force it to not use Skia/Vulkan and use just the Skia raster rendering?

In Tools -> Options -> View, check *both* Skia boxes. Alternatively, you can directly edit the stanzas in the user profile at %APPDATA%\LibreOffice\4\user\registrymodifications.xcu, where the two stanzas ("UseSkia" and "ForceSkiaRaster") would be set to 'true'.

<item oor:path="/org.openoffice.Office.Common/VCL"><prop oor:name="ForceSkiaRaster" oor:op="fuse"><value>true</value></prop></item>
<item oor:path="/org.openoffice.Office.Common/VCL"><prop oor:name="UseSkia" oor:op="fuse"><value>true</value></prop></item>
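
For anyone who would rather check the profile by script than by eye, here is a minimal sketch in Python that reads the two VCL flags out of the text of a registrymodifications.xcu file. The function name read_vcl_flags and the SAMPLE string are illustrative only. The sketch assumes the file's root element declares the oor namespace as http://openoffice.org/2001/registry and that each setting is stored as one <item> stanza like the two above.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "oor:" prefix in .xcu files (assumed, see lead-in).
OOR = "http://openoffice.org/2001/registry"
VCL_PATH = "/org.openoffice.Office.Common/VCL"


def read_vcl_flags(xcu_text):
    """Return the UseSkia / ForceSkiaRaster values found in the profile text.

    A value of None means the stanza is absent from the file.
    """
    flags = {"UseSkia": None, "ForceSkiaRaster": None}
    root = ET.fromstring(xcu_text)
    for item in root.iter("item"):
        if item.get(f"{{{OOR}}}path") != VCL_PATH:
            continue
        for prop in item.iter("prop"):
            name = prop.get(f"{{{OOR}}}name")
            value = prop.find("value")
            if name in flags and value is not None:
                flags[name] = (value.text or "").strip() == "true"
    return flags


# Hypothetical, cut-down profile containing only one of the two stanzas.
SAMPLE = """<oor:items xmlns:oor="http://openoffice.org/2001/registry">
<item oor:path="/org.openoffice.Office.Common/VCL"><prop oor:name="ForceSkiaRaster" oor:op="fuse"><value>true</value></prop></item>
</oor:items>"""

print(read_vcl_flags(SAMPLE))
```

A flag reported as None means the stanza is not in the file, so the built-in default applies; that is different from an explicit 'false'.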

If blocking Vulkan Skia mode helps, we would need the GPU and display size (HD or 4K) details for the errant laptop. The GPU details would be in skia.log under LO's %APPDATA%\LibreOffice\4\cache folder when Vulkan has been enabled. The display details can be copied from a run of 'msinfo32', on the Components -> Display panel.

If blocking helps, we might denylist the driver/hardware/OS combination to avoid the problem.
Comment 2 Helmut Steiner 2023-11-09 13:55:33 UTC
Both Skia checkboxes were already selected, but the line
<item oor:path="/org.openoffice.Office.Common/VCL"><prop oor:name="UseSkia" oor:op="fuse"><value>true</value></prop></item>
was missing from the file. I have added it for now and will observe whether it crashes again.
Comment 3 QA Administrators 2023-11-10 03:14:18 UTC Comment hidden (obsolete)
Comment 4 Helmut Steiner 2023-11-10 10:22:41 UTC
Even after inserting the missing line, I still get bluescreen crashes.
Comment 5 V Stuart Foote 2023-11-10 12:56:22 UTC
OK, now set both stanzas to 'false'.

That will disable Skia rendering and fall back to GDI+.

The "UseSkia" toggle does as named and defaults to the accelerated GPU libs (Vulkan/Metal and X11/gen), while "ForceSkiaRaster" still uses the Skia libs but with raster framing only, i.e. non-accelerated.

In UseSkia-ForceSkiaRaster order:
true-false is least stable Skia but fully GPU accelerated
true-true is "more" stable Skia
false-false disables Skia

Setting both to false forces the fallback, where your GPU will use GDI+ calls only. Win11 WDM may choke there as well.
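
As a rough illustration, the three combinations above can be written as a small Python lookup. The function name render_mode is made up for this sketch. The false-true combination is not covered above, so treating it as "Skia disabled" is an assumption here, not something confirmed in this report.

```python
def render_mode(use_skia, force_skia_raster):
    """Map the two profile flags to the rendering path described above."""
    if not use_skia:
        # false-false per the list above; false-true is ASSUMED to behave
        # the same, on the guess that the raster flag is ignored without Skia.
        return "GDI+ (Skia disabled)"
    if force_skia_raster:
        return "Skia raster (non-accelerated)"
    return "Skia GPU accelerated (Vulkan on Windows)"


for combo in [(True, False), (True, True), (False, False)]:
    print(combo, "->", render_mode(*combo))
```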

In the UI there is a "Use hardware acceleration" checkbox that becomes active when the UseSkia and ForceSkiaRaster values are both set to false. IIUC, also disabling that checkbox pushes some 3D and some 2D rendering/anti-aliasing onto the CPU. That is probably not the issue; for now I just want to eliminate the Skia support as a cause of the Win11 bluescreen.

Also, we still need the skia.log and display details. And is there a particular module of LO (e.g. draw, impress, calc) that seems to trigger the issue consistently? If it is not a Skia issue, then we'll need steps to reproduce.

It is a bit of a chore to set up, but you could capture a backtrace (Win11 SDK with symbols from LO [1]), which may show events from LibreOffice before the OS bluescreen.

=-ref-=
[1] https://wiki.documentfoundation.org/How_to_get_a_backtrace_with_WinDbg
Comment 6 Helmut Steiner 2023-11-10 13:12:18 UTC
Created attachment 190783 [details]
skia.log

Skia.log only contains:
RenderMethod: raster
Compiler: Clang
Comment 7 Helmut Steiner 2023-11-10 13:14:12 UTC
Display information:
Name	NVIDIA GeForce RTX 4070 Laptop GPU
PNP Device ID	PCI\VEN_10DE&DEV_2820&SUBSYS_0BEB1028&REV_A1\4&3A6078BA&0&0008
Adapter Type	NVIDIA GeForce RTX 4070 Laptop GPU, NVIDIA compatible
Adapter Description	NVIDIA GeForce RTX 4070 Laptop GPU
Adapter RAM	(1 048 576) bytes
Installed Drivers	C:\WINDOWS\System32\DriverStore\FileRepository\nvdmsi.inf_amd64_1e4e5c0fbdb0a298\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvdmsi.inf_amd64_1e4e5c0fbdb0a298\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvdmsi.inf_amd64_1e4e5c0fbdb0a298\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvdmsi.inf_amd64_1e4e5c0fbdb0a298\nvldumdx.dll
Driver Version	31.0.15.4601
INF File	oem224.inf (Section475 section)
Color Planes	Not Available
Color Table Entries	Not Available
Resolution	Not Available
Bits/Pixel	Not Available
Memory Address	0xBE000000-0xBEFFFFFF
Memory Address	0x0000-0x1FFFFFF
IRQ Channel	IRQ 4294967254
Driver	C:\WINDOWS\SYSTEM32\DRIVERSTORE\FILEREPOSITORY\NVDMSI.INF_AMD64_1E4E5C0FBDB0A298\NVLDDMKM.SYS (31.0.15.4601, 56,00 MB (58 720 904 bytes), 02.11.2023 23:04)
	
Name	Intel(R) Iris(R) Xe Graphics
PNP Device ID	PCI\VEN_8086&DEV_A7A0&SUBSYS_0BEB1028&REV_04\3&11583659&0&10
Adapter Type	Intel(R) Iris(R) Xe Graphics Family, Intel Corporation compatible
Adapter Description	Intel(R) Iris(R) Xe Graphics
Adapter RAM	1,00 GB (1 073 741 824 bytes)
Installed Drivers	<>,C:\WINDOWS\System32\DriverStore\FileRepository\iigd_dch.inf_amd64_3d7ca808e461ff77\igd10iumd64.dll,C:\WINDOWS\System32\DriverStore\FileRepository\iigd_dch.inf_amd64_3d7ca808e461ff77\igd10iumd64.dll,C:\WINDOWS\System32\DriverStore\FileRepository\iigd_dch.inf_amd64_3d7ca808e461ff77\igd12umd64.dll
Driver Version	31.0.101.4575
INF File	oem101.inf (iRPLPD_w10_DS section)
Color Planes	Not Available
Color Table Entries	4294967296
Resolution	3456 x 2160 x 60 hertz
Bits/Pixel	32
Memory Address	0x88000000-0x88FFFFFF
Memory Address	0x0000-0xFFFFFFF
I/O Port	0x00004000-0x0000403F
IRQ Channel	IRQ 4294967255
Driver	C:\WINDOWS\SYSTEM32\DRIVERSTORE\FILEREPOSITORY\IIGD_DCH.INF_AMD64_3D7CA808E461FF77\IGDKMDN64.SYS (31.0.101.4575, 48,29 MB (50 632 608 bytes), 26.10.2023 14:02)
Comment 8 Helmut Steiner 2023-11-10 13:18:23 UTC
The issue even happened once with just the basic LibreOffice GUI open (no module/no document loaded). For now I am also trying a complete reinstall of LibreOffice.
Comment 9 QA Administrators 2023-11-11 03:14:24 UTC Comment hidden (obsolete)
Comment 10 V Stuart Foote 2023-11-12 13:24:45 UTC
With your laptop's mix of Intel iGPU and nVidia dGPU, if both GPUs are enabled, perhaps the issue is with OpenCL rather than Skia, similar to bug 154533.

Disable OpenCL in LibreOffice: go to Tools -> Options -> OpenCL, uncheck "Allow use of OpenCL", and see if the system becomes stable.
Comment 11 Helmut Steiner 2023-11-13 10:06:04 UTC
I changed it and will report if it helped!
Comment 12 Helmut Steiner 2023-11-15 08:38:09 UTC
It did help - at least there is no bluescreen anymore. Now just LibreOffice crashes from time to time but I can live with that.
Comment 13 V Stuart Foote 2023-11-15 12:27:35 UTC
This is an OpenCL issue with multiple GPUs present (unclear whether it lies in nVidia Optimus assigning OpenCL calls to the dGPU vs. the iGPU, or in something in LO's OpenCL calls).

Simplest for affected users is to just disable OpenCL.

Some steps for getting more info, using the Win11 SDK and attaching WinDbg with symbols, are in bug 154533, of which this looks to be a duplicate.

Setting NEW
Comment 14 V Stuart Foote 2023-11-15 12:45:52 UTC
Also, if you could enable Tools -> Options -> OpenCL "Allow use of OpenCL" again and restart LO (hopefully without a bluescreen or LO crash), would you post the text from LO's "opencl_devices.log", found in:

C:\Users\<username>\AppData\Roaming\LibreOffice\4\Cache

We can use those details to force-disable the OpenCL device/driver/OS combination via denylisting.
Comment 15 Helmut Steiner 2023-11-15 13:16:00 UTC
Created attachment 190839 [details]
opencl_devices.log

I reenabled OpenCL, restarted LibreOffice, and literally a few seconds later everything hung, followed by a bluescreen. So yeah, I guess that confirms the OpenCL issue.
Comment 16 V Stuart Foote 2023-11-15 14:15:57 UTC
(In reply to Helmut Steiner from comment #15)
> Created attachment 190839 [details]
> opencl_devices.log
> 
> I reenabled OpenCL restarted LibreOffice and literally a few seconds later
> everything hung + bluescreen. So yeah, I guess that confirms the OpenCL
> issue.

Thanks. One last ask: go ahead and enable "Use Skia for all rendering" in Tools -> Options -> View (leave the Force Skia software/raster option unchecked, or false if editing the profile directly).

Then post the skia.log for your system; that will show the GPU (either the iGPU or dGPU) actually getting picked up. The log will give the specific device IDs for either. Ideally we'd like details from the skia.log for both GPUs.

With OpenCL off, you are probably safe to use the Skia accelerated rendering (3D or raster).
Comment 17 Helmut Steiner 2023-11-15 15:28:59 UTC
skia.log:

RenderMethod: vulkan
Vendor: 0x8086
Device: 0xa7a0
API: 1.3.250
Driver: 0.405.479
DeviceType: integrated
DeviceName: Intel(R) Iris(R) Xe Graphics
Denylisted: no
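
A skia.log like the one above is plain "Key: value" text, so pulling out the vendor and device IDs is easy to script. The following Python sketch is illustrative only: the function names parse_skia_log and gpu_summary are made up, and the only format it assumes is the one visible in the two logs posted in this report.

```python
def parse_skia_log(text):
    """Turn 'Key: value' lines into a dict, splitting on the first colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def gpu_summary(info):
    """One-line summary; raster logs carry no GPU details."""
    if info.get("RenderMethod") != "vulkan":
        return f"{info.get('RenderMethod', 'unknown')} (no GPU details logged)"
    return (f"{info.get('DeviceName')} "
            f"[{info.get('Vendor')}:{info.get('Device')}] "
            f"driver {info.get('Driver')}")


VULKAN_LOG = """RenderMethod: vulkan
Vendor: 0x8086
Device: 0xa7a0
API: 1.3.250
Driver: 0.405.479
DeviceType: integrated
DeviceName: Intel(R) Iris(R) Xe Graphics
Denylisted: no"""

print(gpu_summary(parse_skia_log(VULKAN_LOG)))
print(gpu_summary(parse_skia_log("RenderMethod: raster\nCompiler: Clang")))
```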
Comment 18 Helmut Steiner 2023-11-17 10:38:17 UTC
Can I send you the LibreOffice crash reports somehow?
As I said, I don't get the bluescreen anymore, instead LibreOffice just crashes. I guess it will be helpful for debugging the error.
Comment 19 Helmut Steiner 2023-11-17 11:56:06 UTC
dump.ini in the LibreOffice folder doesn't really show helpful information. 
This is the record from Windows Event Viewer:

Log Name:      Application
Source:        Application Error
Date:          17.11.2023 11:36:29
Event ID:      1000
Task Category: Application Crashing Events
Level:         Error
Keywords:      
User:          HSPLT03\Helmu
Computer:      HSPLT03
Description:
Faulting application name: soffice.bin, version: 7.6.2.1, time stamp: 0x65114a74
Faulting module name: ucrtbase.dll, version: 10.0.22621.2506, time stamp: 0x097c794c
Exception code: 0xc0000409
Fault offset: 0x000000000007f61e
Faulting process id: 0x0x5220
Faulting application start time: 0x0x1DA194079F6D6C2
Faulting application path: C:\Program Files\LibreOffice\program\soffice.bin
Faulting module path: C:\WINDOWS\System32\ucrtbase.dll
Report Id: 1be1bb8c-f261-47fa-b9ab-db9e74f64a5d
Faulting package full name: 
Faulting package-relative application ID: 
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Application Error" Guid="{a0e9b465-b939-57d7-b27d-95d8e925ff57}" />
    <EventID>1000</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>100</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2023-11-17T10:36:29.2273903Z" />
    <EventRecordID>21737</EventRecordID>
    <Correlation />
    <Execution ProcessID="29640" ThreadID="1488" />
    <Channel>Application</Channel>
    <Computer>HSPLT03</Computer>
    <Security UserID="S-1-5-21-72274046-858534750-2259427406-1001" />
  </System>
  <EventData>
    <Data Name="AppName">soffice.bin</Data>
    <Data Name="AppVersion">7.6.2.1</Data>
    <Data Name="AppTimeStamp">65114a74</Data>
    <Data Name="ModuleName">ucrtbase.dll</Data>
    <Data Name="ModuleVersion">10.0.22621.2506</Data>
    <Data Name="ModuleTimeStamp">097c794c</Data>
    <Data Name="ExceptionCode">c0000409</Data>
    <Data Name="FaultingOffset">000000000007f61e</Data>
    <Data Name="ProcessId">0x5220</Data>
    <Data Name="ProcessCreationTime">0x1da194079f6d6c2</Data>
    <Data Name="AppPath">C:\Program Files\LibreOffice\program\soffice.bin</Data>
    <Data Name="ModulePath">C:\WINDOWS\System32\ucrtbase.dll</Data>
    <Data Name="IntegratorReportId">1be1bb8c-f261-47fa-b9ab-db9e74f64a5d</Data>
    <Data Name="PackageFullName">
    </Data>
    <Data Name="PackageRelativeAppId">
    </Data>
  </EventData>
</Event>
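
For reference, exception code 0xc0000409 in the record above is the NTSTATUS value STATUS_STACK_BUFFER_OVERRUN, which Windows also reports for fail-fast process termination, so on its own it does not point at a particular caller. Below is a small Python sketch that pulls the fields out of the "Description" text and names the code. The helper names and the two-entry lookup table are illustrative, not a complete NTSTATUS list.

```python
# Deliberately tiny lookup table; extend as needed.
KNOWN_CODES = {
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN (also used for fail-fast termination)",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def parse_event_description(text):
    """Split 'Field: value' lines of an Application Error record on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_exception(code_text):
    """Accepts '0xc0000409' or 'c0000409' and returns a readable name."""
    code = int(code_text, 16)
    return KNOWN_CODES.get(code, f"unknown code 0x{code:08x}")


RECORD = """Faulting application name: soffice.bin, version: 7.6.2.1, time stamp: 0x65114a74
Faulting module name: ucrtbase.dll, version: 10.0.22621.2506, time stamp: 0x097c794c
Exception code: 0xc0000409
Fault offset: 0x000000000007f61e"""

fields = parse_event_description(RECORD)
module = fields["Faulting module name"].split(",")[0]
print(module, "->", describe_exception(fields["Exception code"]))
```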
Comment 20 V Stuart Foote 2023-11-17 12:45:20 UTC
Unfortunately, as you note, that dump only shows the fault at System32\ucrtbase.dll for the soffice.bin process.

To tease out more of the issue (still probably an errant nVidia or Intel call with the iGPU in use) you would need to install the Debugging Tools for Windows debugger from the Win11 SDK / WDK [1] or the newer standalone packaging [2].

That, coupled with a pull of LibreOffice build symbols [3], allows you to attach to the soffice.bin process while it is running and capture the stack trace when it crashes. Having the symbols available locally and linked (they take some time to download to your system) will isolate the faulting calls. They could be in the LibreOffice source, the nVidia driver source, or the Intel driver source, but we'd be able to better see what is happening.

=-ref-=
[1] https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/debugger-download-tools
[2] https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/ 
[3] set the symbol path to "CACHE*C:\symbols64;SRV*https://dev-downloads.libreoffice.org/symstore/symbols;SRV*https://msdl.microsoft.com/download/symbols"
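
The symbol path in [3] is a semicolon-separated list: one local cache entry followed by one SRV entry per symbol server. A short Python sketch of how the string is put together (the function name build_symbol_path is made up for illustration); the result can be pasted into WinDbg's symbol path setting or stored in the _NT_SYMBOL_PATH environment variable.

```python
def build_symbol_path(cache_dir, servers):
    """Join a local cache directory and symbol server URLs into one path string."""
    parts = [f"CACHE*{cache_dir}"] + [f"SRV*{url}" for url in servers]
    return ";".join(parts)


symbol_path = build_symbol_path(
    r"C:\symbols64",
    [
        "https://dev-downloads.libreoffice.org/symstore/symbols",
        "https://msdl.microsoft.com/download/symbols",
    ],
)
print(symbol_path)
```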
Comment 21 V Stuart Foote 2023-11-17 13:11:29 UTC
With WinDbg opened and attached to the running soffice.bin, issue the console command ".reload /f" to load the symbols (it will take a while to download).

Then continue running soffice.bin by issuing the "g" command. On a crash/lockup, issue a "~* kp" command in the WinDbg console (any missing symbols will be added) and the thread stacks will dump to the console.

The first thread segment of the listing will hold the faulting issue. Look it over and see if anything jumps out. You can also run "!analyze -v" from the WinDbg console for automated "feedback" on the event.

If it is not obvious what the fault is, save the WinDbg console content for both commands to a text file and attach it. The stack trace for the threads and the analyze output will include the symbol details (identifying the line of the calling source where symbols are available; the Intel and nVidia symbols for their drivers won't be), and we'd know the issue more precisely.
Comment 22 Helmut Steiner 2023-11-20 13:57:34 UTC
I set up WinDbg and was working the whole day with debugging activated but so far no crash. I will post again if something shows up. 

One thing that is visible is that I get hundreds of C++ errors (always the same one):
(15f0.570): C++ EH exception - code e06d7363 (first chance)
Comment 23 V Stuart Foote 2023-11-20 15:57:33 UTC
(In reply to Helmut Steiner from comment #22)
> I set up WinDbg and was working the whole day with debugging activated but
> so far no crash. I will post again if something shows up. 
> 
> One thing that is visible is that I get hundreds of C++ errors (always the
> same one):
> (15f0.570): C++ EH exception - code e06d7363 (first chance)

Good!

Since it is a "first chance" exception and doesn't go on to crash as a "second chance", the error is likely handled, under the debugger, inside the module throwing it (could be LO, could be the OS, could be a driver). Annoying, but you'd have to really dig in with WinDbg to tease out recurring first-chance issues. Far more than we're asking...

Just keep a lookout for the sporadic crash. You could even reenable OpenCL to see whether it is an nVidia call or an Intel call for OpenCL hardware that is faulting, and post those details.
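
As background on the code e06d7363 quoted above: it is the value the Microsoft Visual C++ runtime uses when it throws a C++ exception, and its three low-order bytes spell "msc" in ASCII. Seeing it as a first-chance event only says that some module threw a C++ exception. A tiny Python sketch of the decoding (the function names are illustrative):

```python
MSVC_CPP_EXCEPTION = 0xE06D7363


def exception_tag(code):
    """Return the three low-order bytes of an exception code as ASCII text."""
    return (code & 0x00FFFFFF).to_bytes(3, "big").decode("ascii")


def is_msvc_cpp_exception(code):
    """True for the code WinDbg labels 'C++ EH exception'."""
    return code == MSVC_CPP_EXCEPTION


code = int("e06d7363", 16)
print(exception_tag(code), is_msvc_cpp_exception(code))
```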
Comment 24 Helmut Steiner 2023-11-28 12:46:24 UTC
The issue might have been fixed by a driver update. I haven't had any crash in the last few days, even without the debugger activated. I will reactivate the OpenCL option just to see if the bluescreen is gone, too.