In the Linux kernel, the following vulnerability has been resolved:
drm/xe: Process deferred GGTT node removals on device unwind
While we are indirectly draining our dedicated workqueue ggtt->wq, which we use to complete asynchronous removal of some GGTT nodes, this happens as part of the managed-drm unwinding (ggtt_fini_early), which could run later than the managed-device unwinding, where we could already have unmapped our MMIO/GSM mapping (mmio_fini).
This was recently observed during unsuccessful VF initialization:
[ ] xe 0000:00:02.1: probe with driver xe failed with error -62
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747340 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747540 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747240 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747040 tiles_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e746840 mmio_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747f40 xe_bo_pinned_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e746b40 devm_drm_dev_init_release (16 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] drmres release begin
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef81640 __fini_relay (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80d40 guc_ct_fini (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80040 __drmm_mutex_release (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80140 ggtt_fini_early (8 bytes)
and this was leading to:
[ ] BUG: unable to handle page fault for address: ffffc900058162a0
[ ] #PF: supervisor write access in kernel mode
[ ] #PF: error_code(0x0002) - not-present page
[ ] Oops: Oops: 0002 [#1] SMP NOPTI
[ ] Tainted: [W]=WARN
[ ] Workqueue: xe-ggtt-wq ggtt_node_remove_work_func [xe]
[ ] RIP: 0010:xe_ggtt_set_pte+0x6d/0x350 [xe]
[ ] Call Trace:
[ ] <TASK>
[ ] xe_ggtt_clear+0xb0/0x270 [xe]
[ ] ggtt_node_remove+0xbb/0x120 [xe]
[ ] ggtt_node_remove_work_func+0x30/0x50 [xe]
[ ] process_one_work+0x22b/0x6f0
[ ] worker_thread+0x1e8/0x3d
Add a managed-device action that explicitly drains the workqueue, completing all pending node removals, prior to releasing the MMIO/GSM mapping.
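
The ordering problem can be illustrated with a small userspace model. This is only a sketch, not the driver code: it assumes that release actions run in reverse order of registration and that the managed-device list is unwound before the managed-drm list, as in the log above. All names in it (struct model, add_action, release_all, drain_removals, model_mmio_fini, simulate_unwind) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*action_fn)(void *ctx);

struct action {
	action_fn fn;
	void *ctx;
};

/* Hypothetical stand-in for a list of managed release actions. */
struct unwind_list {
	struct action items[MAX_ACTIONS];
	size_t count;
};

/* Hypothetical model state: one mapping and a queue of deferred removals. */
struct model {
	bool mmio_mapped;      /* true while the MMIO/GSM mapping exists */
	int pending_removals;  /* deferred node removals still queued */
	int completed;         /* removals that ran while mapped */
	int faults;            /* removals that ran after the unmap */
};

static bool add_action(struct unwind_list *l, action_fn fn, void *ctx)
{
	if (l->count >= MAX_ACTIONS)
		return false;
	l->items[l->count].fn = fn;
	l->items[l->count].ctx = ctx;
	l->count++;
	return true;
}

/* Run actions last-registered-first (assumed release order). */
static void release_all(struct unwind_list *l)
{
	while (l->count > 0) {
		l->count--;
		l->items[l->count].fn(l->items[l->count].ctx);
	}
}

/* Models draining the workqueue: every pending removal touches the mapping. */
static void drain_removals(void *ctx)
{
	struct model *m = ctx;

	while (m->pending_removals > 0) {
		m->pending_removals--;
		if (m->mmio_mapped)
			m->completed++;
		else
			m->faults++;
	}
}

/* Models the release of the MMIO/GSM mapping. */
static void model_mmio_fini(void *ctx)
{
	struct model *m = ctx;

	m->mmio_mapped = false;
}

/* Returns how many removals would have touched an unmapped region. */
int simulate_unwind(bool with_device_drain, int pending)
{
	struct model m = { .mmio_mapped = true, .pending_removals = pending };
	struct unwind_list dev = { .count = 0 };
	struct unwind_list drm = { .count = 0 };

	add_action(&dev, model_mmio_fini, &m);
	if (with_device_drain)
		add_action(&dev, drain_removals, &m); /* registered later, runs earlier */
	add_action(&drm, drain_removals, &m);     /* the late managed-drm drain */

	release_all(&dev); /* managed-device unwinding first */
	release_all(&drm); /* managed-drm unwinding afterwards */
	return m.faults;
}
```

In this model, simulate_unwind(false, 3) reports three removals running after the unmap, while simulate_unwind(true, 3) reports none, because the extra drain action is registered after the mapping and therefore runs before it is released.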
(cherry picked from commit 89d2835c3680ab1938e22ad81b1c9f8c686bd391)