In the Linux kernel, the following vulnerability has been resolved:
drm/i915/gt: Fix potential UAF by revoke of fence registers
CI has been sporadically reporting the following issue triggered by igt@i915_selftest@live@hangcheck on ADL-P and similar machines:
<6> [414.049203] i915: Running intelhangcheckliveselftests/igtresetevictfence ... <6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled <6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled <3> [414.070354] Unable to pin Y-tiled fence; err:-4 <3> [414.071282] i915vmarevokefence:301 GEMBUGON(!i915activeisidle(&fence->active)) ... <4>[ 609.603992] ------------[ cut here ]------------ <2>[ 609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intelggttfencing.c:301! <4>[ 609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U W 6.9.0-CIDRM14785-g1ba62f8cea9c+ #1 <4>[ 609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023 <4>[ 609.604010] Workqueue: i915 _i915gemfreework [i915] <4>[ 609.604149] RIP: 0010:i915vmarevokefence+0x187/0x1f0 [i915] ... <4>[ 609.604271] Call Trace: <4>[ 609.604273] <TASK> ... <4>[ 609.604716] _i915vmaevict+0x2e9/0x550 [i915] <4>[ 609.604852] _i915vmaunbind+0x7c/0x160 [i915] <4>[ 609.604977] forceunbind+0x24/0xa0 [i915] <4>[ 609.605098] i915vmadestroy+0x2f/0xa0 [i915] <4>[ 609.605210] _i915gemobjectpagesfini+0x51/0x2f0 [i915] <4>[ 609.605330] _i915gemfreeobjects.isra.0+0x6a/0xc0 [i915] <4>[ 609.605440] processscheduled_works+0x351/0x690 ...
In the past, there were similar failures reported by CI from other IGT tests, observed on other platforms.
Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence"), i915vmarevokefence() was waiting for idleness of vma->active via fenceupdate(). That commit introduced vma->fence->active in order for the fenceupdate() to be able to wait selectively on that one instead of vma->active since only idleness of fence registers was needed. But then, another commit 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") replaced the call to fenceupdate() in i915vmarevokefence() with only fencewrite(), and also added that GEMBUGON(!i915activeis_idle(&fence->active)) in front. No justification was provided on why we might then expect idleness of vma->fence->active without first waiting on it.
The issue can be potentially caused by a race among revocation of fence registers on one side and sequential execution of signal callbacks invoked on completion of a request that was using them on the other, still processed in parallel to revocation of those fence registers. Fix it by waiting for idleness of vma->fence->active in i915vmarevoke_fence().
(cherry picked from commit 24bb052d3dd499c5956abad5f7d8e4fd07da7fb1)