In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine. That function is expected to be called only after the respective init function, drm_sched_init(), has executed successfully.
We recently hit a driver probe failure on the Steam Deck, and drm_sched_fini() was called even though its counterpart had never been called, causing the following oops:
amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]
To prevent that, check whether the drm_sched was properly initialized for a given ring before calling its fini counterpart.
Ideally we'd use sched.ready for that, since this field is set as the last step of drm_sched_init(). But amdgpu seems to "override" the meaning of this field: in the above oops, for example, a GFX ring caused the crash, and sched.ready had been set to true in the ring init routine regardless of the state of the DRM scheduler. Hence we ended up using sched.ops, as per Christian's suggestion [0], and also removed the no_scheduler check [1].
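The guard described above can be illustrated with a small userspace sketch. This is not the kernel code: the mock_* types and functions are hypothetical stand-ins invented for illustration, and the only idea taken from the description is that teardown keys off the ops pointer (set only by a successful init) rather than the ready flag (which the driver may set on its own).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's structures. */
struct mock_sched_ops { int unused; };

struct mock_sched {
	const struct mock_sched_ops *ops; /* set only by mock_sched_init() */
	bool ready;                       /* may be overridden by the driver */
	int *queues;                      /* resource owned after a successful init */
};

struct mock_ring {
	struct mock_sched sched;
};

static const struct mock_sched_ops mock_ops = { 0 };

/* Returns 0 on success; on failure the scheduler is left untouched. */
static int mock_sched_init(struct mock_sched *sched)
{
	sched->queues = calloc(4, sizeof(*sched->queues));
	if (!sched->queues)
		return -1;
	sched->ops = &mock_ops;
	sched->ready = true;
	return 0;
}

/* Only valid after a successful mock_sched_init(). */
static void mock_sched_fini(struct mock_sched *sched)
{
	/* Stands in for the NULL dereference seen in the oops. */
	assert(sched->queues != NULL);
	free(sched->queues);
	sched->queues = NULL;
	sched->ops = NULL;
	sched->ready = false;
}

/*
 * Tear down every ring, skipping those whose scheduler was never
 * initialized. The check uses ops, not ready, because ready may have
 * been set without the scheduler being initialized.
 * Returns the number of schedulers actually finalized.
 */
static int mock_rings_fini(struct mock_ring *rings, size_t count)
{
	int finalized = 0;

	for (size_t i = 0; i < count; i++) {
		if (!rings[i].sched.ops)
			continue;
		mock_sched_fini(&rings[i].sched);
		finalized++;
	}
	return finalized;
}
```

In this sketch, a ring whose ready flag was set by hand but whose scheduler was never initialized is skipped by mock_rings_fini(), whereas a check on ready alone would have called mock_sched_fini() on it.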
[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/
[
{
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@5ad7bbf3dba5c4a684338df1f285080f2588b535",
"target": {
"file": "drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c"
},
"deprecated": false,
"signature_version": "v1",
"id": "CVE-2023-52738-1b97ef0b",
"signature_type": "Line",
"digest": {
"threshold": 0.9,
"line_hashes": [
"304629186821769328840082270164705790904",
"169977614683984664671704687142352523556",
"147070659661515898921646523233485734237",
"64092004118993646570805203185082007780"
]
}
},
{
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2e557c8ca2c585bdef591b8503ba83b85f5d0afd",
"target": {
"function": "amdgpu_fence_driver_sw_fini",
"file": "drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c"
},
"deprecated": false,
"signature_version": "v1",
"id": "CVE-2023-52738-1fa6f234",
"signature_type": "Function",
"digest": {
"length": 510.0,
"function_hash": "23834765038686484915450638732949098222"
}
},
{
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2bcbbef9cace772f5b7128b11401c515982de34b",
"target": {
"file": "drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c"
},
"deprecated": false,
"signature_version": "v1",
"id": "CVE-2023-52738-3dba2e89",
"signature_type": "Line",
"digest": {
"threshold": 0.9,
"line_hashes": [
"304629186821769328840082270164705790904",
"169977614683984664671704687142352523556",
"147070659661515898921646523233485734237",
"64092004118993646570805203185082007780"
]
}
},
{
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2e557c8ca2c585bdef591b8503ba83b85f5d0afd",
"target": {
"file": "drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c"
},
"deprecated": false,
"signature_version": "v1",
"id": "CVE-2023-52738-5ccb84eb",
"signature_type": "Line",
"digest": {
"threshold": 0.9,
"line_hashes": [
"304629186821769328840082270164705790904",
"169977614683984664671704687142352523556",
"147070659661515898921646523233485734237",
"64092004118993646570805203185082007780"
]
}
},
{
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@5ad7bbf3dba5c4a684338df1f285080f2588b535",
"target": {
"function": "amdgpu_fence_driver_sw_fini",
"file": "drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c"
},
"deprecated": false,
"signature_version": "v1",
"id": "CVE-2023-52738-6dd092db",
"signature_type": "Function",
"digest": {
"length": 510.0,
"function_hash": "23834765038686484915450638732949098222"
}
},
{
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2bcbbef9cace772f5b7128b11401c515982de34b",
"target": {
"function": "amdgpu_fence_driver_sw_fini",
"file": "drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c"
},
"deprecated": false,
"signature_version": "v1",
"id": "CVE-2023-52738-edefcfb3",
"signature_type": "Function",
"digest": {
"length": 510.0,
"function_hash": "23834765038686484915450638732949098222"
}
}
]