CVE-2023-52738

Source
https://nvd.nist.gov/vuln/detail/CVE-2023-52738
Import Source
https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2023-52738.json
JSON Data
https://api.osv.dev/v1/vulns/CVE-2023-52738
Published
2024-05-21T16:15:13Z
Modified
2024-09-18T01:00:20Z
Summary
[none]
Details

In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine. That function is expected to be called only after the respective init function, drm_sched_init(), has executed successfully.

We recently hit a driver probe failure on the Steam Deck, and drm_sched_fini() was called even though its counterpart had never been called, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check whether the drm_sched was properly initialized for a given ring before calling its fini counterpart.

Ideally we'd use sched.ready for that, since that field is the last thing set in drm_sched_init(). But amdgpu seems to "override" the meaning of that field: in the above oops, for example, a GFX ring caused the crash, and its sched.ready field had been set to true in the ring init routine regardless of the state of the DRM scheduler. Hence we ended up using sched.ops, as per Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/
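
The guard described above can be illustrated with a small user-space model. This is only a sketch, not the kernel patch itself: every type and function here (mock_sched, mock_ring, mock_sched_init(), mock_sched_fini(), mock_ring_sw_fini()) is a hypothetical stand-in for the real DRM structures, and the assert() stands in for the NULL pointer dereference seen in the oops. It shows why testing sched.ops is a safer signal than sched.ready when the driver may set ready on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real DRM scheduler types. */
struct mock_sched_ops { int unused; };

struct mock_sched {
    const struct mock_sched_ops *ops; /* populated only by a successful init */
    bool ready;                       /* a driver may set this independently */
    int fini_calls;                   /* bookkeeping for this model only */
};

struct mock_ring {
    struct mock_sched sched;
};

const struct mock_sched_ops mock_ops = { 0 };

/* Models a successful scheduler init: ops is populated, ready is set last. */
void mock_sched_init(struct mock_sched *s)
{
    s->ops = &mock_ops;
    s->ready = true;
}

/* Models teardown. Calling it without a prior init is a bug; the assert
 * plays the role of the crash in this model. */
void mock_sched_fini(struct mock_sched *s)
{
    assert(s->ops != NULL);
    s->fini_calls++;
    s->ops = NULL;
    s->ready = false;
}

/* Guarded teardown: run fini only if init populated ops.
 * Returns true when fini actually ran. */
bool mock_ring_sw_fini(struct mock_ring *ring)
{
    if (!ring->sched.ops)
        return false;
    mock_sched_fini(&ring->sched);
    return true;
}
```

In this model, a ring whose probe failed before scheduler init (ops still NULL, even with ready set to true by the driver) is skipped by mock_ring_sw_fini(), while a fully initialized ring is torn down exactly once.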

References

Affected packages

Debian:12 / linux

Package

Name
linux
Purl
pkg:deb/debian/linux?arch=source

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (unknown introduced version; all previous versions are affected)
Fixed
6.1.12-1

Ecosystem specific

{
    "urgency": "not yet assigned"
}

Debian:13 / linux

Package

Name
linux
Purl
pkg:deb/debian/linux?arch=source

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (unknown introduced version; all previous versions are affected)
Fixed
6.1.12-1

Ecosystem specific

{
    "urgency": "not yet assigned"
}