In the Linux kernel, the following vulnerability has been resolved:
KVM: Use dedicated mutex to protect kvm_usage_count to avoid deadlock
Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock on x86 due to a chain of locks and SRCU synchronizations. Translating the below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the fairness of r/w semaphores).
            CPU0                     CPU1                     CPU2
    1   lock(&kvm->slots_lock);
    2                                                    lock(&vcpu->mutex);
    3                                                    lock(&kvm->srcu);
    4                            lock(cpu_hotplug_lock);
    5                            lock(kvm_lock);
    6                            lock(&kvm->slots_lock);
    7                                                    lock(cpu_hotplug_lock);
    8   sync(&kvm->srcu);
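For context, a minimal sketch of the shape of the fix follows. It only illustrates the idea of guarding kvm_usage_count with its own mutex so that this path no longer nests kvm_lock inside cpu_hotplug_lock; the mutex name, the helper, and the structure are assumptions for illustration, not the verbatim patch.

    /*
     * Illustrative sketch only: kvm_usage_count gets a dedicated mutex, so
     * the virtualization enable path no longer takes kvm_lock while holding
     * cpu_hotplug_lock.  The names kvm_usage_lock and
     * enable_virtualization_on_all_cpus() are assumed for the example.
     */
    static DEFINE_MUTEX(kvm_usage_lock);    /* assumed name */
    static int kvm_usage_count;

    static int hardware_enable_all(void)
    {
            int r = 0;

            /* cpu_hotplug_lock is still needed to keep the online CPU set stable. */
            cpus_read_lock();
            mutex_lock(&kvm_usage_lock);    /* previously: mutex_lock(&kvm_lock) */

            if (++kvm_usage_count == 1) {
                    r = enable_virtualization_on_all_cpus();    /* hypothetical helper */
                    if (r)
                            --kvm_usage_count;
            }

            mutex_unlock(&kvm_usage_lock);
            cpus_read_unlock();

            return r;
    }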
Note, there are likely more potential deadlocks in KVM x86, e.g. the same pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with __kvmclock_cpufreq_notifier():

      cpuhp_cpufreq_online()
      |
      -> cpufreq_online()
         |
         -> cpufreq_gov_performance_limits()
            |
            -> __cpufreq_driver_target()
               |
               -> __target_index()
                  |
                  -> cpufreq_freq_transition_begin()
                     |
                     -> cpufreq_notify_transition()
                        |
                        -> ... __kvmclock_cpufreq_notifier()
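To make the inversion concrete, here is a simplified sketch of a notifier with the same locking pattern. The function name and body are hypothetical stand-ins for the kvmclock notifier path, reduced to just the ordering that matters; this is not the actual x86 implementation.

    /*
     * Simplified sketch: the notifier runs while the CPU-online path already
     * holds cpu_hotplug_lock, and then acquires kvm_lock to walk vm_list --
     * the reverse order of paths that take kvm_lock (or kvm->slots_lock)
     * first and only later reach cpus_read_lock().
     */
    static int example_cpufreq_notifier(struct notifier_block *nb,
                                        unsigned long val, void *data)
    {
            struct kvm *kvm;

            mutex_lock(&kvm_lock);    /* cpu_hotplug_lock -> kvm_lock ordering */
            list_for_each_entry(kvm, &vm_list, vm_list) {
                    /* update per-VM clock/TSC state for the new frequency */
            }
            mutex_unlock(&kvm_lock);

            return 0;
    }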
But, actually triggering such deadlocks is beyond rare due to the combination of dependencies and timings involved. E.g. the cpufreq notifier is only used on older CPUs without a constant TSC, mucking with the NX hugepage mitigation while VMs are running is very uncommon, and doing so while also onlining/offlining a CPU (necessary to generate contention on cpu_hotplug_lock) would be even more unusual.
The most robust solution to the general cpu_hotplug_lock issue is likely to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq notifier doesn't need to take kvm_lock. For now, settle for fixing the most blatant deadlock, as switching to an RCU-protected list is a much more involved change, but add a comment in locking.rst to call out that care needs to be taken when holding kvm_lock and walking vm_list.
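For reference, one hedged sketch of that future direction is below, assuming writers keep kvm_lock for mutual exclusion among themselves while readers switch to a lockless RCU walk. None of this is part of the actual fix; all names and details are assumptions.

    /* Writers still serialize against each other with kvm_lock. */
    static void example_add_vm(struct kvm *kvm)
    {
            mutex_lock(&kvm_lock);
            list_add_rcu(&kvm->vm_list, &vm_list);
            mutex_unlock(&kvm_lock);
    }

    static void example_del_vm(struct kvm *kvm)
    {
            mutex_lock(&kvm_lock);
            list_del_rcu(&kvm->vm_list);
            mutex_unlock(&kvm_lock);
            synchronize_rcu();    /* wait for lockless walkers to finish */
    }

    /* Lockless reader, e.g. a notifier that must not acquire kvm_lock. */
    static void example_walk_vms(void (*fn)(struct kvm *kvm))
    {
            struct kvm *kvm;

            rcu_read_lock();
            list_for_each_entry_rcu(kvm, &vm_list, vm_list)
                    fn(kvm);    /* must not sleep under rcu_read_lock() */
            rcu_read_unlock();
    }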
======================================================
WARNING: possible circular locking dependency detected
6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S O
------------------------------------------------------
tee/35048 is trying to acquire lock:
ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm]

but task is already holding lock:
ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (kvm_lock){+.+.}-{3:3}:
       __mutex_lock+0x6a/0xb40
       mutex_lock_nested+0x1f/0x30
       kvm_dev_ioctl+0x4fb/0xe50 [kvm]
       __se_sys_ioctl+0x7b/0xd0
       __x64_sys_ioctl+0x21/0x30
       x64_sys_call+0x15d0/0x2e60
       do_syscall_64+0x83/0x160
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #2 (cpu_hotplug_lock){++++}-{0:0}:
       cpus_read_lock+0x2e/0xb0
       static_key_slow_inc+0x16/0x30
       kvm_lapic_set_base+0x6a/0x1c0 [kvm]
       kvm_set_apic_base+0x8f/0xe0 [kvm]
       kvm_set_msr_common+0x9ae/0xf80 [kvm]
       vmx_set_msr+0xa54/0xbe0 [kvm_intel]
       __kvm_set_msr+0xb6/0x1a0 [kvm]
       kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm]
       kvm_vcpu_ioctl+0x485/0x5b0 [kvm]
       __se_sys_ioctl+0x7b/0xd0
       __x64_sys_ioctl+0x21/0x30
       x64_sys_call+0x15d0/0x2e60
       do_syscall_64+0x83/0x160
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #1 (&kvm->srcu){.+.+}-{0:0}:
       __synchronize_srcu+0x44/0x1a0
---truncated---