In the Linux kernel, the following vulnerability has been resolved:
driver core: fix potential deadlock in __driver_attach
The __driver_attach function has an A-A deadlock problem similar to the one fixed by commit b232b02bf3c2 ("driver core: fix deadlock in __device_attach").
The call stack resembles the one in commit b232b02bf3c2 ("driver core: fix deadlock in __device_attach"). In __driver_attach, the lock-holding logic is as follows:

    ...
    __driver_attach
    if (driver_allows_async_probing(drv))
      device_lock(dev)      // get lock dev
        async_schedule_dev(__driver_attach_async_helper, dev); // func
          async_schedule_node
            async_schedule_node_domain(func)
              entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
              /* On allocation failure or when the work limit is hit, func
                 is executed synchronously, but __driver_attach_async_helper
                 takes lock dev as well, which leads to an A-A deadlock. */
              if (!entry || atomic_read(&entry_count) > MAX_WORK)
                func;
              else
                queue_work_node(node, system_unbound_wq, &entry->work)
      device_unlock(dev)
As shown above, when asynchronous probing is allowed but the async work
cannot be queued, because memory allocation fails or the work limit is
exceeded, the function is executed synchronously instead. This leads to an
A-A deadlock, because __driver_attach_async_helper tries to take the device
lock that __driver_attach already holds.
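
The failure mode described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every toy_* name is hypothetical, the lock is a plain flag that reports a self-relock with -1 where a real mutex would block forever, and the can_queue parameter is an assumed stand-in for the allocation and work-limit check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: reports a self-relock instead of blocking forever. */
struct toy_lock {
    bool held;
};

static int toy_lock_acquire(struct toy_lock *lock)
{
    if (lock->held)
        return -1;      /* a real mutex would hang here: A-A deadlock */
    lock->held = true;
    return 0;
}

static void toy_lock_release(struct toy_lock *lock)
{
    assert(lock->held); /* releasing an unheld lock is a bug in the model */
    lock->held = false;
}

typedef int (*toy_func_t)(struct toy_lock *dev_lock);

/* Plays the role of __driver_attach_async_helper: takes the device lock itself. */
static int toy_attach_helper(struct toy_lock *dev_lock)
{
    if (toy_lock_acquire(dev_lock) != 0)
        return -1;
    /* the probe would run here */
    toy_lock_release(dev_lock);
    return 0;
}

/*
 * Plays the role of the async scheduling path. can_queue == false models
 * the "allocation failed or work limit exceeded" branch, in which func is
 * called synchronously in the caller's context.
 */
static int toy_async_schedule(toy_func_t func, struct toy_lock *dev_lock,
                              bool can_queue)
{
    if (!can_queue)
        return func(dev_lock);
    return 0;           /* queued for later; the worker is not modelled */
}

/* Pre-fix ordering: schedule while still holding the device lock. */
static int toy_attach_locked(struct toy_lock *dev_lock, bool can_queue)
{
    int ret;

    if (toy_lock_acquire(dev_lock) != 0)
        return -1;
    ret = toy_async_schedule(toy_attach_helper, dev_lock, can_queue);
    toy_lock_release(dev_lock);
    return ret;
}
```

In this model, toy_attach_locked(&lock, true) returns 0 because the work is merely queued, while toy_attach_locked(&lock, false) returns -1 because the helper runs synchronously and finds the lock already held by its own caller.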
Reproduce: the deadlock can be reproduced by forcing the condition (if (!entry || atomic_read(&entry_count) > MAX_WORK)) to be true, which produces a hung task like the one below:
[ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00004000
[ 370.788865] Call Trace:
[ 370.789374]  <TASK>
[ 370.789841]  __schedule+0x482/0x1050
[ 370.790613]  schedule+0x92/0x1a0
[ 370.791290]  schedule_preempt_disabled+0x2c/0x50
[ 370.792256]  __mutex_lock.isra.0+0x757/0xec0
[ 370.793158]  __mutex_lock_slowpath+0x1f/0x30
[ 370.794079]  mutex_lock+0x50/0x60
[ 370.794795]  __device_driver_lock+0x2f/0x70
[ 370.795677]  ? driver_probe_device+0xd0/0xd0
[ 370.796576]  __driver_attach_async_helper+0x1d/0xd0
[ 370.797318]  ? driver_probe_device+0xd0/0xd0
[ 370.797957]  async_schedule_node_domain+0xa5/0xc0
[ 370.798652]  async_schedule_node+0x19/0x30
[ 370.799243]  __driver_attach+0x246/0x290
[ 370.799828]  ? driver_allows_async_probing+0xa0/0xa0
[ 370.800548]  bus_for_each_dev+0x9d/0x130
[ 370.801132]  driver_attach+0x22/0x30
[ 370.801666]  bus_add_driver+0x290/0x340
[ 370.802246]  driver_register+0x88/0x140
[ 370.802817]  ? virtio_scsi_init+0x116/0x116
[ 370.803425]  scsi_register_driver+0x1a/0x30
[ 370.804057]  init_sd+0x184/0x226
[ 370.804533]  do_one_initcall+0x71/0x3a0
[ 370.805107]  kernel_init_freeable+0x39a/0x43a
[ 370.805759]  ? rest_init+0x150/0x150
[ 370.806283]  kernel_init+0x26/0x230
[ 370.806799]  ret_from_fork+0x1f/0x30
To fix the deadlock, move the async_schedule_dev call outside of device_lock. In async_schedule_node_domain, the workqueue passed to queue_work_node is system_unbound_wq, so it can accept concurrent operations. The change therefore does not alter the code logic and no longer leads to a deadlock.
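
The reordering can be sketched in the same kind of userspace model. Again, this is an illustration under stated assumptions rather than the kernel patch itself: the toy_* names are hypothetical, the lock is a flag that returns -1 on a self-relock where a real mutex would block, and the async_pending field and the local async flag are bookkeeping assumed by this sketch so that the decision is made under the lock while the scheduling call happens after the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: reports a self-relock instead of blocking forever. */
struct toy_lock {
    bool held;
};

/* Hypothetical device: its lock plus a flag marking a pending async attach. */
struct toy_dev {
    struct toy_lock lock;
    bool async_pending;
};

static int toy_lock_acquire(struct toy_lock *lock)
{
    if (lock->held)
        return -1;      /* a real mutex would hang here: A-A deadlock */
    lock->held = true;
    return 0;
}

static void toy_lock_release(struct toy_lock *lock)
{
    assert(lock->held); /* releasing an unheld lock is a bug in the model */
    lock->held = false;
}

typedef int (*toy_func_t)(struct toy_dev *dev);

/* Plays the role of __driver_attach_async_helper: takes the device lock itself. */
static int toy_attach_helper(struct toy_dev *dev)
{
    if (toy_lock_acquire(&dev->lock) != 0)
        return -1;
    dev->async_pending = false; /* the probe would run here */
    toy_lock_release(&dev->lock);
    return 0;
}

/* can_queue == false models the synchronous fallback of the async path. */
static int toy_async_schedule(toy_func_t func, struct toy_dev *dev,
                              bool can_queue)
{
    if (!can_queue)
        return func(dev);
    return 0;           /* queued for later; the worker is not modelled */
}

/* Post-fix ordering: decide under the lock, schedule after releasing it. */
static int toy_attach_fixed(struct toy_dev *dev, bool can_queue)
{
    bool async = false;

    if (toy_lock_acquire(&dev->lock) != 0)
        return -1;
    if (!dev->async_pending) {
        dev->async_pending = true;
        async = true;
    }
    toy_lock_release(&dev->lock);

    if (async)
        return toy_async_schedule(toy_attach_helper, dev, can_queue);
    return 0;
}
```

With this ordering, toy_attach_fixed(&dev, false) returns 0 even on the synchronous fallback, because the helper acquires a lock that its caller has already released.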