In the Linux kernel, the following vulnerability has been resolved:
scsi: hisi_sas: Fix a deadlock issue related to automatic dump
If we issue a command to disable a PHY, the device attached to it goes offline. If a 2-bit ECC error occurs at the same time, a hung task may be reported:
[ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds.
[ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208
[ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas]
[ 4613.691518] Call trace:
[ 4613.694678] __switch_to+0xf8/0x17c
[ 4613.698872] __schedule+0x660/0xee0
[ 4613.703063] schedule+0xac/0x240
[ 4613.706994] schedule_timeout+0x500/0x610
[ 4613.711705] __down+0x128/0x36c
[ 4613.715548] down+0x240/0x2d0
[ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main]
[ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas]
[ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas]
[ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main]
[ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main]
[ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas]
[ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas]
[ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas]
[ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas]
[ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas]
[ 4613.784716] process_one_work+0x248/0x950
[ 4613.789424] worker_thread+0x318/0x934
[ 4613.793878] kthread+0x190/0x200
[ 4613.797810] ret_from_fork+0x10/0x18
[ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds.
[ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208
[ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main]
[ 4613.841491] Call trace:
[ 4613.844647] __switch_to+0xf8/0x17c
[ 4613.848852] __schedule+0x660/0xee0
[ 4613.853052] schedule+0xac/0x240
[ 4613.856984] schedule_timeout+0x500/0x610
[ 4613.861695] __down+0x128/0x36c
[ 4613.865542] down+0x240/0x2d0
[ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main]
[ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main]
[ 4613.883019] process_one_work+0x248/0x950
[ 4613.887732] worker_thread+0x318/0x934
[ 4613.892204] kthread+0x190/0x200
[ 4613.896118] ret_from_fork+0x10/0x18
[ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds.
[ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208
[ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas]
[ 4613.939549] Call trace:
[ 4613.942702] __switch_to+0xf8/0x17c
[ 4613.946892] __schedule+0x660/0xee0
[ 4613.951083] schedule+0xac/0x240
[ 4613.955015] schedule_timeout+0x500/0x610
[ 4613.959725] wait_for_common+0x200/0x610
[ 4613.964349] wait_for_completion+0x3c/0x5c
[ 4613.969146] flush_workqueue+0x198/0x790
[ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas]
[ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas]
[ 4613.985708] process_one_work+0x248/0x950
[ 4613.990420] worker_thread+0x318/0x934
[ 4613.994868] kthread+0x190/0x200
[ 4613.998800] ret_from_fork+0x10/0x18
This is because when the device goes offline, we obtain the hisi_hba semaphore and send the ABORT_DEV command to the device. However, the internal abort times out due to the 2-bit ECC error and triggers an automatic dump. Since the hisi_hba semaphore is already held, the dump cannot be executed and the controller cannot be reset.
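The semaphore interaction described above can be illustrated with a small user-space analogy. This is only a sketch, not the driver code: the names hba_sem, device_gone, internal_abort and automatic_dump are hypothetical stand-ins, and a timeout is used where the kernel's down() would simply block, so that the example terminates instead of hanging.

```python
import threading

# Hypothetical stand-in for the hisi_hba semaphore (a binary semaphore).
hba_sem = threading.Semaphore(1)


def automatic_dump(timeout=0.1):
    """Dump path: needs the semaphore. Returns True if the dump could run."""
    # Assumption for illustration: we give up after `timeout` seconds.
    # A plain blocking acquire here would wait forever while the
    # semaphore is held further up the same call chain.
    if not hba_sem.acquire(timeout=timeout):
        return False
    try:
        return True
    finally:
        hba_sem.release()


def internal_abort(ecc_error):
    """Abort path: if the abort times out (modelled by ecc_error),
    it triggers the automatic dump."""
    if ecc_error:
        return automatic_dump()
    return True  # abort completed normally, no dump needed


def device_gone(ecc_error):
    """Device-offline path: takes the semaphore, then issues the abort.
    Returns True if the whole path completed without stalling."""
    hba_sem.acquire()
    try:
        return internal_abort(ecc_error)
    finally:
        hba_sem.release()


if __name__ == "__main__":
    print("no ECC error:", device_gone(ecc_error=False))
    print("ECC error   :", device_gone(ecc_error=True))
```

With no ECC error the offline path completes. With the error, the dump is requested while the semaphore is still held by the caller, so the acquisition in the dump path cannot succeed until the holder releases it, which in the blocking case never happens.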
Therefore, the deadlocks occur on the following circular dependencies ---truncated---