In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock between quota disable and qgroup rescan worker
The quota disable ioctl starts a transaction before waiting for the qgroup rescan worker to complete. However, this wait can be infinite and result in a deadlock because of a circular dependency among the quota disable ioctl, the qgroup rescan worker, and any other task that holds a transaction, such as the block group relocation task.
The deadlock happens in the following steps:
1) Task A calls the ioctl to disable quota. It starts a transaction and waits for the qgroup rescan worker to complete.
2) Task B, such as the block group relocation task, starts a transaction and joins the transaction that task A started. Then task B commits the transaction. In this commit, task B waits for a commit by task A.
3) Task C, the qgroup rescan worker, starts its job and starts a transaction. In this transaction start, task C waits for completion of the transaction that task A started and task B committed.
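The circular dependency in the steps above can be sketched as a wait-for graph. The following is only an illustrative model, not kernel code: the node names A, B and C stand for the three tasks, and find_cycle is a hypothetical helper. The REORDERED graph is an assumption about what happens if task A waits for the rescan worker before it holds a transaction; it is not taken from the patch itself.

```python
from typing import Dict, List, Optional

# Wait-for graph from the steps above: an edge X -> Y means "X waits for Y".
#   A = quota disable ioctl, B = block group relocation, C = qgroup rescan worker
DEADLOCK: Dict[str, List[str]] = {
    "A": ["C"],  # A holds a transaction and waits for the rescan worker
    "B": ["A"],  # B's commit waits for A to finish with the transaction
    "C": ["B"],  # C's transaction start waits for the commit B began
}

# Assumed alternative ordering: A waits for C *before* starting a transaction,
# so neither B nor C has to wait on anything A holds.
REORDERED: Dict[str, List[str]] = {
    "A": ["C"],
    "B": [],
    "C": [],
}


def find_cycle(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first node repeated at the end),
    or None if the wait-for graph has no cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: that closes a cycle
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found is not None:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in waits_for:
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found is not None:
                return found
    return None


if __name__ == "__main__":
    print("deadlock graph cycle:", find_cycle(DEADLOCK))
    print("reordered graph cycle:", find_cycle(REORDERED))
```

In this model the deadlock graph contains the cycle A, C, B, A, while the reordered graph has none, which is the property any fix has to establish.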
This deadlock was found with fstests test case btrfs/115 and a zoned null_blk device. The test case enables and disables quota, and block group reclaim happened to be triggered during the quota disable. The deadlock was also observed when running quota enable and disable in parallel with the 'btrfs balance' command on regular null_blk devices.
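The second reproduction path, quota enable and disable running in parallel with a balance, might be driven by a small script along these lines. This is a rough sketch under assumptions: the helper names, the round count and the use of 'btrfs balance start --full-balance' are illustrative choices and are not taken from the original report, and the mount point must be supplied by the caller.

```python
import subprocess
import threading
from typing import List


def quota_cmd(action: str, mount_point: str) -> List[str]:
    """Build a 'btrfs quota enable|disable <mount_point>' command line."""
    if action not in ("enable", "disable"):
        raise ValueError("action must be 'enable' or 'disable'")
    return ["btrfs", "quota", action, mount_point]


def balance_cmd(mount_point: str) -> List[str]:
    """Build a full balance command (the exact options are an assumption)."""
    return ["btrfs", "balance", "start", "--full-balance", mount_point]


def toggle_quota(mount_point: str, rounds: int) -> None:
    """Enable and disable quota repeatedly; failures are ignored on purpose."""
    for _ in range(rounds):
        subprocess.run(quota_cmd("enable", mount_point), check=False)
        subprocess.run(quota_cmd("disable", mount_point), check=False)


def stress(mount_point: str, rounds: int = 20) -> None:
    """Run the quota toggling in a thread while a balance runs in parallel.

    Needs root and a scratch btrfs filesystem: on an affected kernel the
    commands may hang in uninterruptible sleep.
    """
    worker = threading.Thread(target=toggle_quota, args=(mount_point, rounds))
    worker.start()
    subprocess.run(balance_cmd(mount_point), check=False)
    worker.join()
```

Calling stress() with the path of a scratch mount starts the race; whether the hang actually triggers depends on timing, as the report itself notes.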
An example report of the deadlock:
[372.469894] INFO: task kworker/u16:6:103 blocked for more than 122 seconds.
[372.479944] Not tainted 5.16.0-rc8 #7
[372.485067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[372.493898] task:kworker/u16:6 state:D stack: 0 pid: 103 ppid: 2 flags:0x00004000
[372.503285] Workqueue: btrfs-qgroup-rescan btrfs_work_helper [btrfs]
[372.510782] Call Trace:
[372.514092] <TASK>
[372.521684] __schedule+0xb56/0x4850
[372.530104] ? io_schedule_timeout+0x190/0x190
[372.538842] ? lockdep_hardirqs_on+0x7e/0x100
[372.547092] ? _raw_spin_unlock_irqrestore+0x3e/0x60
[372.555591] schedule+0xe0/0x270
[372.561894] btrfs_commit_transaction+0x18bb/0x2610 [btrfs]
[372.570506] ? btrfs_apply_pending_changes+0x50/0x50 [btrfs]
[372.578875] ? free_unref_page+0x3f2/0x650
[372.585484] ? finish_wait+0x270/0x270
[372.591594] ? release_extent_buffer+0x224/0x420 [btrfs]
[372.599264] btrfs_qgroup_rescan_worker+0xc13/0x10c0 [btrfs]
[372.607157] ? lock_release+0x3a9/0x6d0
[372.613054] ? btrfs_qgroup_account_extent+0xda0/0xda0 [btrfs]
[372.620960] ? do_raw_spin_lock+0x11e/0x250
[372.627137] ? rwlock_bug.part.0+0x90/0x90
[372.633215] ? lock_is_held_type+0xe4/0x140
[372.639404] btrfs_work_helper+0x1ae/0xa90 [btrfs]
[372.646268] process_one_work+0x7e9/0x1320
[372.652321] ? lock_release+0x6d0/0x6d0
[372.658081] ? pwq_dec_nr_in_flight+0x230/0x230
[372.664513] ? rwlock_bug.part.0+0x90/0x90
[372.670529] worker_thread+0x59e/0xf90
[372.676172] ? process_one_work+0x1320/0x1320
[372.682440] kthread+0x3b9/0x490
[372.687550] ? _raw_spin_unlock_irq+0x24/0x50
[372.693811] ? set_kthread_struct+0x100/0x100
[372.700052] ret_from_fork+0x22/0x30
[372.705517] </TASK>
[372.709747] INFO: task btrfs-transacti:2347 blocked for more than 123 seconds.
[372.729827] Not tainted 5.16.0-rc8 #7
[372.745907] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[372.767106] task:btrfs-transacti state:D stack: 0 pid: 2347 ppid: 2 flags:0x00004000
[372.787776] Call Trace:
[372.801652] <TASK>
[372.812961] __schedule+0xb56/0x4850
[372.830011] ? io_schedule_timeout+0x190/0x190
[372.852547] ? lockdep_hardirqs_on+0x7e/0x100
[372.871761] ? _raw_spin_unlock_irqrestore+0x3e/0x60
[372.886792] schedule+0xe0/0x270
[372.901685] wait_current_trans+0x22c/0x310 [btrfs]
[372.919743] ? btrfs_put_transaction+0x3d0/0x3d0 [btrfs]
[372.938923] ? finish_wait+0x270/0x270
[372.959085] ? join_transaction+0xc7 ---truncated---