In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix race between balance and cancel/pause
Syzbot reported a panic that looks like this:
  assertion failed: fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED, in fs/btrfs/ioctl.c:465
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/messages.c:259!
  RIP: 0010:btrfs_assertfail+0x2c/0x30 fs/btrfs/messages.c:259
  Call Trace:
   <TASK>
   btrfs_exclop_balance fs/btrfs/ioctl.c:465 [inline]
   btrfs_ioctl_balance fs/btrfs/ioctl.c:3564 [inline]
   btrfs_ioctl+0x531e/0x5b30 fs/btrfs/ioctl.c:4632
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:870 [inline]
   __se_sys_ioctl fs/ioctl.c:856 [inline]
   __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
   do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
The reproducer runs a balance and a cancel or pause in parallel. The way balance finishes is a bit wonky: if we were paused we need to save the balance_ctl in the fs_info, but otherwise clear it and clean up. However, we rely on the return values being specific errors, or on having a cancel request or no pause request. If balance completes and returns 0 but we have a pause or cancel request, we won't do the appropriate cleanup, and the next time we try to start a balance we'll trip this ASSERT.
The error handling is just wrong here: we always want to clean up, unless we got -ECANCELED and we set the appropriate pause flag in the exclusive op. With this patch the reproducer ran for an hour without tripping; previously it would trip within a few minutes.
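The cleanup decision described above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: the struct and both function names are hypothetical, and "specific errors" is simplified here to mean any error other than -ECANCELED, which is an assumption. It shows why a balance that returns 0 while a pause request is pending skips cleanup under the old rule, and how the corrected rule handles the same case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state seen when the balance run returns. */
struct balance_result {
    int ret;          /* return value of the balance run */
    bool pause_req;   /* a pause request is pending */
    bool cancel_req;  /* a cancel request is pending */
};

/*
 * Simplified model of the old rule: clean up on a "specific error",
 * on a cancel request, or when there is no pause request.
 * Assumption: "specific error" means any error except -ECANCELED.
 */
static bool old_should_cleanup(struct balance_result r)
{
    bool specific_error = (r.ret != 0 && r.ret != -ECANCELED);

    return specific_error || r.cancel_req || !r.pause_req;
}

/*
 * Model of the corrected rule: always clean up, unless the run ended
 * with -ECANCELED and a pause request is what stopped it.
 */
static bool new_should_cleanup(struct balance_result r)
{
    bool paused = (r.ret == -ECANCELED && r.pause_req);

    return !paused;
}
```

In this model, the racy case is `{ .ret = 0, .pause_req = true, .cancel_req = false }`: the old rule returns false, so the stale state survives until the next balance start, while the new rule returns true. A genuine pause, `{ .ret = -ECANCELED, .pause_req = true }`, skips cleanup under both rules.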
{
"cna_assigner": "Linux",
"osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2023/54xxx/CVE-2023-54023.json"
}