In the Linux kernel, the following vulnerability has been resolved:
dm: don't attempt to queue IO under RCU protection
dm looks up the table for IO based on the request type, with an assumption that if the request is marked REQ_NOWAIT, it's fine to attempt to submit that IO while under RCU read lock protection. This is not OK: REQ_NOWAIT only means that we should not sleep waiting on other IO; it does not mean that we can't potentially schedule.
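The distinction can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name is hypothetical and none of them is a kernel API. A plain counter stands in for the RCU read-side nesting depth, and a stub stands in for an allocation that may block. The second submit path simply follows the rule in the title (the nowait flag no longer selects the RCU read-side section) and is not a reproduction of the actual patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

int model_rcu_depth;   /* mirrors the "RCU nest depth" field of the splat */
int model_violations;  /* blocking calls made while the read lock was held */

void model_rcu_read_lock(void)   { model_rcu_depth++; }
void model_rcu_read_unlock(void) { model_rcu_depth--; }

/* Stand-in for a debug check that fires when a sleepable call runs under RCU. */
void model_might_sleep(const char *where)
{
	if (model_rcu_depth > 0) {
		model_violations++;
		fprintf(stderr, "model: %s may sleep, RCU nest depth: %d, expected: 0\n",
			where, model_rcu_depth);
	}
}

/* Stand-in for a clone allocation that is allowed to block. */
void model_alloc_clone(void)
{
	model_might_sleep("model_alloc_clone");
}

/* Pattern described above: a nowait request selects the RCU-protected path. */
void model_submit_buggy(bool nowait)
{
	if (nowait)
		model_rcu_read_lock();
	model_alloc_clone();	/* may block regardless of the nowait flag */
	if (nowait)
		model_rcu_read_unlock();
}

/* Assumed alternative: the nowait flag never places the allocation under RCU. */
void model_submit_sleepable(bool nowait)
{
	(void)nowait;		/* the flag does not change the locking */
	model_alloc_clone();
}
```

In this model, `model_submit_buggy(true)` records one violation and prints a diagnostic, while `model_submit_buggy(false)` and `model_submit_sleepable(true)` record none. The point is that the nowait flag describes how the request should behave when it would have to wait on other IO, not whether the submission path itself is free of blocking calls.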
A simple test case demonstrates this quite nicely:
int main(int argc, char *argv[])
{
	struct iovec iov;
	int fd;

	fd = open("/dev/dm-0", O_RDONLY | O_DIRECT);
	posix_memalign(&iov.iov_base, 4096, 4096);
	iov.iov_len = 4096;
	preadv2(fd, &iov, 1, 0, RWF_NOWAIT);
	return 0;
}
which will instantly spew:
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5580, name: dm-nowait
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
INFO: lockdep is turned off.
CPU: 7 PID: 5580 Comm: dm-nowait Not tainted 6.6.0-rc1-g39956d2dcd81 #132
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x11d/0x1b0
 __might_resched+0x3c3/0x5e0
 ? preempt_count_sub+0x150/0x150
 mempool_alloc+0x1e2/0x390
 ? mempool_resize+0x7d0/0x7d0
 ? lock_sync+0x190/0x190
 ? lock_release+0x4b7/0x670
 ? internal_get_user_pages_fast+0x868/0x2d40
 bio_alloc_bioset+0x417/0x8c0
 ? bvec_alloc+0x200/0x200
 ? internal_get_user_pages_fast+0xb8c/0x2d40
 bio_alloc_clone+0x53/0x100
 dm_submit_bio+0x27f/0x1a20
 ? lock_release+0x4b7/0x670
 ? blk_try_enter_queue+0x1a0/0x4d0
 ? dm_dax_direct_access+0x260/0x260
 ? rcu_is_watching+0x12/0xb0
 ? blk_try_enter_queue+0x1cc/0x4d0
 __submit_bio+0x239/0x310
 ? __bio_queue_enter+0x700/0x700
 ? kvm_clock_get_cycles+0x40/0x60
 ? ktime_get+0x285/0x470
 submit_bio_noacct_nocheck+0x4d9/0xb80
 ? should_fail_request+0x80/0x80
 ? preempt_count_sub+0x150/0x150
 ? lock_release+0x4b7/0x670
 ? __bio_add_page+0x143/0x2d0
 ? iov_iter_revert+0x27/0x360
 submit_bio_noacct+0x53e/0x1b30
 submit_bio_wait+0x10a/0x230
 ? submit_bio_wait_endio+0x40/0x40
 __blkdev_direct_IO_simple+0x4f8/0x780
 ? blkdev_bio_end_io+0x4c0/0x4c0
 ? stack_trace_save+0x90/0xc0
 ? __bio_clone+0x3c0/0x3c0
 ? lock_release+0x4b7/0x670
 ? lock_sync+0x190/0x190
 ? atime_needs_update+0x3bf/0x7e0
 ? timestamp_truncate+0x21b/0x2d0
 ? inode_owner_or_capable+0x240/0x240
 blkdev_direct_IO.part.0+0x84a/0x1810
 ? rcu_is_watching+0x12/0xb0
 ? lock_release+0x4b7/0x670
 ? blkdev_read_iter+0x40d/0x530
 ? reacquire_held_locks+0x4e0/0x4e0
 ? __blkdev_direct_IO_simple+0x780/0x780
 ? rcu_is_watching+0x12/0xb0
 ? __mark_inode_dirty+0x297/0xd50
 ? preempt_count_add+0x72/0x140
 blkdev_read_iter+0x2a4/0x530
 do_iter_readv_writev+0x2f2/0x3c0
 ? generic_copy_file_range+0x1d0/0x1d0
 ? fsnotify_perm.part.0+0x25d/0x630
 ? security_file_permission+0xd8/0x100
 do_iter_read+0x31b/0x880
 ? import_iovec+0x10b/0x140
 vfs_readv+0x12d/0x1a0
 ? vfs_iter_read+0xb0/0xb0
 ? rcu_is_watching+0x12/0xb0
 ? rcu_is_watching+0x12/0xb0
 ? lock_release+0x4b7/0x670
 do_preadv+0x1b3/0x260
 ? do_readv+0x370/0x370
 __x64_sys_preadv2+0xef/0x150
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f5af41ad806
Code: 41 54 41 89 fc 55 44 89 c5 53 48 89 cb 48 83 ec 18 80 3d e4 dd 0d 00 00 74 7a 45 89 c1 49 89 ca 45 31 c0 b8 47 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 be 00 00 00 48 85 c0 79 4a 48 8b 0d da 55
RSP: 002b:00007ffd3145c7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000147
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5af41ad806
RDX: 0000000000000001 RSI: 00007ffd3145c850 RDI: 0000000000000003
RBP: 0000000000000008 R08: 0000000000000000 R09: 0000000000000008
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007ffd3145c850 R14: 000055f5f0431dd8 R15: 0000000000000001
 </TASK>
where in fact it is ---truncated---
{
"osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2023/53xxx/CVE-2023-53860.json",
"cna_assigner": "Linux"
}