In the Linux kernel, the following vulnerability has been resolved:
bpf: bpf_sk_storage: Fix invalid wait context lockdep report
'./test_progs -t test_local_storage' reported a splat:
[ 27.137569] =============================
[ 27.138122] [ BUG: Invalid wait context ]
[ 27.138650] 6.5.0-03980-gd11ae1b16b0a #247 Tainted: G O
[ 27.139542] -----------------------------
[ 27.140106] test_progs/1729 is trying to lock:
[ 27.140713] ffff8883ef047b88 (stock_lock){-.-.}-{3:3}, at: local_lock_acquire+0x9/0x130
[ 27.141834] other info that might help us debug this:
[ 27.142437] context-{5:5}
[ 27.142856] 2 locks held by test_progs/1729:
[ 27.143352] #0: ffffffff84bcd9c0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x40
[ 27.144492] #1: ffff888107deb2c0 (&storage->lock){..-.}-{2:2}, at: bpf_local_storage_update+0x39e/0x8e0
[ 27.145855] stack backtrace:
[ 27.146274] CPU: 0 PID: 1729 Comm: test_progs Tainted: G O 6.5.0-03980-gd11ae1b16b0a #247
[ 27.147550] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 27.149127] Call Trace:
[ 27.149490] <TASK>
[ 27.149867] dump_stack_lvl+0x130/0x1d0
[ 27.152609] dump_stack+0x14/0x20
[ 27.153131] __lock_acquire+0x1657/0x2220
[ 27.153677] lock_acquire+0x1b8/0x510
[ 27.157908] local_lock_acquire+0x29/0x130
[ 27.159048] obj_cgroup_charge+0xf4/0x3c0
[ 27.160794] slab_pre_alloc_hook+0x28e/0x2b0
[ 27.161931] __kmem_cache_alloc_node+0x51/0x210
[ 27.163557] __kmalloc+0xaa/0x210
[ 27.164593] bpf_map_kzalloc+0xbc/0x170
[ 27.165147] bpf_selem_alloc+0x130/0x510
[ 27.166295] bpf_local_storage_update+0x5aa/0x8e0
[ 27.167042] bpf_fd_sk_storage_update_elem+0xdb/0x1a0
[ 27.169199] bpf_map_update_value+0x415/0x4f0
[ 27.169871] map_update_elem+0x413/0x550
[ 27.170330] __sys_bpf+0x5e9/0x640
[ 27.174065] __x64_sys_bpf+0x80/0x90
[ 27.174568] do_syscall_64+0x48/0xa0
[ 27.175201] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 27.175932] RIP: 0033:0x7effb40e41ad
[ 27.176357] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d8
[ 27.179028] RSP: 002b:00007ffe64c21fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141
[ 27.180088] RAX: ffffffffffffffda RBX: 00007ffe64c22768 RCX: 00007effb40e41ad
[ 27.181082] RDX: 0000000000000020 RSI: 00007ffe64c22008 RDI: 0000000000000002
[ 27.182030] RBP: 00007ffe64c21ff0 R08: 0000000000000000 R09: 00007ffe64c22788
[ 27.183038] R10: 0000000000000064 R11: 0000000000000202 R12: 0000000000000000
[ 27.184006] R13: 00007ffe64c22788 R14: 00007effb42a1000 R15: 0000000000000000
[ 27.184958] </TASK>
Lockdep complains about acquiring a local_lock while holding a raw_spin_lock. In other words, memory should not be allocated while holding a raw_spin_lock, since that is not safe for RT.
raw_spin_lock is needed because bpf_local_storage supports tracing context. In particular, for task local storage it is easy to get a "current" task PTR_TO_BTF_ID in a tracing bpf prog. However, task (and cgroup) local storage has already been moved to the bpf mem allocator, which can be used after raw_spin_lock.
The splat is for the sk storage. Sk (and inode) storage has not been moved to the bpf mem allocator. With or without raw_spin_lock, kzalloc(GFP_ATOMIC) could theoretically be unsafe in tracing context. However, the local storage helper requires a verifier-accepted sk pointer (PTR_TO_BTF_ID), so it is hypothetical whether that (running a bpf prog in a kzalloc-unsafe context while also holding a verifier-accepted sk pointer) could happen.
This patch avoids kzalloc after raw_spin_lock to silence the splat. There is an existing kzalloc before the raw_spin_lock. At that point, a kzalloc is very likely required because a lookup has just been done before. Thus, this patch always does the kzalloc before acq ---truncated---
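The "allocate before taking the lock" ordering described above can be illustrated with a small userspace sketch. This is not the kernel patch: every name here (struct storage, storage_update, and so on) is hypothetical, a pthread mutex stands in for the raw_spin_lock, and calloc stands in for kzalloc. Freeing the unused spare after unlocking is an assumption of this sketch, not something the description states.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct elem {
    int key;
    long value;
    struct elem *next;
};

struct storage {
    pthread_mutex_t lock;   /* stands in for the kernel raw_spin_lock */
    struct elem *head;
};

static int storage_init(struct storage *s)
{
    s->head = NULL;
    return pthread_mutex_init(&s->lock, NULL);
}

/* Insert or update a key. Returns 0 on success, -1 if allocation fails. */
static int storage_update(struct storage *s, int key, long value)
{
    struct elem *e, *spare;

    assert(s != NULL);

    /* Allocate first, so the allocator is never entered with the lock held. */
    spare = calloc(1, sizeof(*spare));
    if (!spare)
        return -1;

    pthread_mutex_lock(&s->lock);
    for (e = s->head; e; e = e->next)
        if (e->key == key)
            break;
    if (e) {
        e->value = value;       /* key exists: update in place, spare unused */
    } else {
        spare->key = key;       /* key missing: link the preallocated element */
        spare->value = value;
        spare->next = s->head;
        s->head = spare;
        spare = NULL;           /* ownership moved into the list */
    }
    pthread_mutex_unlock(&s->lock);

    free(spare);                /* no-op when the spare was consumed */
    return 0;
}

/* Copy the value for key into *value. Returns 0 if found, -1 otherwise. */
static int storage_lookup(struct storage *s, int key, long *value)
{
    struct elem *e;
    int ret = -1;

    pthread_mutex_lock(&s->lock);
    for (e = s->head; e; e = e->next) {
        if (e->key == key) {
            *value = e->value;
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&s->lock);
    return ret;
}

/* Count the stored elements. */
static size_t storage_count(struct storage *s)
{
    struct elem *e;
    size_t n = 0;

    pthread_mutex_lock(&s->lock);
    for (e = s->head; e; e = e->next)
        n++;
    pthread_mutex_unlock(&s->lock);
    return n;
}
```

The trade-off in this sketch is an occasionally wasted allocation when the key already exists, in exchange for never calling the allocator inside the locked region.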
{
"cna_assigner": "Linux",
"osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2023/53xxx/CVE-2023-53857.json"
}