In the Linux kernel, the following vulnerability has been resolved:
af_unix: Call kfree_skb() for dead unix_sk(sk)->oob_skb in GC.
syzbot reported a warning [0] in __unix_gc() with a repro, which creates a socketpair and sends one socket's fd to itself using the peer.
  socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
  sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\360", iov_len=1}],
          msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET,
                                      cmsg_type=SCM_RIGHTS, cmsg_data=[3]}],
          msg_controllen=24, msg_flags=0}, MSG_OOB|MSG_PROBE|MSG_DONTWAIT|MSG_ZEROCOPY) = 1
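The trace above can be sketched in user-space C roughly as follows. This is an illustration rather than the syzbot reproducer itself: the helper names build_fd_msg() and send_own_fd_as_oob() are hypothetical, and the sketch keeps only MSG_OOB and MSG_DONTWAIT on the assumption that the other flags in the trace are incidental to the leak. On a typical 64-bit Linux build, CMSG_LEN(sizeof(int)) and CMSG_SPACE(sizeof(int)) should come out as the 20 and 24 shown in the trace.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg_buf {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: fill *msg with one data byte plus an SCM_RIGHTS
 * control message that carries fd. */
static void build_fd_msg(struct msghdr *msg, struct iovec *iov, char *byte,
			 union fd_cmsg_buf *ctl, int fd)
{
	struct cmsghdr *cmsg;

	memset(msg, 0, sizeof(*msg));
	memset(ctl, 0, sizeof(*ctl));

	iov->iov_base = byte;
	iov->iov_len = 1;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_control = ctl->buf;
	msg->msg_controllen = sizeof(ctl->buf);

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
}

/* Hypothetical helper: send sv[0]'s own fd to sv[0] through its peer sv[1]
 * as an out-of-band byte, then close both ends so that only the queued skb
 * keeps the socket alive. Returns the sendmsg() result, or -1 on error. */
static ssize_t send_own_fd_as_oob(void)
{
	struct msghdr msg;
	struct iovec iov;
	union fd_cmsg_buf ctl;
	char byte = '\360';
	ssize_t ret;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	build_fd_msg(&msg, &iov, &byte, &ctl, sv[0]);
	/* Assumption: only MSG_OOB matters; MSG_DONTWAIT avoids blocking. */
	ret = sendmsg(sv[1], &msg, MSG_OOB | MSG_DONTWAIT);

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Once both descriptors are closed, the only remaining reference to the first socket is the fd carried by the OOB skb queued on that same socket, which is the self-cycle the GC is expected to break.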
This forms a self-cyclic reference that GC should eventually untangle, but it does not because MSG_OOB handling is missing, resulting in a memory leak.
Recently, commit 11498715f266 ("af_unix: Remove io_uring code for GC.") removed io_uring's dead code in GC and revealed the problem.
The code was executed at the final stage of GC and unconditionally moved all GC candidates from gc_candidates to gc_inflight_list. That papered over the reported problem by always making the following WARN_ON_ONCE(!list_empty(&gc_candidates)) false.
The problem has been there since commit 2aab4b969002 ("af_unix: fix struct pid leaks in OOB support") added full scm support for MSG_OOB while fixing another bug.
To fix this problem, we must call kfree_skb() for unix_sk(sk)->oob_skb if the socket still exists in gc_candidates after purging collected skb.
Then, we need to set oob_skb to NULL before calling kfree_skb(), because kfree_skb() performs the last fput() and triggers unix_release_sock(), which would otherwise call kfree_skb(u->oob_skb) a second time if the pointer is not NULL.
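The ordering requirement can be illustrated with a small user-space model. This is a sketch, not the kernel patch: every model_* name is hypothetical, the structures are drastically simplified, and frees are only counted rather than performed so that the double free is observable without undefined behaviour.

```c
#include <stddef.h>

struct model_sock;

/* Simplified stand-in for an skb whose attached fd pins its own socket. */
struct model_skb {
	struct model_sock *owner;
	int free_count;		/* how many times a free was requested */
};

/* Simplified stand-in for struct unix_sock. */
struct model_sock {
	struct model_skb *oob_skb;
};

static void model_kfree_skb(struct model_skb *skb);

/* Models unix_release_sock(): frees oob_skb if it is still set. */
static void model_release_sock(struct model_sock *sk)
{
	if (sk->oob_skb) {
		model_kfree_skb(sk->oob_skb);
		sk->oob_skb = NULL;
	}
}

/* Models kfree_skb(): the first free drops the last file reference,
 * which releases the owning socket. */
static void model_kfree_skb(struct model_skb *skb)
{
	skb->free_count++;
	if (skb->free_count == 1)
		model_release_sock(skb->owner);
}

/* Models the GC step for one leftover candidate. Returns how many times
 * the skb ended up being freed. */
static int model_gc_drop_oob(struct model_sock *sk, int clear_first)
{
	struct model_skb *skb = sk->oob_skb;

	if (!skb)
		return 0;

	if (clear_first)
		sk->oob_skb = NULL;	/* the ordering described above */

	model_kfree_skb(skb);
	sk->oob_skb = NULL;
	return skb->free_count;
}
```

In this model, clearing the pointer first yields a single free, while freeing first lets the release path see the stale pointer and request a second free of the same skb.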
Note that the leaked socket remained linked to a global list, so kmemleak could not detect it either. We need to check /proc/net/protocol to notice the unfreed socket.
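One way to watch for the unfreed socket is to read the "sockets" column of /proc/net/protocol before and after running the repro. The sketch below assumes the usual row layout of that file (protocol name, object size, socket count, then further columns); the helper names are hypothetical, and the row to look at (for example "UNIX" or "UNIX-STREAM") is assumed to depend on the kernel version.

```c
#include <stdio.h>
#include <string.h>

/* Parse one row of /proc/net/protocol. Returns 1 and stores the "sockets"
 * column in *sockets when the row belongs to proto, otherwise returns 0.
 * The header row is skipped because its columns are not numeric. */
static int parse_sockets_column(const char *line, const char *proto,
				long *sockets)
{
	char name[32];
	long size, count;

	if (sscanf(line, "%31s %ld %ld", name, &size, &count) != 3)
		return 0;
	if (strcmp(name, proto) != 0)
		return 0;

	*sockets = count;
	return 1;
}

/* Returns the current socket count for proto, or -1 if the file cannot be
 * read or the protocol is not listed. */
static long count_proto_sockets(const char *proto)
{
	FILE *fp = fopen("/proc/net/protocol", "r");
	char line[512];
	long sockets = -1;

	if (!fp)
		return -1;

	while (fgets(line, sizeof(line), fp)) {
		if (parse_sockets_column(line, proto, &sockets))
			break;
	}

	fclose(fp);
	return sockets;
}
```

A count that stays higher after the process has exited and GC has run would hint at a leaked socket, although other activity on the system can change the number as well.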
[0]:
Modules linked in:
CPU: 0 PID: 2863 Comm: kworker/u4:11 Not tainted 6.8.0-rc1-syzkaller-00583-g1701940b1a02 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Workqueue: events_unbound __unix_gc
RIP: 0010:__unix_gc+0xc74/0xe80 net/unix/garbage.c:345
Code: 8b 5c 24 50 e9 86 f8 ff ff e8 f8 e4 22 f8 31 d2 48 c7 c6 30 6a 69 89 4c 89 ef e8 97 ef ff ff e9 80 f9 ff ff e8 dd e4 22 f8 90 <0f> 0b 90 e9 7b fd ff ff 48 89 df e8 5c e7 7c f8 e9 d3 f8 ff ff e8
RSP: 0018:ffffc9000b03fba0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffc9000b03fc10 RCX: ffffffff816c493e
RDX: ffff88802c02d940 RSI: ffffffff896982f3 RDI: ffffc9000b03fb30
RBP: ffffc9000b03fce0 R08: 0000000000000001 R09: fffff52001607f66
R10: 0000000000000003 R11: 0000000000000002 R12: dffffc0000000000
R13: ffffc9000b03fc10 R14: ffffc9000b03fc10 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005559c8677a60 CR3: 000000000d57a000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 process_one_work+0x889/0x15e0 kernel/workqueue.c:2633
 process_scheduled_works kernel/workqueue.c:2706 [inline]
 worker_thread+0x8b9/0x12a0 kernel/workqueue.c:2787
 kthread+0x2c6/0x3b0 kernel/kthread.c:388
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242
 </TASK>