In the Linux kernel, the following vulnerability has been resolved:
tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().
Martin KaFai Lau reported use-after-free [0] in reqsk_timer_handler().
""" We are seeing a use-after-free from a bpf prog attached to trace_tcp_retransmit_synack. The program passes the req->sk to the bpf_sk_storage_get_tracing kernel helper which does check for null before using it. """
Commit 83fccfc3940c ("inet: fix potential deadlock in reqsk_queue_unlink()") added a timer_pending() check to reqsk_queue_unlink() so that del_timer_sync() is not called from reqsk_timer_handler(), but it introduced a small race window.
Before the timer handler is called, expire_timers() calls detach_timer(timer, true), which clears timer->entry.pprev and marks the timer as not pending.
If reqsk_queue_unlink() checks timer_pending() just after expire_timers() calls detach_timer(), TCP will miss del_timer_sync(); the reqsk timer will continue running and send multiple SYN+ACKs until it expires.
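The window can be illustrated with a small single-threaded user-space model that replays the bad interleaving by hand. This is only a sketch: `struct model_timer` and the `model_*` functions are hypothetical stand-ins invented for illustration, not kernel APIs, and the real race involves two execution contexts rather than sequential calls.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a kernel timer; this is not struct timer_list. */
struct model_timer {
	bool pending;	/* what a timer_pending()-style check would report */
};

/* Models expire_timers() -> detach_timer(): the timer stops looking
 * "pending" before its handler has run. */
void model_detach(struct model_timer *t)
{
	t->pending = false;
}

/* Models the racy unlink: cancel only if the timer looks pending.
 * Returns true if the timer was cancelled. */
bool model_unlink_racy(struct model_timer *t)
{
	if (!t->pending)
		return false;	/* cancellation is skipped */
	t->pending = false;
	return true;
}

/* Models the handler re-arming itself to retransmit SYN+ACK. */
void model_handler_rearm(struct model_timer *t)
{
	t->pending = true;
}

/* Replays the bad interleaving. Returns true if the timer is still
 * armed after the unlink, i.e. the cancellation was missed. */
bool model_race_leaves_timer_armed(void)
{
	struct model_timer t = { .pending = true };
	bool cancelled;

	model_detach(&t);			/* context A: timer is about to fire */
	cancelled = model_unlink_racy(&t);	/* context B: sees "not pending" */
	model_handler_rearm(&t);		/* context A: handler re-arms */

	return !cancelled && t.pending;
}
```

In this model the unlink that runs between the detach and the handler never cancels anything, so the handler is free to re-arm the timer afterwards.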
The reported UAF could happen if req->sk is close()d earlier than the timer expiration, which is 63s by default.
The scenario would be:
1. inet_csk_complete_hashdance() calls inet_csk_reqsk_queue_drop(), but del_timer_sync() is missed
2. reqsk timer is executed and scheduled again
3. req->sk is accept()ed and reqsk_put() decrements rsk_refcnt, but reqsk timer still has another one, and inet_csk_accept() does not clear req->sk for non-TFO sockets
4. sk is close()d
5. reqsk timer is executed again, and BPF touches req->sk
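The lifetime problem in this scenario can be walked through with a toy user-space model. It is a sketch under stated assumptions: `struct scen_sk`, `struct scen_req`, and `scen_hits_uaf()` are hypothetical names, the starting reference count of 2 is illustrative rather than the kernel's actual value, and a `freed` flag stands in for real memory being released.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the child socket; "freed" replaces a real free. */
struct scen_sk {
	bool freed;
};

/* Hypothetical stand-in for the request sock. */
struct scen_req {
	int refcnt;		/* models rsk_refcnt (illustrative counts) */
	struct scen_sk *sk;	/* models req->sk, never cleared here */
	bool timer_armed;	/* the timer can still fire */
};

/* Walks the five steps. Returns true if the last timer run would read
 * a socket that has already been freed. */
bool scen_hits_uaf(bool cancel_missed)
{
	struct scen_sk child = { .freed = false };
	/* Assumed: one reference for the accept queue, one for the timer. */
	struct scen_req req = { .refcnt = 2, .sk = &child, .timer_armed = true };

	/* Step 1: the drop cancels the timer unless the race is hit. */
	if (!cancel_missed) {
		req.timer_armed = false;
		req.refcnt--;	/* the timer's reference is released */
	}

	/* Step 2: a surviving timer runs and re-arms; the model state is unchanged. */

	/* Step 3: accept() drops one reference but leaves req.sk set. */
	req.refcnt--;

	/* Step 4: close() frees the child socket. */
	child.freed = true;

	/* Step 5: a surviving timer still holds the request and follows req.sk. */
	return req.timer_armed && req.refcnt > 0 && req.sk->freed;
}
```

With `cancel_missed` set, the request outlives the socket it points to, which is the condition the KFENCE report below captures.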
Let's stop using timer_pending() and instead pass the caller context to __inet_csk_reqsk_queue_drop().
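One way to picture "passing the caller context" is a flag that tells the drop routine whether it is being called from the timer handler. The following user-space sketch assumes such a boolean, here called `from_timer`; that name, `struct model_req`, and every `model_*` function are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request sock; not struct request_sock. */
struct model_req {
	int refcnt;		/* models rsk_refcnt */
	bool hashed;		/* still linked in the hash table */
	bool timer_armed;	/* timer may fire again and holds one reference */
};

void model_reqsk_put(struct model_req *req)
{
	assert(req->refcnt > 0);
	req->refcnt--;
}

/* Models a synchronous cancel: returns true if an armed timer was stopped. */
bool model_timer_cancel_sync(struct model_req *req)
{
	bool was_armed = req->timer_armed;

	req->timer_armed = false;
	return was_armed;
}

/* Drops a request. from_timer is true only when the caller is the timer
 * handler itself, which must not wait for its own completion. */
bool model_queue_drop(struct model_req *req, bool from_timer)
{
	bool unlinked = req->hashed;

	req->hashed = false;

	/* No pending check: every non-timer caller always cancels. */
	if (!from_timer && model_timer_cancel_sync(req))
		model_reqsk_put(req);	/* release the timer's reference */

	if (unlinked)
		model_reqsk_put(req);	/* release the hash table's reference */

	return unlinked;
}
```

The design point is that the decision to cancel no longer depends on timer state that can change underneath the caller; it depends only on who the caller is, which the caller always knows.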
Note that reqsk timer is pinned, so the issue does not happen in most use cases. [1]
[0]
BUG: KFENCE: use-after-free read in bpf_sk_storage_get_tracing+0x2e/0x1b0

Use-after-free read at 0x00000000a891fb3a (in kfence-#1):
bpf_sk_storage_get_tracing+0x2e/0x1b0
bpf_prog_5ea3e95db6da0438_tcp_retransmit_synack+0x1d20/0x1dda
bpf_trace_run2+0x4c/0xc0
tcp_rtx_synack+0xf9/0x100
reqsk_timer_handler+0xda/0x3d0
run_timer_softirq+0x292/0x8a0
irq_exit_rcu+0xf5/0x320
sysvec_apic_timer_interrupt+0x6d/0x80
asm_sysvec_apic_timer_interrupt+0x16/0x20
intel_idle_irq+0x5a/0xa0
cpuidle_enter_state+0x94/0x273
cpu_startup_entry+0x15e/0x260
start_secondary+0x8a/0x90
secondary_startup_64_no_verify+0xfa/0xfb

kfence-#1: 0x00000000a72cc7b6-0x00000000d97616d9, size=2376, cache=TCPv6

allocated by task 0 on cpu 9 at 260507.901592s:
sk_prot_alloc+0x35/0x140
sk_clone_lock+0x1f/0x3f0
inet_csk_clone_lock+0x15/0x160
tcp_create_openreq_child+0x1f/0x410
tcp_v6_syn_recv_sock+0x1da/0x700
tcp_check_req+0x1fb/0x510
tcp_v6_rcv+0x98b/0x1420
ipv6_list_rcv+0x2258/0x26e0
napi_complete_done+0x5b1/0x2990
mlx5e_napi_poll+0x2ae/0x8d0
net_rx_action+0x13e/0x590
irq_exit_rcu+0xf5/0x320
common_interrupt+0x80/0x90
asm_common_interrupt+0x22/0x40
cpuidle_enter_state+0xfb/0x273
cpu_startup_entry+0x15e/0x260
start_secondary+0x8a/0x90
secondary_startup_64_no_verify+0xfa/0xfb

freed by task 0 on cpu 9 at 260507.927527s:
rcu_core_si+0x4ff/0xf10
irq_exit_rcu+0xf5/0x320
sysvec_apic_timer_interrupt+0x6d/0x80
asm_sysvec_apic_timer_interrupt+0x16/0x20
cpuidle_enter_state+0xfb/0x273
cpu_startup_entry+0x15e/0x260
start_secondary+0x8a/0x90
secondary_startup_64_no_verify+0xfa/0xfb