CVE-2023-53836

There is a race where skbs from sk_psock_backlog can be referenced after the userspace side has already called consume_skb() on the sk_buff and its refcnt has dropped to zero, causing a use after free.

The flow is the following:

  while ((skb = skb_peek(&psock->ingress_skb)))
    sk_psock_handle_skb(psock, skb, ..., ingress)
    if (!ingress) ...
    sk_psock_skb_ingress
      sk_psock_skb_ingress_enqueue(skb)
        msg->skb = skb
        sk_psock_queue_msg(psock, msg)
    skb_dequeue(&psock->ingress_skb)

sk_psock_queue_msg() puts the msg on the ingress_msg queue. This is what the application reads when recvmsg() is called. An application can read it at any time after the msg is placed on the queue. The recvmsg hook also reads msg->skb and, once user space has read the msg, calls consume_skb(skb) on it, effectively freeing it.

The race is in the flow above: the backlog queue still holds a reference to the skb and calls skb_dequeue(). If the skb_dequeue() happens after the user has read and freed the skb, we have a use after free.

The !ingress case does not suffer from this problem because it uses sendmsg_*(sk, msg) which does not pass the sk_buff further down the stack.

The following splat was observed with 'test_progs -t sockmap_listen':

  [ 1022.710250][ T2556] general protection fault, ...
  [...]
  [ 1022.712830][ T2556] Workqueue: events sk_psock_backlog
  [ 1022.713262][ T2556] RIP: 0010:skb_dequeue+0x4c/0x80
  [ 1022.713653][ T2556] Code: ...
  [...]
  [ 1022.720699][ T2556] Call Trace:
  [ 1022.720984][ T2556]  <TASK>
  [ 1022.721254][ T2556]  ? die_addr+0x32/0x80
  [ 1022.721589][ T2556]  ? exc_general_protection+0x25a/0x4b0
  [ 1022.722026][ T2556]  ? asm_exc_general_protection+0x22/0x30
  [ 1022.722489][ T2556]  ? skb_dequeue+0x4c/0x80
  [ 1022.722854][ T2556]  sk_psock_backlog+0x27a/0x300
  [ 1022.723243][ T2556]  process_one_work+0x2a7/0x5b0
  [ 1022.723633][ T2556]  worker_thread+0x4f/0x3a0
  [ 1022.723998][ T2556]  ? __pfx_worker_thread+0x10/0x10
  [ 1022.724386][ T2556]  kthread+0xfd/0x130
  [ 1022.724709][ T2556]  ? __pfx_kthread+0x10/0x10
  [ 1022.725066][ T2556]  ret_from_fork+0x2d/0x50
  [ 1022.725409][ T2556]  ? __pfx_kthread+0x10/0x10
  [ 1022.725799][ T2556]  ret_from_fork_asm+0x1b/0x30
  [ 1022.726201][ T2556]  </TASK>

To fix this we add an skb_get() before passing the skb to be enqueued in the ingress queue. This bumps the skb->users refcnt so that consume_skb() and kfree_skb() will not immediately free the sk_buff. With this we can be sure the skb is still around when we do the dequeue. Then we just need to decrement the refcnt, or free the skb, in the backlog case, which we do by calling kfree_skb() in the ingress case as well as the sendmsg case.
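The effect of the extra reference can be illustrated with a small userspace model. This is only a sketch and not kernel code: struct toy_skb, toy_skb_get(), toy_skb_put() and backlog_touches_freed_skb() are hypothetical names invented for the illustration, the interleaving is replayed in a single thread, and a "freed" flag stands in for the actual free so the result can be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for sk_buff (hypothetical type, not the kernel structure). */
struct toy_skb {
    int users;   /* models skb->users */
    bool freed;  /* set instead of calling free(), so the model can inspect it */
};

/* Models skb_get(): take one more reference. */
static struct toy_skb *toy_skb_get(struct toy_skb *skb)
{
    skb->users++;
    return skb;
}

/* Models consume_skb()/kfree_skb(): drop a reference, "free" on the last drop. */
static void toy_skb_put(struct toy_skb *skb)
{
    assert(skb->users > 0);
    if (--skb->users == 0)
        skb->freed = true;
}

/*
 * Replays the problematic ordering in one thread:
 *   1. backlog peeks the skb and enqueues it for the reader
 *   2. the reader consumes it (recvmsg path)
 *   3. backlog dequeues the skb from its own queue
 * Returns true if step 3 would touch an skb that is already freed.
 */
static bool backlog_touches_freed_skb(bool take_extra_ref)
{
    struct toy_skb skb = { .users = 1, .freed = false };
    bool stale;

    if (take_extra_ref)
        toy_skb_get(&skb);   /* the fix: hold a reference across the enqueue */

    toy_skb_put(&skb);       /* reader: consume_skb() after recvmsg */

    stale = skb.freed;       /* backlog: skb_dequeue() would access the skb here */

    if (take_extra_ref)
        toy_skb_put(&skb);   /* backlog drops its own reference (kfree_skb) */

    return stale;
}
```

In this model, calling backlog_touches_freed_skb(false) reproduces the stale access, while backlog_touches_freed_skb(true) keeps the object alive until the backlog has finished with it and then releases it on the final put.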

Before the locking change referenced in the Fixes tag, the sock was locked here, so we could not race with the user and there was no issue.

Database specific

{
    "osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2023/53xxx/CVE-2023-53836.json",
    "cna_assigner": "Linux"
}

Affected packages

Git / git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Affected ranges

Type: GIT
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Events:
    Introduced: 799aa7f98d53e0f541fa6b4dc9aa47b4ff2178e3
    Fixed: 65ad600b9bde68d2d28709943ab00b51ca8f0a1d
    Fixed: 923877254f002ae87d441382bb1096d9e773d56d
    Fixed: e6b5e47adb9166e732cdf7e6e034946e3f89f36d
    Fixed: a454d84ee20baf7bd7be90721b9821f73c7d23d9

Database specific

source

"https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2023-53836.json"

Linux / Kernel

Package

Name: Kernel

Affected ranges

Type: ECOSYSTEM
Events:
    Introduced: 5.13.0
    Fixed: 5.15.189

Type: ECOSYSTEM
Events:
    Introduced: 5.16.0
    Fixed: 6.1.54

Type: ECOSYSTEM
Events:
    Introduced: 6.2.0
    Fixed: 6.5.4
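
As a rough illustration of how the ECOSYSTEM ranges above can be read, the following sketch treats each range as a half-open interval [introduced, fixed) over upstream version numbers. The struct and function names (kver, kver_range, kver_cmp, is_listed_as_affected) are hypothetical, and the check is an assumption-laden simplification: distribution kernels may carry backported fixes that a plain version comparison cannot see.

```c
#include <stdbool.h>
#include <stddef.h>

/* Upstream kernel version triple (hypothetical helper type). */
struct kver { int major, minor, patch; };

/* One "Introduced"/"Fixed" pair, read as [introduced, fixed). */
struct kver_range { struct kver introduced, fixed; };

/* The three ECOSYSTEM ranges listed for this CVE. */
static const struct kver_range affected[] = {
    { { 5, 13, 0 }, { 5, 15, 189 } },
    { { 5, 16, 0 }, { 6, 1, 54 } },
    { { 6, 2, 0 },  { 6, 5, 4 } },
};

/* Three-way comparison: negative, zero or positive like strcmp(). */
static int kver_cmp(struct kver a, struct kver b)
{
    if (a.major != b.major)
        return a.major < b.major ? -1 : 1;
    if (a.minor != b.minor)
        return a.minor < b.minor ? -1 : 1;
    if (a.patch != b.patch)
        return a.patch < b.patch ? -1 : 1;
    return 0;
}

/* True if v falls inside any listed range (fixed version itself excluded). */
static bool is_listed_as_affected(struct kver v)
{
    for (size_t i = 0; i < sizeof(affected) / sizeof(affected[0]); i++) {
        if (kver_cmp(v, affected[i].introduced) >= 0 &&
            kver_cmp(v, affected[i].fixed) < 0)
            return true;
    }
    return false;
}
```

Under this reading, a version such as 6.1.53 is reported as affected, while 6.1.54 and anything before 5.13.0 are not.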

Database specific

source

"https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2023-53836.json"