In the Linux kernel, the following vulnerability has been resolved:
RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
The FSM can run in a circle allowing rdma_resolve_ip() to be called twice on the same id_priv. While this cannot happen without going through the work, it violates the invariant that the same address resolution background request cannot be active twice.
CPU 1                                  CPU 2

rdma_resolve_addr():
  RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)  #1

                         process_one_req(): for #1
                          addr_handler():
                            RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
                            mutex_unlock(&id_priv->handler_mutex);
                            [.. handler still running ..]

rdma_resolve_addr():
  RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)
    !! two requests are now on the req_list

rdma_destroy_id():
 destroy_id_handler_unlock():
  _destroy_id():
   cma_cancel_operation():
    rdma_addr_cancel()

                          // process_one_req() self removes it
                          spin_lock_bh(&lock);
                           cancel_delayed_work(&req->work);
                           if (!list_empty(&req->list)) == true

      ! rdma_addr_cancel() returns after process_one_req #1 is done

   kfree(id_priv)

                         process_one_req(): for #2
                          addr_handler():
                            mutex_lock(&id_priv->handler_mutex);
                            !! Use after free on id_priv
rdma_addr_cancel() expects there to be one req on the list and only cancels the first one. The self-removal behavior of the work only happens after the handler has returned. This yields a situation where the req_list can have two reqs for the same "handle" but rdma_addr_cancel() only cancels the first one.
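The effect of cancelling only the first matching request can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: the names model_req_list, model_queue, model_cancel_first and model_pending are hypothetical, and the array stands in for the kernel's list, lock and delayed work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_REQS 4

struct model_req_list {
    const void *handle[MODEL_MAX_REQS]; /* owner of each pending request */
    size_t count;                       /* number of pending requests */
};

/* Queue one request for 'handle'. Returns 0 on success, -1 if full. */
static int model_queue(struct model_req_list *l, const void *handle)
{
    if (l->count == MODEL_MAX_REQS)
        return -1;
    l->handle[l->count++] = handle;
    return 0;
}

/*
 * Cancel only the FIRST request matching 'handle', mirroring the
 * one-request-per-handle assumption described in the text.
 * Returns 1 if a request was cancelled, 0 if none matched.
 */
static int model_cancel_first(struct model_req_list *l, const void *handle)
{
    for (size_t i = 0; i < l->count; i++) {
        if (l->handle[i] == handle) {
            /* Close the gap left by the removed entry. */
            for (size_t j = i + 1; j < l->count; j++)
                l->handle[j - 1] = l->handle[j];
            l->count--;
            return 1;
        }
    }
    return 0;
}

/* Count the requests still pending for 'handle'. */
static size_t model_pending(const struct model_req_list *l, const void *handle)
{
    size_t n = 0;

    for (size_t i = 0; i < l->count; i++)
        if (l->handle[i] == handle)
            n++;
    return n;
}
```

In this model, queuing two requests for the same handle and then calling model_cancel_first() once leaves model_pending() at 1. That leftover entry corresponds to request #2 in the timeline, which later runs against freed memory.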
The second req remains active beyond rdma_destroy_id() and will use-after-free id_priv once it inevitably triggers.
Fix this by remembering whether the id_priv has already called rdma_resolve_ip() and always cancelling before calling it again. This ensures the req_list never gets more than one item in it and doesn't cost anything in the normal flow that never uses this strange error path.
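The cancel-before-reissue pattern described above can be sketched in userspace as follows. This is an illustrative model only: sketch_id, its fields, and the sketch_* functions are hypothetical names, and a simple counter stands in for the real request list. The model also leaves out the real cancel's wait for a handler that is still running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: names are illustrative, not kernel fields. */
struct sketch_id {
    bool used_resolve; /* set once a request has ever been issued */
    int pending;       /* requests currently queued for this id */
};

/* Stand-in for cancelling the single outstanding request, if any. */
static void sketch_cancel(struct sketch_id *id)
{
    if (id->pending > 0)
        id->pending--;
}

/* Stand-in for queuing one background resolution request. */
static void sketch_issue(struct sketch_id *id)
{
    id->pending++;
}

/*
 * Cancel-before-reissue: the first call only records that a request
 * has been issued, so the common path pays no cancel cost. Every
 * later call cancels first, so at most one request is ever queued.
 */
static void sketch_resolve(struct sketch_id *id)
{
    if (id->used_resolve)
        sketch_cancel(id);
    else
        id->used_resolve = true;
    sketch_issue(id);
}
```

Calling sketch_resolve() twice on the same sketch_id leaves pending at 1 rather than 2, so a single cancel at destroy time is enough to drain it.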