In the Linux kernel, the following vulnerability has been resolved:
nvme: fix reconnection fail due to reserved tag allocation
We found an issue in a production environment while using NVMe over RDMA: adminq reconnection failed forever even though the remote target and the network were fine. After digging into it, we found it may be caused by an ABBA deadlock due to tag allocation. In my case, the tag was held by a keep-alive request waiting inside adminq. Because we quiesce adminq while resetting the controller, the request was marked as idle and will not be processed until the reset succeeds. As fabricq shares its tagset with adminq, reconnecting to the remote target requires a tag for the connect command, but the only reserved tag was held by the keep-alive command waiting inside adminq. As a result, we failed to reconnect adminq forever. To fix this issue, I think we should keep two reserved tags for the admin queue.
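The dependency described above can be illustrated with a small toy model. This is only a sketch, not kernel code: the names `ReservedTagPool` and `reconnect_succeeds` are hypothetical. It also assumes a non-blocking allocation that returns `None` when no reserved tag is free, whereas the real allocation path may wait instead. The model shows that with one reserved tag the connect command can never obtain a tag while the keep-alive request is parked on the quiesced admin queue, and that a second reserved tag breaks the cycle.

```python
class ReservedTagPool:
    """Toy model of the reserved tags in a shared tag set (illustrative only)."""

    def __init__(self, reserved_tags):
        # Each reserved tag is represented by a small integer.
        self.free = list(range(reserved_tags))

    def try_alloc(self):
        # Assumption: non-blocking allocation; None means no reserved tag is free.
        return self.free.pop() if self.free else None

    def release(self, tag):
        self.free.append(tag)


def reconnect_succeeds(reserved_tags):
    """Replay the reset/reconnect sequence with the given number of reserved tags."""
    pool = ReservedTagPool(reserved_tags)

    # 1. A keep-alive request takes a reserved tag, then sits on the quiesced
    #    admin queue. It cannot complete until the reset succeeds.
    keep_alive_tag = pool.try_alloc()

    # 2. Reconnecting needs a reserved tag for the connect command, taken from
    #    the same tag set because the two queues share it.
    connect_tag = pool.try_alloc()
    if connect_tag is None:
        # The only reserved tag is pinned by the stuck keep-alive: the reset
        # waits for connect, and connect waits for the keep-alive's tag.
        return False

    # 3. Connect completes, the admin queue is unquiesced, and the keep-alive
    #    request can finally be processed and release its tag.
    pool.release(connect_tag)
    pool.release(keep_alive_tag)
    return True


if __name__ == "__main__":
    for n in (1, 2):
        print(f"reserved_tags={n}: reconnect succeeds = {reconnect_succeeds(n)}")
```

In this model, `reconnect_succeeds(1)` reproduces the stuck reconnect and `reconnect_succeeds(2)` corresponds to the proposed fix of keeping two reserved tags for the admin queue.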