In the Linux kernel, the following vulnerability has been resolved:
scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler
Inside scsi_eh_wakeup(), scsi_host_busy() is called and checked under the host lock every time to decide whether the error handler kthread needs to be woken up.
This can be too heavy during recovery, for example when:
- there are N hardware queues
- the queue depth is M for each hardware queue
- each scsi_host_busy() call iterates over (N * M) tags/requests
If recovery is triggered while all requests are in flight, the scsi_eh_wakeup() calls are strictly serialized. By the time scsi_eh_wakeup() is called for the last in-flight request, scsi_host_busy() has already run (N * M - 1) times, and requests have been iterated (N * M - 1) * (N * M) times.
If both N and M are large enough, a hard lockup can be triggered while acquiring the host lock. This was observed on mpi3mr (128 hw queues, queue depth 8169).
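To give a sense of scale, the iteration count above can be evaluated for the reported mpi3mr configuration. This is only a back-of-the-envelope sketch of the formula from the description; the function name is hypothetical and nothing here is kernel code.

```python
def tags_scanned_during_recovery(nr_hw_queues: int, queue_depth: int) -> int:
    """Evaluate (N * M - 1) * (N * M) from the description.

    Hypothetical helper for illustration only: it models the total number
    of tag/request visits made by repeated busy scans, not kernel behavior.
    """
    total_tags = nr_hw_queues * queue_depth  # N * M tags visited per scan
    scans_before_last = total_tags - 1       # scans already run before the last wakeup
    return scans_before_last * total_tags


if __name__ == "__main__":
    # Numbers reported for mpi3mr: 128 hw queues, queue depth 8169.
    print(tags_scanned_during_recovery(128, 8169))
```

For 128 queues at depth 8169 this comes to roughly 1.09e12 tag visits, all of them performed while holding the host lock before the fix.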
Fix the issue by calling scsi_host_busy() outside the host lock. We don't need the host lock to get the busy count because the host lock never covers it.
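The shape of the change can be illustrated with a small user-space toy model. This is a sketch under stated assumptions, not the kernel patch: `ToyHost`, `toy_host_busy()` and both wakeup functions are hypothetical names, a `threading.Lock` stands in for the host lock, and the wakeup condition (busy count equals failed count) is a simplification.

```python
import threading


class ToyHost:
    """Hypothetical stand-in for a SCSI host; not the kernel's data structure."""

    def __init__(self, nr_hw_queues: int, queue_depth: int) -> None:
        self.host_lock = threading.Lock()
        # One flag per tag; True means the request is in flight.
        self.in_flight = [[True] * queue_depth for _ in range(nr_hw_queues)]
        self.host_failed = 0
        self.eh_woken = False


def toy_host_busy(host: ToyHost) -> int:
    # Visits every tag of every hardware queue, so the cost is O(N * M).
    return sum(sum(queue) for queue in host.in_flight)


def eh_wakeup_scan_under_lock(host: ToyHost) -> None:
    # Before: the O(N * M) scan runs while the lock is held, so every
    # other waiter spins for the full duration of the scan.
    with host.host_lock:
        if toy_host_busy(host) == host.host_failed:
            host.eh_woken = True


def eh_wakeup_scan_outside_lock(host: ToyHost) -> None:
    # After: the scan runs first, without the lock; only the cheap
    # comparison and the wakeup flag update happen under the lock.
    busy = toy_host_busy(host)
    with host.host_lock:
        if busy == host.host_failed:
            host.eh_woken = True
```

In this model both variants reach the same wakeup decision; the difference is only how long the lock is held, which shrinks from O(N * M) to constant time in the second variant.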
[mkp: Drop unnecessary 'busy' variables pointed out by Bart]