In the Linux kernel, the following vulnerability has been resolved:
efi/unaccepted: touch soft lockup during memory accept
Commit 50e782a86c98 ("efi/unaccepted: Fix soft lockups caused by parallel memory acceptance") releases the spinlock so that other CPUs can accept memory in parallel, which avoids triggering soft lockups on those CPUs.
However, the soft lockup still showed up intermittently when the TD guest has a large amount of memory and the soft lockup timeout is set to 1 second:
 RIP: 0010:_raw_spin_unlock_irqrestore
 Call Trace:
 ? __hrtimer_run_queues
 <IRQ>
 ? hrtimer_interrupt
 ? watchdog_timer_fn
 ? __sysvec_apic_timer_interrupt
 ? __pfx_watchdog_timer_fn
 ? sysvec_apic_timer_interrupt
 </IRQ>
 ? __hrtimer_run_queues
 <TASK>
 ? hrtimer_interrupt
 ? asm_sysvec_apic_timer_interrupt
 ? _raw_spin_unlock_irqrestore
 ? __sysvec_apic_timer_interrupt
 ? sysvec_apic_timer_interrupt
 accept_memory
 try_to_accept_memory
 do_huge_pmd_anonymous_page
 get_page_from_freelist
 __handle_mm_fault
 __alloc_pages
 __folio_alloc
 ? __tdx_hypercall
 handle_mm_fault
 vma_alloc_folio
 do_user_addr_fault
 do_huge_pmd_anonymous_page
 exc_page_fault
 ? __do_huge_pmd_anonymous_page
 asm_exc_page_fault
 __handle_mm_fault
When local IRQs are re-enabled at the end of accept_memory(), the soft lockup detector finds that the watchdog on that single CPU has not been fed for a while. In other words, even though other CPUs are no longer blocked by the spinlock, the current CPU may still be stuck with local IRQs disabled for a while, which hurts not only the NMI watchdog but also the soft lockup detector.
Chao Gao pointed out that memory acceptance can be time-consuming and that a similar issue had been reported before. To avoid soft lockup detection during this stage, set a flag that makes the soft lockup detector skip the timeout check at the end of accept_memory(), by invoking touch_softlockup_watchdog().
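The effect of the fix can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: struct watchdog_model and the model_* functions are hypothetical names invented for illustration, time is passed in as plain nanosecond counters, and the "IRQs disabled" section is represented simply by the watchdog check not running until the section ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a per-CPU soft lockup watchdog.
 * None of these names exist in the kernel. */
struct watchdog_model {
    uint64_t last_fed_ns;   /* last time the watchdog was fed */
    uint64_t threshold_ns;  /* soft lockup timeout */
    bool touched;           /* flag: skip the next timeout check */
};

/* Stands in for touch_softlockup_watchdog(): set the skip flag. */
void model_touch_watchdog(struct watchdog_model *wd)
{
    assert(wd != NULL);
    wd->touched = true;
}

/* Stands in for the timer-driven check. If the flag is set, the check
 * is skipped once and the timestamp is restarted from now_ns.
 * Returns true when a soft lockup would be reported. */
bool model_watchdog_check(struct watchdog_model *wd, uint64_t now_ns)
{
    assert(wd != NULL);
    if (wd->touched) {
        wd->touched = false;
        wd->last_fed_ns = now_ns;
        return false;
    }
    return now_ns - wd->last_fed_ns > wd->threshold_ns;
}

/* Stands in for accept_memory(): a long section during which the
 * watchdog cannot run. Returns the time at which the section ends,
 * i.e. when the next watchdog check can fire. */
uint64_t model_accept_memory(struct watchdog_model *wd, uint64_t start_ns,
                             uint64_t duration_ns, bool touch_at_end)
{
    uint64_t end_ns = start_ns + duration_ns;

    if (touch_at_end)
        model_touch_watchdog(wd);
    return end_ns;
}
```

In this model, a 3 second acceptance against a 1 second threshold makes model_watchdog_check() report a lockup when touch_at_end is false, and report nothing when touch_at_end is true, which mirrors the behavior the fix aims for.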