In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
In the current code, when the PCI error state pci_channel_io_normal is detected, the driver reports PCI_ERS_RESULT_CAN_RECOVER to the PCI driver. The PCI driver then continues with the PCI resume callback report_resume via pci_walk_bridge, and that callback eventually calls amdgpu_pci_resume, where the write lock is released unconditionally even though it was never acquired on this path. As a result, a deadlock occurs when other threads later try to acquire the read lock.
To fix this, add a member to the amdgpu_device structure to cache the pci_channel_state, and only continue execution in amdgpu_pci_resume when the cached state is pci_channel_io_frozen.
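
The guard described above can be illustrated with a small userspace model. This is a sketch only, not the actual amdgpu patch: every `model_` and `MODEL_` name is hypothetical, the lock is reduced to a counter, and the assumption that the frozen path is the one that takes the write lock is inferred from the description rather than stated in it.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's channel states and recovery
 * results; the real definitions live in the kernel's PCI headers. */
enum model_channel_state {
    MODEL_IO_NORMAL = 1,
    MODEL_IO_FROZEN,
    MODEL_IO_PERM_FAILURE
};

enum model_ers_result {
    MODEL_CAN_RECOVER,
    MODEL_NEED_RESET,
    MODEL_DISCONNECT
};

/* Models the device structure with the newly added cache member. */
struct model_device {
    enum model_channel_state cached_state; /* models the cached pci_channel_state */
    int lock_balance;                      /* +1 per acquire, -1 per release */
};

/* Models the error-detected callback: remember the reported state so the
 * resume callback can tell which path was taken. */
enum model_ers_result model_error_detected(struct model_device *dev,
                                           enum model_channel_state state)
{
    dev->cached_state = state;

    switch (state) {
    case MODEL_IO_NORMAL:
        /* No lock is taken on this path. */
        return MODEL_CAN_RECOVER;
    case MODEL_IO_FROZEN:
        /* Assumption: this is the path that acquires the write lock. */
        dev->lock_balance++;
        return MODEL_NEED_RESET;
    default:
        return MODEL_DISCONNECT;
    }
}

/* Models the resume callback with the fix applied: release the lock only
 * if the cached state shows it was acquired. Returns true when the resume
 * work actually ran. */
bool model_resume(struct model_device *dev)
{
    if (dev->cached_state != MODEL_IO_FROZEN)
        return false; /* nothing was locked, so nothing to release */

    dev->lock_balance--;
    return true;
}
```

In this model, calling `model_error_detected` with `MODEL_IO_NORMAL` followed by `model_resume` leaves `lock_balance` at zero. Without the state check in `model_resume`, the same sequence would drive the balance negative, which corresponds to the unpaired release described above.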