CVE-2025-68310

Source
https://cve.org/CVERecord?id=CVE-2025-68310
Import Source
https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2025-68310.json
JSON Data
https://api.osv.dev/v1/vulns/CVE-2025-68310
Published
2025-12-16T15:39:41.652Z
Modified
2026-03-13T04:00:58.357819Z
Summary
s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump
Details

In the Linux kernel, the following vulnerability has been resolved:

s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump

Do not block PCI config accesses through pci_cfg_access_lock() when executing the s390 variant of PCI error recovery: Acquire just device_lock() instead of pci_dev_lock(), as powerpc's EEH and generic PCI AER processing do.

During error recovery testing a pair of tasks was reported to be hung:

mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working
INFO: task kmcheck:72 blocked for more than 122 seconds.
      Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000
Call Trace:
 [<000000065256f030>] __schedule+0x2a0/0x590
 [<000000065256f356>] schedule+0x36/0xe0
 [<000000065256f572>] schedule_preempt_disabled+0x22/0x30
 [<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8
 [<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core]
 [<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core]
 [<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398
 [<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0
INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
      Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000
Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
 [<000000065256f030>] __schedule+0x2a0/0x590
 [<000000065256f356>] schedule+0x36/0xe0
 [<0000000652172e28>] pci_wait_cfg+0x80/0xe8
 [<0000000652172f94>] pci_cfg_access_lock+0x74/0x88
 [<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
 [<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core]
 [<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core]
 [<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168
 [<0000000652513212>] devlink_health_report+0x19a/0x230
 [<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core]

No kernel log of the exact same error with an upstream kernel is available - but the very same deadlock situation can be constructed there, too:

  • task kmcheck: mlx5_unload_one() tries to acquire the devlink lock while the PCI error recovery code has set pdev->block_cfg_access by way of pci_cfg_access_lock().
  • task kworker: mlx5_crdump_collect() tries to set block_cfg_access through pci_cfg_access_lock() while devlink_health_report() had acquired the devlink lock.
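
The two bullets above describe a lock-order inversion. The following Python sketch is only an illustrative model, not kernel code: it treats the devlink lock and the block_cfg_access flag as abstract resources, records what each task holds and what it is blocked on, and searches the resulting wait-for graph for a cycle. The function name find_cycle and the dictionary layout are hypothetical. The "after" state assumes the change described in this entry, where recovery takes only device_lock().

```python
def find_cycle(holds, wants):
    """Return the tasks forming a wait-for cycle, or [] if there is none.

    holds: dict mapping task -> set of resources the task currently owns
    wants: dict mapping task -> the one resource the task is blocked on
    """
    # Invert "holds" so each resource points at the task that owns it.
    owner = {res: task for task, owned in holds.items() for res in owned}
    for start in holds:
        path = [start]
        task = start
        while True:
            nxt = owner.get(wants.get(task))
            if nxt is None:
                break  # wanted resource is free, so this chain can progress
            if nxt == start:
                return path  # walked back to the starting task: deadlock
            if nxt in path:
                break  # a cycle not involving start; found from another start
            path.append(nxt)
            task = nxt
    return []


# Before the fix (model assumption): recovery uses pci_dev_lock(), which
# takes device_lock and also blocks config access.
holds_before = {
    "kmcheck": {"device_lock", "block_cfg_access"},
    "kworker": {"devlink_lock"},
}
# After the fix (model assumption): recovery takes only device_lock.
holds_after = {
    "kmcheck": {"device_lock"},
    "kworker": {"devlink_lock"},
}
wants = {"kmcheck": "devlink_lock", "kworker": "block_cfg_access"}

print(find_cycle(holds_before, wants))
print(find_cycle(holds_after, wants))
```

In this model the first call reports both tasks as part of a cycle, while the second finds none, because nothing owns block_cfg_access any more and the kworker is free to proceed.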

A similar deadlock situation can be reproduced by requesting a crdump with

devlink health dump show pci/<BDF> reporter fw_fatal

while PCI error recovery is executed on the same <BDF> physical function by mlx5_core's pci_error_handlers. On s390, such an error can be injected with

zpcictl --reset-fw <BDF>

Tests with this patch failed to reproduce that second deadlock situation. Instead, the devlink command is rejected with "kernel answers: Permission denied", and we get a kernel log message of:

mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5

because the config read of VSC_SEMAPHORE is rejected by the underlying hardware.
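
The "err -5" in that log line is a negated errno value. As a small aid for reading such messages, the sketch below decodes a kernel-style negative return code using Python's standard errno table. The helper name describe_kernel_err is hypothetical, and the message text comes from the platform's C library, so it may differ between systems. How the failed crdump is then reported to user space as "Permission denied" is not detailed in this entry.

```python
import errno
import os


def describe_kernel_err(code):
    """Map a negative kernel-style return code to (symbol, message)."""
    num = -code
    # errno.errorcode maps numbers to symbolic names such as "EIO".
    return errno.errorcode.get(num, "UNKNOWN"), os.strerror(num)


symbol, message = describe_kernel_err(-5)
print(symbol, "-", message)
```

On Linux, -5 resolves to EIO, which fits the statement that the hardware rejected the config read.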

Two prior attempts to address this issue have been discussed and ultimately rejected [see link], with the primary argument that s390's implementation of PCI error recovery is imposing restrictions that neither powerpc's EEH nor PCI AER handling need. Tests show that PCI error recovery on s390 is running to completion even without blocking access to PCI config space.

Database specific
{
    "cna_assigner": "Linux",
    "osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2025/68xxx/CVE-2025-68310.json"
}
References

Affected packages

Git / git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Affected ranges

Type
GIT
Repo
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Events
Introduced
4cdf2f4e24ff0d345fc36ef6d6aec059333a261e
Fixed
d0df2503bc3c2be385ca2fd96585daad1870c7c5
Fixed
b63c061be622b17b495cbf78a6d5f2d4c3147f8e
Fixed
3591d56ea9bfd3e7fbbe70f749bdeed689d415f9
Fixed
54f938d9f5693af8ed586a08db4af5d9da1f0f2d
Fixed
0fd20f65df6aa430454a0deed8f43efa91c54835

Database specific

source
"https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2025-68310.json"

Linux / Kernel

Package

Name
Kernel

Affected ranges

Type        Introduced   Fixed
ECOSYSTEM   5.16.0       6.1.159
ECOSYSTEM   6.2.0        6.6.117
ECOSYSTEM   6.7.0        6.12.58
ECOSYSTEM   6.13.0       6.17.8
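
As a rough aid for triage, the ECOSYSTEM ranges above can be checked mechanically. The sketch below is an assumption-laden helper, not an official OSV tool: the names AFFECTED_RANGES, parse_version and in_listed_range are hypothetical, and only the leading major.minor.patch digits of a version string are compared. Vendor kernels backport code, so a version outside these ranges is not proof of being unaffected (the hung-task log in this entry comes from a 5.14.0-based distribution kernel), and the issue concerns the s390 PCI error recovery path only.

```python
import re

# (introduced, fixed) pairs copied from the "Linux / Kernel" ranges above.
AFFECTED_RANGES = [
    ("5.16.0", "6.1.159"),
    ("6.2.0", "6.6.117"),
    ("6.7.0", "6.12.58"),
    ("6.13.0", "6.17.8"),
]


def parse_version(text):
    """Turn '6.6.116' or '6.6.116-generic' into (6, 6, 116).

    Missing minor or patch components are treated as 0.
    """
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"not a kernel version: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def in_listed_range(version):
    """True if version falls in any half-open range [introduced, fixed)."""
    v = parse_version(version)
    return any(
        parse_version(lo) <= v < parse_version(hi)
        for lo, hi in AFFECTED_RANGES
    )


for candidate in ("6.6.116", "6.6.117", "5.14.0-570.12.1.el9"):
    print(candidate, in_listed_range(candidate))
```

The ranges are treated as half-open, so the "Fixed" version itself counts as not affected.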

Database specific

source
"https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2025-68310.json"