In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Always stop health timer during driver removal
Currently, if teardown_hca fails during driver removal, mlx5 does not stop the health timer, yet it continues with the rest of the driver teardown. This may lead to a use-after-free (UAF) bug that results in a page fault Oops[1], because the health timer fires after the resources it accesses have been freed.
Hence, stop the health monitor even if teardown_hca fails.
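The ordering problem can be illustrated with a small user-space model. This is only a sketch: every name in it (fake_dev, fake_teardown_hca, close_buggy, close_fixed, and so on) is hypothetical and none of it is the real mlx5 code or the actual patch. It only shows why stopping the timer must not depend on the teardown command succeeding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device; not the real mlx5 structures. */
struct fake_dev {
    bool health_timer_armed; /* periodic health poll is still scheduled */
    bool resources_mapped;   /* memory the health poll reads from */
};

/* Stand-in for the firmware teardown command; `ok` simulates success or timeout. */
static int fake_teardown_hca(struct fake_dev *dev, bool ok)
{
    (void)dev;
    return ok ? 0 : -1;
}

static void fake_stop_health(struct fake_dev *dev)
{
    dev->health_timer_armed = false;
}

/* Buggy ordering: a teardown failure returns before the timer is stopped. */
static int close_buggy(struct fake_dev *dev, bool teardown_ok)
{
    int err = fake_teardown_hca(dev, teardown_ok);
    if (err)
        return err; /* health timer is left armed */
    fake_stop_health(dev);
    return 0;
}

/* Fixed ordering: the timer is stopped whether or not teardown succeeded. */
static int close_fixed(struct fake_dev *dev, bool teardown_ok)
{
    int err = fake_teardown_hca(dev, teardown_ok);
    fake_stop_health(dev);
    return err;
}

/* Later removal step that frees what the timer callback would touch. */
static void fake_release(struct fake_dev *dev)
{
    dev->resources_mapped = false;
}

/* True if an armed timer could still access freed resources. */
static bool use_after_free_possible(const struct fake_dev *dev)
{
    return dev->health_timer_armed && !dev->resources_mapped;
}

/* Walks both paths with a simulated teardown timeout. */
static void run_model_checks(void)
{
    struct fake_dev buggy = { true, true };
    struct fake_dev fixed = { true, true };

    (void)close_buggy(&buggy, false);
    fake_release(&buggy);
    assert(use_after_free_possible(&buggy));

    (void)close_fixed(&fixed, false);
    fake_release(&fixed);
    assert(!use_after_free_possible(&fixed));
}
```

In this model, the buggy path leaves the timer armed after the resources are released, which corresponds to the health poll reading freed memory in the trace below. The fixed path still reports the teardown error but leaves nothing scheduled.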
[1]
mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: cleanup
mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource
mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): teardown_hca failed, skip cleanup
BUG: unable to handle page fault for address: ffffa26487064230
PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1
Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020
RIP: 0010:ioread32be+0x34/0x60
RSP: 0018:ffffa26480003e58 EFLAGS: 00010292
RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0
RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230
RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8
R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0
R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0
FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 ? __die+0x23/0x70
 ? page_fault_oops+0x171/0x4e0
 ? exc_page_fault+0x175/0x180
 ? asm_exc_page_fault+0x26/0x30
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 ? ioread32be+0x34/0x60
 mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core]
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 poll_health+0x42/0x230 [mlx5_core]
 ? __next_timer_interrupt+0xbc/0x110
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 call_timer_fn+0x21/0x130
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 __run_timers+0x222/0x2c0
 run_timer_softirq+0x1d/0x40
 __do_softirq+0xc9/0x2c8
 __irq_exit_rcu+0xa6/0xc0
 sysvec_apic_timer_interrupt+0x72/0x90
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:cpuidle_enter_state+0xcc/0x440
 ? cpuidle_enter_state+0xbd/0x440
 cpuidle_enter+0x2d/0x40
 do_idle+0x20d/0x270
 cpu_startup_entry+0x2a/0x30
 rest_init+0xd0/0xd0
 arch_call_rest_init+0xe/0x30
 start_kernel+0x709/0xa90
 x86_64_start_reservations+0x18/0x30
 x86_64_start_kernel+0x96/0xa0
 secondary_startup_64_no_verify+0x18f/0x19b
---[ end trace 0000000000000000 ]---