In the Linux kernel, the following vulnerability has been resolved:
igb: Do not bring the device up after non-fatal error
Commit 004d25060c78 ("igb: Fix igb_down hung on surprise removal") changed igb_io_error_detected() to ignore non-fatal PCIe errors in order to avoid a hung task that can happen when igb_down() is called multiple times. This caused an issue when processing transient non-fatal errors. igb_io_resume(), which is called after igb_io_error_detected(), assumes that the device was brought down by igb_io_error_detected() if the interface is up. After a non-fatal error that assumption no longer holds, so igb_io_resume() called igb_up() on a device that was already up. This resulted in a panic with the stack trace below.
[ T3256] igb 0000:09:00.0 haeth0: igb: haeth0 NIC Link is Down
[ T292] pcieport 0000:00:1c.5: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0
[ T292] igb 0000:09:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ T292] igb 0000:09:00.0: device [8086:1537] error status/mask=00004000/00000000
[ T292] igb 0000:09:00.0: [14] CmpltTO
[ 200.105524,009][ T292] igb 0000:09:00.0: AER: TLP Header: 00000000 00000000 00000000 00000000
[ T292] pcieport 0000:00:1c.5: AER: broadcast error_detected message
[ T292] igb 0000:09:00.0: Non-correctable non-fatal error reported.
[ T292] pcieport 0000:00:1c.5: AER: broadcast mmio_enabled message
[ T292] pcieport 0000:00:1c.5: AER: broadcast resume message
[ T292] ------------[ cut here ]------------
[ T292] kernel BUG at net/core/dev.c:6539!
[ T292] invalid opcode: 0000 [#1] PREEMPT SMP
[ T292] RIP: 0010:napi_enable+0x37/0x40
[ T292] Call Trace:
[ T292] <TASK>
[ T292] ? die+0x33/0x90
[ T292] ? do_trap+0xdc/0x110
[ T292] ? napi_enable+0x37/0x40
[ T292] ? do_error_trap+0x70/0xb0
[ T292] ? napi_enable+0x37/0x40
[ T292] ? napi_enable+0x37/0x40
[ T292] ? exc_invalid_op+0x4e/0x70
[ T292] ? napi_enable+0x37/0x40
[ T292] ? asm_exc_invalid_op+0x16/0x20
[ T292] ? napi_enable+0x37/0x40
[ T292] igb_up+0x41/0x150
[ T292] igb_io_resume+0x25/0x70
[ T292] report_resume+0x54/0x70
[ T292] ? report_frozen_detected+0x20/0x20
[ T292] pci_walk_bus+0x6c/0x90
[ T292] ? aer_print_port_info+0xa0/0xa0
[ T292] pcie_do_recovery+0x22f/0x380
[ T292] aer_process_err_devices+0x110/0x160
[ T292] aer_isr+0x1c1/0x1e0
[ T292] ? disable_irq_nosync+0x10/0x10
[ T292] irq_thread_fn+0x1a/0x60
[ T292] irq_thread+0xe3/0x1a0
[ T292] ? irq_set_affinity_notifier+0x120/0x120
[ T292] ? irq_affinity_notify+0x100/0x100
[ T292] kthread+0xe2/0x110
[ T292] ? kthread_complete_and_exit+0x20/0x20
[ T292] ret_from_fork+0x2d/0x50
[ T292] ? kthread_complete_and_exit+0x20/0x20
[ T292] ret_from_fork_asm+0x11/0x20
[ T292] </TASK>
To fix this issue, igb_io_resume() now checks whether the interface is running and the device is not down. If both hold, igb_io_error_detected() did not bring the device down, so there is no need to bring it up again.
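
The guard described above can be illustrated with a small userspace model. This is only a sketch, not the driver code: struct model_nic and every model_* function are hypothetical names invented for the illustration. The double_enable counter stands in for the BUG that napi_enable() hit in the trace, and the guarded flag switches the resume check on or off so that both behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel symbols. */
struct model_nic {
    bool running;       /* stands in for "the interface is running" */
    bool down;          /* stands in for the driver's "device is down" state */
    bool napi_enabled;  /* NAPI polling currently enabled */
    int double_enable;  /* enables attempted while NAPI was already enabled */
};

/* Bring the device up; enabling NAPI twice is what triggered the BUG. */
static void model_up(struct model_nic *nic)
{
    assert(nic != NULL);
    if (nic->napi_enabled)
        nic->double_enable++;
    nic->napi_enabled = true;
    nic->down = false;
}

/* Bring the device down. */
static void model_down(struct model_nic *nic)
{
    assert(nic != NULL);
    nic->napi_enabled = false;
    nic->down = true;
}

/* Error callback: non-fatal errors are ignored, fatal ones take the device down. */
static void model_error_detected(struct model_nic *nic, bool fatal)
{
    assert(nic != NULL);
    if (fatal && nic->running)
        model_down(nic);
}

/* Resume callback; with guarded == true it applies the check described above. */
static void model_resume(struct model_nic *nic, bool guarded)
{
    assert(nic != NULL);
    if (!nic->running)
        return;
    if (guarded && !nic->down)
        return; /* device was never brought down, nothing to bring up */
    model_up(nic);
}

/* Run one error/resume cycle on an interface that is up; return the double-enable count. */
static int model_cycle(bool fatal, bool guarded)
{
    struct model_nic nic = {
        .running = true, .down = false, .napi_enabled = true, .double_enable = 0,
    };

    model_error_detected(&nic, fatal);
    model_resume(&nic, guarded);
    return nic.double_enable;
}
```

In this model, model_cycle(false, false) reports one double enable, which corresponds to the panic path, while model_cycle(false, true) reports none. A fatal error still goes through the usual down/up sequence in either mode.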