In the Linux kernel, the following vulnerability has been resolved:
ice: Fix race condition during interface enslave
Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating auxiliary device") changed how the auxiliary device is re-created, so that ice_plug_aux_dev() is now called from the ice_service_task() context. Unfortunately, this opens a race window that can result in a deadlock when an interface leaves a LAG and immediately enters a LAG again.
Reproducer:
#!/bin/sh
ip link add lag0 type bond mode 1 miimon 100
ip link set lag0
for n in $(seq 1 10); do
    echo "Cycle: $n"
    ip link set ens7f0 master lag0
    sleep 1
    ip link set ens7f0 nomaster
done
This results in:
[20976.208697] Workqueue: ice ice_service_task [ice]
[20976.213422] Call Trace:
[20976.215871]  __schedule+0x2d1/0x830
[20976.219364]  schedule+0x35/0xa0
[20976.222510]  schedule_preempt_disabled+0xa/0x10
[20976.227043]  __mutex_lock.isra.7+0x310/0x420
[20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
[20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
[20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
[20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
[20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
[20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
[20976.285691]  auxiliary_bus_probe+0x45/0x70
[20976.289790]  really_probe+0x1f2/0x480
[20976.298509]  driver_probe_device+0x49/0xc0
[20976.302609]  bus_for_each_drv+0x79/0xc0
[20976.306448]  __device_attach+0xdc/0x160
[20976.310286]  bus_probe_device+0x9d/0xb0
[20976.314128]  device_add+0x43c/0x890
[20976.321287]  __auxiliary_device_add+0x43/0x60
[20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
[20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
[20976.342591]  process_one_work+0x1a7/0x360
[20976.350536]  worker_thread+0x30/0x390
[20976.358128]  kthread+0x10a/0x120
[20976.365547]  ret_from_fork+0x1f/0x40
...
[20976.438030] task:ip state:D stack: 0 pid:213658 ppid:213627 flags:0x00004084
[20976.446469] Call Trace:
[20976.448921]  __schedule+0x2d1/0x830
[20976.452414]  schedule+0x35/0xa0
[20976.455559]  schedule_preempt_disabled+0xa/0x10
[20976.460090]  __mutex_lock.isra.7+0x310/0x420
[20976.464364]  device_del+0x36/0x3c0
[20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
[20976.477288]  notifier_call_chain+0x47/0x70
[20976.481386]  __netdev_upper_dev_link+0x18b/0x280
[20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
[20976.494475]  do_setlink+0x336/0xf50
[20976.502517]  __rtnl_newlink+0x529/0x8b0
[20976.543441]  rtnl_newlink+0x43/0x60
[20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
[20976.559238]  netlink_rcv_skb+0x4c/0x120
[20976.563079]  netlink_unicast+0x196/0x230
[20976.567005]  netlink_sendmsg+0x204/0x3d0
[20976.570930]  sock_sendmsg+0x4c/0x50
[20976.574423]  ____sys_sendmsg+0x1eb/0x250
[20976.586807]  ___sys_sendmsg+0x7c/0xc0
[20976.606353]  __sys_sendmsg+0x57/0xa0
[20976.609930]  do_syscall_64+0x5b/0x1a0
[20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca
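One way to read the two traces is as a lock-order inversion: the service task appears to hold the auxiliary device's lock (taken on the device_add() probe path) while waiting for RTNL in enum_all_gids_of_dev_cb(), and the ip process holds RTNL (taken by the rtnetlink handler) while waiting for the device lock in device_del(). The following is a minimal user-space sketch of that reading, not kernel code. The names rtnl, aux_dev and abba_both_blocked are hypothetical, and the identity of the two locks is an interpretation of the traces rather than something the text states. It runs in a single thread and uses pthread_mutex_trylock() so that it reports the stuck state instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two kernel locks inferred from the traces. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;    /* models RTNL */
static pthread_mutex_t aux_dev = PTHREAD_MUTEX_INITIALIZER; /* models the aux device lock */

/* Returns 1 if each side is waiting on the lock the other side holds. */
int abba_both_blocked(void)
{
    /* "Service task": the probe path runs with the device lock held. */
    pthread_mutex_lock(&aux_dev);
    /* "ip": the rtnetlink handler runs with RTNL held. */
    pthread_mutex_lock(&rtnl);

    /* The service task now wants RTNL; ip now wants the device lock.
     * trylock reports EBUSY where the kernel paths would sleep forever. */
    int task_blocked = pthread_mutex_trylock(&rtnl) == EBUSY;
    int ip_blocked = pthread_mutex_trylock(&aux_dev) == EBUSY;

    pthread_mutex_unlock(&rtnl);
    pthread_mutex_unlock(&aux_dev);
    return task_blocked && ip_blocked;
}
```

In this model neither side can make progress until the other releases its lock, which matches the two tasks stuck in __mutex_lock in the traces above.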
The patch fixes this issue with the following changes:
- The ICE_FLAG_PLUG_AUX_DEV bit is kept set during the ice_plug_aux_dev() call in ice_service_task().
- The bit is checked in ice_clear_rdma_cap(), and ice_unplug_aux_dev() is called only if it is not set. If it is set (in other words, plugging of the aux device was requested and ice_plug_aux_dev() is potentially running), then the function only clears the ---truncated---
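The flag handshake described above can be sketched as a small user-space model. This is not the driver code: struct pf_model, test_and_clear(), plug(), unplug() and the during_plug hook are hypothetical names introduced for illustration. The description is cut off before it says what happens when the bit is found set, so the model assumes that only the request bit is cleared and that the service task performs the deferred unplug once its plug call returns. That behavior is an assumption and is marked as such in the comments.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { PLUG_AUX_DEV = 1u << 0 }; /* models ICE_FLAG_PLUG_AUX_DEV */

struct pf_model {
    atomic_uint flags;
    int plugged;      /* 1 while the modeled aux device exists */
    int unplug_calls; /* how often unplug() ran */
};

/* Atomically clear a bit and report whether it was set beforehand. */
static bool test_and_clear(struct pf_model *pf, unsigned bit)
{
    return atomic_fetch_and(&pf->flags, ~bit) & bit;
}

static void plug(struct pf_model *pf) { pf->plugged = 1; }
static void unplug(struct pf_model *pf) { pf->plugged = 0; pf->unplug_calls++; }

/* Models ice_clear_rdma_cap(): unplug directly only if no plug is pending. */
void clear_rdma_cap(struct pf_model *pf)
{
    if (!test_and_clear(pf, PLUG_AUX_DEV))
        unplug(pf);
    /* Otherwise (ASSUMPTION): only the request bit was cleared here and
     * the service task is left to unplug after its plug call returns. */
}

/* Models the plug step of ice_service_task(). The during_plug hook is a
 * test aid that simulates a LAG event arriving while the plug is running. */
void service_task_step(struct pf_model *pf,
                       void (*during_plug)(struct pf_model *))
{
    if (!(atomic_load(&pf->flags) & PLUG_AUX_DEV))
        return;
    plug(pf); /* the bit stays set for the whole plug call */
    if (during_plug)
        during_plug(pf);
    /* ASSUMPTION: a bit that vanished meanwhile means unplug was requested. */
    if (!test_and_clear(pf, PLUG_AUX_DEV))
        unplug(pf);
}
```

In this model the unplug never runs concurrently with the plug: it happens either directly in clear_rdma_cap() when no plug is pending, or in the service task after plug() has returned.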