In the Linux kernel, the following vulnerability has been resolved:
net: vlan: don't propagate flags on open
With the device instance lock, there is now a possibility of a deadlock:
[    1.211455] ============================================
[    1.211571] WARNING: possible recursive locking detected
[    1.211687] 6.14.0-rc5-01215-g032756b4ca7a-dirty #5 Not tainted
[    1.211823] --------------------------------------------
[    1.211936] ip/184 is trying to acquire lock:
[    1.212032] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_set_allmulti+0x4e/0xb0
[    1.212207]
[    1.212207] but task is already holding lock:
[    1.212332] ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[    1.212487]
[    1.212487] other info that might help us debug this:
[    1.212626]  Possible unsafe locking scenario:
[    1.212626]
[    1.212751]        CPU0
[    1.212815]        ----
[    1.212871]   lock(&dev->lock);
[    1.212944]   lock(&dev->lock);
[    1.213016]
[    1.213016]  *** DEADLOCK ***
[    1.213016]
[    1.213143]  May be due to missing lock nesting notation
[    1.213143]
[    1.213294] 3 locks held by ip/184:
[    1.213371]  #0: ffffffff838b53e0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x1b/0xa0
[    1.213543]  #1: ffffffff84e5fc70 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock+0x37/0xa0
[    1.213727]  #2: ffff8881024a4c30 (&dev->lock){+.+.}-{4:4}, at: dev_open+0x50/0xb0
[    1.213895]
[    1.213895] stack backtrace:
[    1.213991] CPU: 0 UID: 0 PID: 184 Comm: ip Not tainted 6.14.0-rc5-01215-g032756b4ca7a-dirty #5
[    1.213993] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[    1.213994] Call Trace:
[    1.213995]  <TASK>
[    1.213996]  dump_stack_lvl+0x8e/0xd0
[    1.214000]  print_deadlock_bug+0x28b/0x2a0
[    1.214020]  lock_acquire+0xea/0x2a0
[    1.214027]  __mutex_lock+0xbf/0xd40
[    1.214038]  dev_set_allmulti+0x4e/0xb0 # real_dev->flags & IFF_ALLMULTI
[    1.214040]  vlan_dev_open+0xa5/0x170 # ndo_open on vlandev
[    1.214042]  __dev_open+0x145/0x270
[    1.214046]  __dev_change_flags+0xb0/0x1e0
[    1.214051]  netif_change_flags+0x22/0x60 # IFF_UP vlandev
[    1.214053]  dev_change_flags+0x61/0xb0 # for each device in group from dev->vlan_info
[    1.214055]  vlan_device_event+0x766/0x7c0 # on netdevsim0
[    1.214058]  notifier_call_chain+0x78/0x120
[    1.214062]  netif_open+0x6d/0x90
[    1.214064]  dev_open+0x5b/0xb0 # locks netdevsim0
[    1.214066]  bond_enslave+0x64c/0x1230
[    1.214075]  do_set_master+0x175/0x1e0 # on netdevsim0
[    1.214077]  do_setlink+0x516/0x13b0
[    1.214094]  rtnl_newlink+0xaba/0xb80
[    1.214132]  rtnetlink_rcv_msg+0x440/0x490
[    1.214144]  netlink_rcv_skb+0xeb/0x120
[    1.214150]  netlink_unicast+0x1f9/0x320
[    1.214153]  netlink_sendmsg+0x346/0x3f0
[    1.214157]  __sock_sendmsg+0x86/0xb0
[    1.214160]  ____sys_sendmsg+0x1c8/0x220
[    1.214164]  ___sys_sendmsg+0x28f/0x2d0
[    1.214179]  __x64_sys_sendmsg+0xef/0x140
[    1.214184]  do_syscall_64+0xec/0x1d0
[    1.214190]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[    1.214191] RIP: 0033:0x7f2d1b4a7e56
Device setup:
     netdevsim0 (down)
     ^        ^
  bond        netdevsim1.100@netdevsim1 allmulticast=on (down)
When we enslave the lower device (netdevsim0), which has a vlan on top of it, we propagate the vlan's allmulti/promisc flags during ndo_open. This causes the real_dev to be locked again while its instance lock is already held.
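The relocking pattern described above can be reproduced in miniature in userspace. The following is a minimal sketch, not kernel code: `struct fake_dev`, `fake_dev_init`, `fake_set_allmulti` and `fake_open_and_propagate` are hypothetical names, and a POSIX error-checking mutex stands in for the device instance lock so that the second acquisition returns `EDEADLK` instead of hanging.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a net device: an instance lock plus a counter. */
struct fake_dev {
    pthread_mutex_t lock;
    int allmulti;               /* number of outstanding allmulti requests */
};

/* Initialise the lock as an error-checking mutex: a relock by the current
 * owner then fails with EDEADLK rather than blocking forever. */
int fake_dev_init(struct fake_dev *dev)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!err)
        err = pthread_mutex_init(&dev->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    dev->allmulti = 0;
    return err;
}

/* A setter that takes the instance lock itself. */
int fake_set_allmulti(struct fake_dev *dev, int inc)
{
    int err = pthread_mutex_lock(&dev->lock);

    if (err)
        return err;             /* EDEADLK when the caller already owns it */
    dev->allmulti += inc;
    pthread_mutex_unlock(&dev->lock);
    return 0;
}

/* Models the problematic path: the open path holds the lock of the real
 * device, and flag propagation then tries to take the same lock again. */
int fake_open_and_propagate(struct fake_dev *real_dev)
{
    int err = pthread_mutex_lock(&real_dev->lock);

    if (err)
        return err;
    err = fake_set_allmulti(real_dev, 1);   /* second acquisition */
    pthread_mutex_unlock(&real_dev->lock);
    return err;
}
```

Calling `fake_open_and_propagate()` on an initialised device returns `EDEADLK` and leaves the counter untouched, whereas calling `fake_set_allmulti()` on its own succeeds. With a default (non-error-checking) mutex, the same call sequence would typically block forever, which is the situation lockdep warns about in the trace.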
Propagate allmulti/promisc on flags change, not on open. There is a slight change in semantics: vlans that are down now propagate the flags as well, but this seems unlikely to cause real issues.
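As a rough illustration of that semantics change, here is a simplified userspace model, not the kernel patch itself. All names (`fake_real_dev`, `fake_vlan`, `fake_vlan_change_rx_flags`, the `FAKE_IFF_*` constants) are hypothetical. The point is only that propagation happens when the flags change, whether or not the vlan is up, and that open/stop no longer touch the real device's counters.

```c
#include <stdbool.h>

#define FAKE_IFF_ALLMULTI 0x1u
#define FAKE_IFF_PROMISC  0x2u

/* Hypothetical lower device: reference counts for the two rx modes. */
struct fake_real_dev {
    int allmulti;
    int promiscuity;
};

/* Hypothetical vlan device stacked on top of a lower device. */
struct fake_vlan {
    struct fake_real_dev *real_dev;
    unsigned int flags;
    bool up;
};

/* Propagate on flags change: adjust the lower device's counters by the
 * delta between old and new flags, regardless of the vlan's up state. */
void fake_vlan_change_rx_flags(struct fake_vlan *vlan, unsigned int new_flags)
{
    unsigned int changed = vlan->flags ^ new_flags;

    if (changed & FAKE_IFF_ALLMULTI)
        vlan->real_dev->allmulti +=
            (new_flags & FAKE_IFF_ALLMULTI) ? 1 : -1;
    if (changed & FAKE_IFF_PROMISC)
        vlan->real_dev->promiscuity +=
            (new_flags & FAKE_IFF_PROMISC) ? 1 : -1;
    vlan->flags = new_flags;
}

/* Open and stop only change the vlan's own state in this model; they do
 * not call back into the lower device, so nothing is relocked there. */
void fake_vlan_open(struct fake_vlan *vlan)
{
    vlan->up = true;
}

void fake_vlan_stop(struct fake_vlan *vlan)
{
    vlan->up = false;
}
```

In this model, setting `FAKE_IFF_ALLMULTI` on a vlan that is down immediately raises the lower device's `allmulti` count, and a later open or stop leaves it unchanged. That mirrors the trade-off stated in the text: a down vlan now influences the lower device's flags, in exchange for keeping the open path free of calls that take the lower device's lock.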
Reproducer:
echo 0 1 > /sys/bus/netdevsim/new_device
devpath=$(ls -d /sys/bus/netdevsim/devices/netdevsim0/net/*)
dev=$(echo $devpath | rev | cut -d/ -f1 | rev)
ip link set dev $dev name netdevsim0
ip link set dev netdevsim0 up
ip link add link netdevsim0 name netdevsim0.100 type vlan id 100
ip link set dev netdevsim0.100 allm ---truncated---