In the Linux kernel, the following vulnerability has been resolved:
net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
syzkaller was able to trigger a deadlock for NTF_MANAGED entries [0]:
  kworker/0:16/14617 is trying to acquire lock:
  ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: ___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652
  [...]
  but task is already holding lock:
  ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: neigh_managed_work+0x35/0x250 net/core/neighbour.c:1572
The neighbor entry had transitioned to the NUD_FAILED state, in which __neigh_event_send() triggered an immediate probe via neigh_probe(), as introduced by commit cd28ca0a3dd1 ("neigh: reduce arp latency"), while the table lock was still held.
One option to fix this situation is to defer the neigh_probe() back to the neigh_timer_handler(), as was done before cd28ca0a3dd1. For the case of NTF_MANAGED, this deferral is acceptable given that it only happens in an actual failure state; the regular, expected state is NUD_VALID with the entry already present.
The fix adds a parameter to __neigh_event_send() to communicate whether an immediate probe is allowed. Existing call sites of neigh_event_send() keep the default of an immediate probe, whereas neigh_managed_work() disables it by using neigh_event_send_probe().
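The deadlock and the deferral described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: every model_* name is hypothetical, the table lock is reduced to a boolean flag, and the probe path is assumed to need that lock again (as ___neigh_create does in the trace).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the neighbour table (hypothetical, not the kernel struct). */
struct model_table {
	bool lock_held;      /* models tbl->lock being write-held */
	int  self_deadlocks; /* times the lock was needed while already held */
};

/* Toy stand-in for a neighbour entry (hypothetical). */
struct model_neigh {
	struct model_table *tbl;
	bool failed;      /* models the NUD_FAILED state */
	bool timer_armed; /* models a pending timer that will probe later */
	int  probes_sent;
};

/* Models the probe path, which is assumed to need the table lock again. */
static void model_probe(struct model_neigh *n)
{
	if (n->tbl->lock_held)
		n->tbl->self_deadlocks++; /* a real non-recursive lock would hang here */
	n->probes_sent++;
}

/* Models the new parameter: probe now, or leave it to the timer. */
static void model_event_send_probe(struct model_neigh *n, bool immediate_ok)
{
	if (!n->failed)
		return;
	n->failed = false;
	if (immediate_ok)
		model_probe(n);
	else
		n->timer_armed = true;
}

/* Existing callers keep the immediate-probe default. */
static void model_event_send(struct model_neigh *n)
{
	model_event_send_probe(n, true);
}

/* Models the timer handler, which runs without the table lock held. */
static void model_timer_handler(struct model_neigh *n)
{
	if (n->timer_armed) {
		n->timer_armed = false;
		model_probe(n);
	}
}

/* Models the managed work item, which runs with the table lock held. */
static void model_managed_work(struct model_neigh *n, bool immediate_ok)
{
	n->tbl->lock_held = true;
	model_event_send_probe(n, immediate_ok);
	n->tbl->lock_held = false;
}

/* Runs one scenario and returns how many self-deadlocks were observed. */
int model_run(bool immediate_ok)
{
	struct model_table tbl = { false, 0 };
	struct model_neigh n = { &tbl, true, false, 0 };

	model_managed_work(&n, immediate_ok); /* entry is in the failed state */
	model_timer_handler(&n);              /* deferred probe, lock not held */

	n.failed = true;
	model_event_send(&n);                 /* ordinary caller, lock not held */

	assert(!tbl.lock_held);
	return tbl.self_deadlocks;
}
```

In this model, model_run(true) corresponds to the behavior before the fix, where the probe runs while the lock flag is set, and model_run(false) corresponds to the deferred variant, where the probe only runs from the timer path.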
[0] <TASK>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
  print_deadlock_bug kernel/locking/lockdep.c:2956 [inline]
  check_deadlock kernel/locking/lockdep.c:2999 [inline]
  validate_chain kernel/locking/lockdep.c:3788 [inline]
  __lock_acquire.cold+0x149/0x3ab kernel/locking/lockdep.c:5027
  lock_acquire kernel/locking/lockdep.c:5639 [inline]
  lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5604
  __raw_write_lock_bh include/linux/rwlock_api_smp.h:202 [inline]
  _raw_write_lock_bh+0x2f/0x40 kernel/locking/spinlock.c:334
  ___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652
  ip6_finish_output2+0x1070/0x14f0 net/ipv6/ip6_output.c:123
  __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
  __ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170
  ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201
  NF_HOOK_COND include/linux/netfilter.h:296 [inline]
  ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224
  dst_output include/net/dst.h:451 [inline]
  NF_HOOK include/linux/netfilter.h:307 [inline]
  ndisc_send_skb+0xa99/0x17f0 net/ipv6/ndisc.c:508
  ndisc_send_ns+0x3a9/0x840 net/ipv6/ndisc.c:650
  ndisc_solicit+0x2cd/0x4f0 net/ipv6/ndisc.c:742
  neigh_probe+0xc2/0x110 net/core/neighbour.c:1040
  __neigh_event_send+0x37d/0x1570 net/core/neighbour.c:1201
  neigh_event_send include/net/neighbour.h:470 [inline]
  neigh_managed_work+0x162/0x250 net/core/neighbour.c:1574
  process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
  worker_thread+0x657/0x1110 kernel/workqueue.c:2454
  kthread+0x2e9/0x3a0 kernel/kthread.c:377
  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>