In the Linux kernel, the following vulnerability has been resolved:
vfs: fix race between evict_inodes() and find_inode() & iput()
Hi, all
Recently I noticed a bug[1] in btrfs. After digging into it, I believe it is a race in the VFS.
Let's assume there is an inode (e.g. ino 261) with i_count 1 on which iput() is called, while a concurrent thread is calling generic_shutdown_super().
cpu0:                                cpu1:
iput() // i_count is 1
  ->spin_lock(inode)
  ->dec i_count to 0
  ->iput_final()                     generic_shutdown_super()
    ->__inode_add_lru()                ->evict_inodes()
      // for some reason[2]              ->if (atomic_read(&inode->i_count)) continue;
      // return before                   // inode 261 passed the above check
      // list_lru_add_obj()              // and then schedule out
  ->spin_unlock()
// note here: the inode 261
// was still on the sb list and hash list,
// and I_FREEING|I_WILL_FREE had not been set

btrfs_iget()
  // after some function calls
  ->find_inode()
    // found the above inode 261
    ->spin_lock(inode)
    // check I_FREEING|I_WILL_FREE
    // and passed
    ->__iget()
    ->spin_unlock(inode)                 // schedule back
                                         ->spin_lock(inode)
                                         // check (I_NEW|I_FREEING|I_WILL_FREE) flags,
                                         // passed and set I_FREEING
iput()                                   ->spin_unlock(inode)
  ->spin_lock(inode)                     ->evict()
  // dec i_count to 0
  ->iput_final()
    ->spin_unlock()
    ->evict()
Now we have two threads evicting the same inode simultaneously, which may trigger the BUG_ON(inode->i_state & I_CLEAR) statement in both clear_inode() and iput().
To fix the bug, recheck inode->i_count after taking i_lock. The first, lockless check is kept because in most scenarios it is already conclusive, so the overhead of spin_lock() is avoided for inodes that are still in use.
If there is any misunderstanding, please let me know, thanks.
return false when I reproduced the bug.