In the Linux kernel, the following vulnerability has been resolved:
nilfs2: fix deadlock in nilfs_count_free_blocks()
A semaphore deadlock can occur if nilfs_get_block() detects metadata corruption while locating data blocks and a superblock writeback occurs at the same time:
task 1                               task 2
------                               ------
* A file operation *
nilfs_truncate()
  nilfs_get_block()
    down_read(rwsem A) <--
    nilfs_bmap_lookup_contig()
      ...                            generic_shutdown_super()
                                       nilfs_put_super()
                                         * Prepare to write superblock *
                                         down_write(rwsem B) <--
                                         nilfs_cleanup_super()
      * Detect b-tree corruption *         nilfs_set_log_cursor()
      nilfs_bmap_convert_error()             nilfs_count_free_blocks()
        __nilfs_error()                        down_read(rwsem A) <--
          nilfs_set_error()
            down_write(rwsem B) <--

                           *** DEADLOCK ***
Here, nilfs_get_block() readlocks rwsem A (= NILFS_MDT(dat_inode)->mi_sem) and then calls nilfs_bmap_lookup_contig(), but if that call fails due to metadata corruption, __nilfs_error() is called from nilfs_bmap_convert_error() inside the lock section.
Since __nilfs_error() calls nilfs_set_error() unless the filesystem is read-only, and nilfs_set_error() attempts to writelock rwsem B (= nilfs->ns_sem) to write back the superblock exclusively, hierarchical lock acquisition occurs in the order rwsem A -> rwsem B.
Now, if another task starts updating the superblock, it may writelock rwsem B during the lock sequence above, and can deadlock trying to readlock rwsem A in nilfs_count_free_blocks().
However, there is actually no need to take rwsem A in nilfs_count_free_blocks() because, within the lock section, it only reads a single integer from a shared struct with nilfs_sufile_get_ncleansegs(). This has been the case since commit aa474a220180 ("nilfs2: add local variable to cache the number of clean segments"), that is, even before this bug was introduced.
So, this resolves the deadlock problem by just not taking the semaphore in nilfs_count_free_blocks().