In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix double free of anonymous device after snapshot creation failure
When creating a snapshot we may free an anonymous device number twice if committing the transaction fails. The second free may release an anonymous device number that was, in the meantime, allocated by some other subsystem in the kernel or by another btrfs filesystem.
The steps that lead to this:
1) At ioctl.c:create_snapshot() we allocate an anonymous device number and assign it to pending_snapshot->anon_dev;
2) Then we call btrfs_commit_transaction() and end up at transaction.c:create_pending_snapshot();
3) There we call btrfs_get_new_fs_root() and pass it the anonymous device number stored in pending_snapshot->anon_dev;
4) btrfs_get_new_fs_root() frees that anonymous device number because btrfs_lookup_fs_root() returned a root - someone else already did a lookup of the new root, which could be some task doing backref walking;
5) After that some error happens in the transaction commit path, and at ioctl.c:create_snapshot() we jump to the 'fail' label, where we free the same anonymous device number again. Because pending_snapshot->anon_dev still has the same value as in step 1, and that number may have been reallocated somewhere else in the meantime, this second free can release a number that now belongs to another user.
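The steps above can be modelled in user space with a toy ID allocator. This is only a sketch under stated assumptions, not the kernel code: toy_alloc(), toy_free(), get_root_by_value() and get_root_by_pointer() are hypothetical stand-ins for the anonymous device allocator and the root lookup helper. The second helper shows one common way to avoid this kind of double free, namely passing the number by pointer so the callee can reset it to 0 once it has freed it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the anonymous device number allocator (hypothetical). */
#define MAX_IDS 8
static bool id_in_use[MAX_IDS];

static void toy_reset(void)
{
    for (size_t i = 0; i < MAX_IDS; i++)
        id_in_use[i] = false;
}

/* Return the lowest free ID; 0 is reserved to mean "no ID". */
static unsigned toy_alloc(void)
{
    for (unsigned id = 1; id < MAX_IDS; id++) {
        if (!id_in_use[id]) {
            id_in_use[id] = true;
            return id;
        }
    }
    return 0;
}

/* Free an ID; freeing 0 is a no-op. */
static void toy_free(unsigned id)
{
    if (id != 0 && id < MAX_IDS)
        id_in_use[id] = false;
}

/* Buggy shape: the helper frees the ID when the root already exists,
 * but the caller's copy keeps the stale value. */
static void get_root_by_value(unsigned anon_dev, bool root_exists)
{
    if (root_exists)
        toy_free(anon_dev);
}

/* Safer shape: the helper frees the ID and resets the caller's copy to 0. */
static void get_root_by_pointer(unsigned *anon_dev, bool root_exists)
{
    if (root_exists) {
        toy_free(*anon_dev);
        *anon_dev = 0;
    }
}

/* Returns true if the unrelated owner's ID is still allocated at the end. */
bool run_buggy_flow(void)
{
    toy_reset();
    unsigned anon_dev = toy_alloc();       /* step 1: allocate */
    get_root_by_value(anon_dev, true);     /* steps 3-4: helper frees it */
    unsigned other = toy_alloc();          /* another user reuses the number */
    assert(other == anon_dev);             /* same number, different owner */
    toy_free(anon_dev);                    /* step 5: error path frees again */
    return id_in_use[other];
}

bool run_fixed_flow(void)
{
    toy_reset();
    unsigned anon_dev = toy_alloc();
    get_root_by_pointer(&anon_dev, true);  /* helper frees and zeroes it */
    unsigned other = toy_alloc();
    toy_free(anon_dev);                    /* anon_dev is 0: nothing happens */
    return id_in_use[other];
}
```

In this model run_buggy_flow() returns false, because the unrelated owner loses its number, while run_fixed_flow() returns true.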
Recently syzbot ran into this and reported the following trace:
  ------------[ cut here ]------------
  ida_free called for id=51 which is not allocated.
  WARNING: CPU: 1 PID: 31038 at lib/idr.c:525 ida_free+0x370/0x420 lib/idr.c:525
  Modules linked in:
  CPU: 1 PID: 31038 Comm: syz-executor.2 Not tainted 6.8.0-rc4-syzkaller-00410-gc02197fc9076 #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
  RIP: 0010:ida_free+0x370/0x420 lib/idr.c:525
  Code: 10 42 80 3c 28 (...)
  RSP: 0018:ffffc90015a67300 EFLAGS: 00010246
  RAX: be5130472f5dd000 RBX: 0000000000000033 RCX: 0000000000040000
  RDX: ffffc90009a7a000 RSI: 000000000003ffff RDI: 0000000000040000
  RBP: ffffc90015a673f0 R08: ffffffff81577992 R09: 1ffff92002b4cdb4
  R10: dffffc0000000000 R11: fffff52002b4cdb5 R12: 0000000000000246
  R13: dffffc0000000000 R14: ffffffff8e256b80 R15: 0000000000000246
  FS: 00007fca3f4b46c0(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f167a17b978 CR3: 000000001ed26000 CR4: 0000000000350ef0
  Call Trace:
   <TASK>
   btrfs_get_root_ref+0xa48/0xaf0 fs/btrfs/disk-io.c:1346
   create_pending_snapshot+0xff2/0x2bc0 fs/btrfs/transaction.c:1837
   create_pending_snapshots+0x195/0x1d0 fs/btrfs/transaction.c:1931
   btrfs_commit_transaction+0xf1c/0x3740 fs/btrfs/transaction.c:2404
   create_snapshot+0x507/0x880 fs/btrfs/ioctl.c:848
   btrfs_mksubvol+0x5d0/0x750 fs/btrfs/ioctl.c:998
   btrfs_mksnapshot+0xb5/0xf0 fs/btrfs/ioctl.c:1044
   __btrfs_ioctl_snap_create+0x387/0x4b0 fs/btrfs/ioctl.c:1306
   btrfs_ioctl_snap_create_v2+0x1ca/0x400 fs/btrfs/ioctl.c:1393
   btrfs_ioctl+0xa74/0xd40
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:871 [inline]
   __se_sys_ioctl+0xfe/0x170 fs/ioctl.c:857
   do_syscall_64+0xfb/0x240
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7fca3e67dda9
  Code: 28 00 00 00 (...)
  RSP: 002b:00007fca3f4b40c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007fca3e7abf80 RCX: 00007fca3e67dda9
  RDX: 00000000200005c0 RSI: 0000000050009417 RDI: 0000000000000003
  RBP: 00007fca3e6ca47a R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  R13: 000000000000000b R14: 00007fca3e7abf80 R15: 00007fff6bf95658
   </TASK>
Here we get an explicit warning that we attempted to free an anonymous device number that is not currently allocated. It happens in a different code path from the steps above, at btrfs_get_root_ref(), so this change may not fix the case triggered by sy ---truncated---