In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix tree mod log mishandling of reallocated nodes
We have been seeing the following panic in production:

  kernel BUG at fs/btrfs/tree-mod-log.c:677!
  invalid opcode: 0000 [#1] SMP
  RIP: 0010:tree_mod_log_rewind+0x1b4/0x200
  RSP: 0000:ffffc9002c02f890 EFLAGS: 00010293
  RAX: 0000000000000003 RBX: ffff8882b448c700 RCX: 0000000000000000
  RDX: 0000000000008000 RSI: 00000000000000a7 RDI: ffff88877d831c00
  RBP: 0000000000000002 R08: 000000000000009f R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000100c40 R12: 0000000000000001
  R13: ffff8886c26d6a00 R14: ffff88829f5424f8 R15: ffff88877d831a00
  FS:  00007fee1d80c780(0000) GS:ffff8890400c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fee1963a020 CR3: 0000000434f33002 CR4: 00000000007706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   btrfs_get_old_root+0x12b/0x420
   btrfs_search_old_slot+0x64/0x2f0
   ? tree_mod_log_oldest_root+0x3d/0xf0
   resolve_indirect_ref+0xfd/0x660
   ? ulist_alloc+0x31/0x60
   ? kmem_cache_alloc_trace+0x114/0x2c0
   find_parent_nodes+0x97a/0x17e0
   ? ulist_alloc+0x30/0x60
   btrfs_find_all_roots_safe+0x97/0x150
   iterate_extent_inodes+0x154/0x370
   ? btrfs_search_path_in_tree+0x240/0x240
   iterate_inodes_from_logical+0x98/0xd0
   ? btrfs_search_path_in_tree+0x240/0x240
   btrfs_ioctl_logical_to_ino+0xd9/0x180
   btrfs_ioctl+0xe2/0x2ec0
   ? __mod_memcg_lruvec_state+0x3d/0x280
   ? do_sys_openat2+0x6d/0x140
   ? kretprobe_dispatcher+0x47/0x70
   ? kretprobe_rethook_handler+0x38/0x50
   ? rethook_trampoline_handler+0x82/0x140
   ? arch_rethook_trampoline_callback+0x3b/0x50
   ? kmem_cache_free+0xfb/0x270
   ? do_sys_openat2+0xd5/0x140
   __x64_sys_ioctl+0x71/0xb0
   do_syscall_64+0x2d/0x40
This corresponds to the following code in tree_mod_log_rewind():

	switch (tm->op) {
	case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING:
		BUG_ON(tm->slot < n);
This occurs because we replay the nodes in the order in which they happened, and when we do a REPLACE we log a REMOVE_WHILE_FREEING for every slot, starting at 0. Here 'n' is the number of items in this block, which in this case was 1, but we had 2 REMOVE_WHILE_FREEING operations.
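The check can be illustrated with a small user-space model. This is only a sketch, not the kernel code: the function name first_violation() is hypothetical, and the sketch assumes that each replayed removal restores one item, so 'n' grows by one per entry.

```c
#include <stddef.h>

/*
 * Simplified user-space model of the slot check, not the kernel code.
 * slots[] holds the slot numbers of REMOVE_WHILE_FREEING entries in the
 * order they are replayed; n is the item count the block starts with.
 * Assumption of this sketch: each replayed removal restores one item,
 * so n grows by one per entry.
 * Returns the index of the first entry with slot < n (the point where
 * BUG_ON(tm->slot < n) would fire), or -1 if every entry passes.
 */
static int first_violation(const int *slots, size_t count, int n)
{
	for (size_t i = 0; i < count; i++) {
		if (slots[i] < n)
			return (int)i;
		n++;
	}
	return -1;
}
```

With entries for slots 0 and 1, an empty block (n = 0) passes both checks, while a block that already holds one item (n = 1) fails on the very first entry because slot 0 is less than 1.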
The actual root cause was that we were replaying operations for a block that should not have been replayed. Consider the following sequence of events.
The tree mod log looks something like this:

	logical 0	op KEY_REPLACE (slot 1)			seq 2
	logical 0	op KEY_REMOVE (slot 1)			seq 3
	logical 0	op KEY_REMOVE_WHILE_FREEING (slot 0)	seq 4
	logical 4096	op LOG_ROOT_REPLACE (old logical 0)	seq 5
	logical 8192	op KEY_REMOVE_WHILE_FREEING (slot 1)	seq 6
	logical 8192	op KEY_REMOVE_WHILE_FREEING (slot 0)	seq 7
	logical 0	op LOG_ROOT_REPLACE (old logical 8192)	seq 8
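
To make the log above concrete, here is a small user-space sketch (not kernel code) that stores the seven entries and looks up the oldest or newest entry for a given logical address. The struct, enum and function names are hypothetical and chosen only for illustration. It shows that the entries keyed on logical 0 span both the early key operations (seq 2 to 4) and the much later LOG_ROOT_REPLACE (seq 8).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the tree mod log operations. */
enum op {
	KEY_REPLACE,
	KEY_REMOVE,
	KEY_REMOVE_WHILE_FREEING,
	LOG_ROOT_REPLACE,
};

struct log_entry {
	unsigned long logical;	/* block address the entry is keyed on */
	enum op op;
	unsigned int seq;	/* sequence number of the entry */
};

/* The seven entries from the example log. */
static const struct log_entry example_log[] = {
	{ 0,    KEY_REPLACE,              2 },
	{ 0,    KEY_REMOVE,               3 },
	{ 0,    KEY_REMOVE_WHILE_FREEING, 4 },
	{ 4096, LOG_ROOT_REPLACE,         5 },
	{ 8192, KEY_REMOVE_WHILE_FREEING, 6 },
	{ 8192, KEY_REMOVE_WHILE_FREEING, 7 },
	{ 0,    LOG_ROOT_REPLACE,         8 },
};

#define EXAMPLE_LOG_LEN (sizeof(example_log) / sizeof(example_log[0]))

/*
 * Return the entry with the highest seq (want_newest != 0) or the lowest
 * seq (want_newest == 0) among entries keyed on 'logical', or NULL if the
 * log has no entry for that address.
 */
static const struct log_entry *find_entry(unsigned long logical, int want_newest)
{
	const struct log_entry *best = NULL;

	for (size_t i = 0; i < EXAMPLE_LOG_LEN; i++) {
		const struct log_entry *e = &example_log[i];

		if (e->logical != logical)
			continue;
		if (!best || (want_newest ? e->seq > best->seq
					  : e->seq < best->seq))
			best = e;
	}
	return best;
}
```

In this model, find_entry(0, 0) returns the KEY_REPLACE at seq 2 and find_entry(0, 1) returns the LOG_ROOT_REPLACE at seq 8, so a lookup by logical address alone covers entries logged both before and after the block at logical 0 was replaced as a root at seq 5.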
From here, the bug is triggered by the following steps: