In the Linux kernel, the following vulnerability has been resolved:
nfs: fix panic when nfs4_ff_layout_prepare_ds() fails
We've been seeing the following panic in production:
BUG: kernel NULL pointer dereference, address: 0000000000000065
PGD 2f485f067 P4D 2f485f067 PUD 2cc5d8067 PMD 0
RIP: 0010:ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
Call Trace:
 <TASK>
 ? __die+0x78/0xc0
 ? page_fault_oops+0x286/0x380
 ? __rpc_execute+0x2c3/0x470 [sunrpc]
 ? rpc_new_task+0x42/0x1c0 [sunrpc]
 ? exc_page_fault+0x5d/0x110
 ? asm_exc_page_fault+0x22/0x30
 ? ff_layout_free_layoutreturn+0x110/0x110 [nfs_layout_flexfiles]
 ? ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
 ? ff_layout_cancel_io+0x6f/0x90 [nfs_layout_flexfiles]
 pnfs_mark_matching_lsegs_return+0x1b0/0x360 [nfsv4]
 pnfs_error_mark_layout_for_return+0x9e/0x110 [nfsv4]
 ? ff_layout_send_layout_error+0x50/0x160 [nfs_layout_flexfiles]
 nfs4_ff_layout_prepare_ds+0x11f/0x290 [nfs_layout_flexfiles]
 ff_layout_pg_init_write+0xf0/0x1f0 [nfs_layout_flexfiles]
 __nfs_pageio_add_request+0x154/0x6c0 [nfs]
 nfs_pageio_add_request+0x26b/0x380 [nfs]
 nfs_do_writepage+0x111/0x1e0 [nfs]
 nfs_writepages_callback+0xf/0x30 [nfs]
 write_cache_pages+0x17f/0x380
 ? nfs_pageio_init_write+0x50/0x50 [nfs]
 ? nfs_writepages+0x6d/0x210 [nfs]
 ? nfs_writepages+0x6d/0x210 [nfs]
 nfs_writepages+0x125/0x210 [nfs]
 do_writepages+0x67/0x220
 ? generic_perform_write+0x14b/0x210
 filemap_fdatawrite_wbc+0x5b/0x80
 file_write_and_wait_range+0x6d/0xc0
 nfs_file_fsync+0x81/0x170 [nfs]
 ? nfs_file_mmap+0x60/0x60 [nfs]
 __x64_sys_fsync+0x53/0x90
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
Inspecting the core with drgn, I was able to pull this:
  prog.crashed_thread().stack_trace()[0]
  #0 at 0xffffffffa079657a (ff_layout_cancel_io+0x3a/0x84) in ff_layout_cancel_io at fs/nfs/flexfilelayout/flexfilelayout.c:2021:27
  prog.crashed_thread().stack_trace()[0]['idx']
  (u32)1
  prog.crashed_thread().stack_trace()[0]['flseg'].mirror_array[1].mirror_ds
  (struct nfs4_ff_layout_ds *)0xffffffffffffffed
The cause is clear from the stack trace: we call nfs4_ff_layout_prepare_ds(), which can fail while initializing mirror_ds and leave an error pointer there (0xffffffffffffffed above is -19, i.e. -ENODEV). We then go to clean everything up, and the only check on that path is if (!mirror->mirror_ds), which an error pointer passes. This is inconsistent with the rest of the users of mirror_ds, which have
  if (IS_ERR_OR_NULL(mirror_ds))
to keep from tripping over this exact scenario. Fix this up in ff_layout_cancel_io() to make sure we don't panic when we get an error. I also spot-checked all the other places that check mirror_ds, and we appear to be doing the correct checks everywhere, only unconditionally dereferencing mirror_ds when we know it is valid.
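To illustrate why the NULL-only check is not enough, here is a minimal userspace sketch. It is not the actual patch. ERR_PTR(), PTR_ERR() and IS_ERR_OR_NULL() are simplified stand-ins for the helpers in include/linux/err.h, and struct sketch_ds, struct sketch_mirror, SKETCH_ENODEV and the sketch_* functions are hypothetical names made up for this example. It assumes a 64-bit build where a negative errno cast to a pointer lands in the top 4095 addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
#define SKETCH_ENODEV 19 /* value of ENODEV on Linux */

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	/* Error pointers occupy the last MAX_ERRNO addresses. */
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, heavily reduced versions of the real structures. */
struct sketch_ds {
	int in_flight;
};

struct sketch_mirror {
	struct sketch_ds *mirror_ds;
};

/* Old-style check: only NULL is skipped, so an error pointer gets through. */
static bool old_check_would_deref(const struct sketch_mirror *m)
{
	return m->mirror_ds != NULL;
}

/* Check used by the other mirror_ds users: NULL and error pointers are skipped. */
static bool new_check_would_deref(const struct sketch_mirror *m)
{
	return !IS_ERR_OR_NULL(m->mirror_ds);
}

/* Cancel loop in the spirit of the fix: touch only valid mirror_ds entries. */
static int sketch_cancel_io(struct sketch_mirror *mirrors, size_t n)
{
	int cancelled = 0;

	for (size_t idx = 0; idx < n; idx++) {
		struct sketch_ds *ds = mirrors[idx].mirror_ds;

		if (IS_ERR_OR_NULL(ds))
			continue; /* nothing to cancel, and not safe to dereference */
		ds->in_flight = 0;
		cancelled++;
	}
	return cancelled;
}

/* Small self-check mirroring the crash: the mirror at idx 1 holds -ENODEV. */
static int sketch_demo(void)
{
	struct sketch_ds good = { .in_flight = 1 };
	struct sketch_mirror mirrors[2] = {
		{ .mirror_ds = &good },
		{ .mirror_ds = ERR_PTR(-SKETCH_ENODEV) },
	};

	assert(old_check_would_deref(&mirrors[1]));  /* the bug */
	assert(!new_check_would_deref(&mirrors[1])); /* the fix */
	return sketch_cancel_io(mirrors, 2);
}
```

In this sketch the old check would let the entry at idx 1 through, which matches the drgn output above, while the IS_ERR_OR_NULL() check skips it and only the valid mirror is touched.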