In the Linux kernel, the following vulnerability has been resolved:
tcp: TX zerocopy should not sense pfmemalloc status
We got a recent syzbot report [1] showing a possible misuse of pfmemalloc page status in TCP zerocopy paths.
Indeed, for pages coming from user space or other layers, using page_is_pfmemalloc() is moot and could give false positives.
There have been attempts to make page_is_pfmemalloc() more robust, but not using it in the first place in this context is probably better, and it also saves CPU cycles.
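The idea of not sensing the page status on the zerocopy path can be sketched in user space as follows. This is a minimal model, not the kernel code: every `model_*` name, the `from_reserves` field, and the fixed-size fragment array are assumptions made for illustration. It only shows the split between a fill helper that inspects the page and one that does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FRAGS 4               /* hypothetical limit for the sketch */

/* Stand-in for a page; from_reserves models the pfmemalloc status. */
struct model_page { bool from_reserves; };

struct model_frag { struct model_page *page; size_t off, len; };

/* Stand-in for an skb carrying page fragments. */
struct model_skb {
    bool pfmemalloc;
    int nr_frags;
    struct model_frag frags[MODEL_MAX_FRAGS];
};

/* "No accounting" fill: stores the fragment and never reads page state. */
static void model_fill_frag_noacc(struct model_skb *skb, int i,
                                  struct model_page *page,
                                  size_t off, size_t len)
{
    assert(i >= 0 && i < MODEL_MAX_FRAGS);
    skb->frags[i].page = page;
    skb->frags[i].off = off;
    skb->frags[i].len = len;
}

/* Accounting fill: meant for pages the stack allocated itself, where the
 * status is meaningful and should propagate to the skb. */
static void model_fill_frag(struct model_skb *skb, int i,
                            struct model_page *page,
                            size_t off, size_t len)
{
    model_fill_frag_noacc(skb, i, page, off, len);
    if (page->from_reserves)
        skb->pfmemalloc = true;
}

/* Zerocopy append: the page comes from user space or another layer, so the
 * status is not consulted at all. Returns false when the skb is full. */
static bool model_zerocopy_append(struct model_skb *skb,
                                  struct model_page *page,
                                  size_t off, size_t len)
{
    if (skb->nr_frags >= MODEL_MAX_FRAGS)
        return false;
    model_fill_frag_noacc(skb, skb->nr_frags, page, off, len);
    skb->nr_frags++;
    return true;
}
```

In this model, a page whose status bit happens to read as set marks the skb only through `model_fill_frag()`. The zerocopy path leaves `skb->pfmemalloc` untouched and performs one fewer load per fragment.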
Note to stable teams:
You need to backport 84ce071e38a6 ("net: introduce __skb_fill_page_desc_noacc") as a prereq.
The race is more probable after commit c07aea3ef4d4 ("mm: add a signature in struct page") because page_is_pfmemalloc() now uses a low order bit from page->lru.next, which can change more often than page->index.
The low order bit should never be set for lru.next (when used as an anchor in an LRU list), so the KCSAN report is mostly a false positive.
Backporting to older kernel versions does not seem necessary.
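The reasoning about lru.next can be illustrated with a small user-space model. It is a sketch under stated assumptions: the `fake_*` names are hypothetical, and `PFMEMALLOC_TAG` is an assumed bit position chosen for illustration rather than taken from this text. The point is only that a tag stored in the low bits of a pointer-sized field is overwritten when the same field becomes a list link, and that aligned list nodes never have that bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tag bit for the sketch; not taken from the kernel sources. */
#define PFMEMALLOC_TAG ((uintptr_t)1 << 1)

struct list_node { struct list_node *next, *prev; };

/* Stand-in for struct page: the lru field doubles as tag storage. */
struct fake_page { struct list_node lru; };

/* Reader side: tests the tag bit in the field shared with the list link. */
static bool fake_page_is_pfmemalloc(const struct fake_page *p)
{
    return ((uintptr_t)p->lru.next & PFMEMALLOC_TAG) != 0;
}

/* Allocator side: marks a page that is not on any list. */
static void fake_set_pfmemalloc(struct fake_page *p)
{
    p->lru.next = (struct list_node *)PFMEMALLOC_TAG;
}

/* Writer side: links the page right after head, replacing lru.next
 * with a real, aligned pointer. */
static void fake_lru_add(struct fake_page *p, struct list_node *head)
{
    /* Aligned list nodes never carry the tag bit. */
    assert(((uintptr_t)head->next & PFMEMALLOC_TAG) == 0);
    p->lru.next = head->next;
    p->lru.prev = head;
    head->next->prev = &p->lru;
    head->next = &p->lru;
}
```

A reader racing with `fake_lru_add()` sees either the old value or an aligned pointer in `lru.next`. The value written in the report above, 0xffffea0004a1d288, has its lowest bits clear, which is consistent with the read returning false in that case.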
[1] BUG: KCSAN: data-race in lru_add_fn / tcp_build_frag
write to 0xffffea0004a1d2c8 of 8 bytes by task 18600 on cpu 0:
__list_add include/linux/list.h:73 [inline]
list_add include/linux/list.h:88 [inline]
lruvec_add_folio include/linux/mm_inline.h:105 [inline]
lru_add_fn+0x440/0x520 mm/swap.c:228
folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246
folio_batch_add_and_move mm/swap.c:263 [inline]
folio_add_lru+0xf1/0x140 mm/swap.c:490
filemap_add_folio+0xf8/0x150 mm/filemap.c:948
__filemap_get_folio+0x510/0x6d0 mm/filemap.c:1981
pagecache_get_page+0x26/0x190 mm/folio-compat.c:104
grab_cache_page_write_begin+0x2a/0x30 mm/folio-compat.c:116
ext4_da_write_begin+0x2dd/0x5f0 fs/ext4/inode.c:2988
generic_perform_write+0x1d4/0x3f0 mm/filemap.c:3738
ext4_buffered_write_iter+0x235/0x3e0 fs/ext4/file.c:270
ext4_file_write_iter+0x2e3/0x1210
call_write_iter include/linux/fs.h:2187 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x468/0x760 fs/read_write.c:578
ksys_write+0xe8/0x1a0 fs/read_write.c:631
__do_sys_write fs/read_write.c:643 [inline]
__se_sys_write fs/read_write.c:640 [inline]
__x64_sys_write+0x3e/0x50 fs/read_write.c:640
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffffea0004a1d2c8 of 8 bytes by task 18611 on cpu 1:
page_is_pfmemalloc include/linux/mm.h:1740 [inline]
__skb_fill_page_desc include/linux/skbuff.h:2422 [inline]
skb_fill_page_desc include/linux/skbuff.h:2443 [inline]
tcp_build_frag+0x613/0xb20 net/ipv4/tcp.c:1018
do_tcp_sendpages+0x3e8/0xaf0 net/ipv4/tcp.c:1075
tcp_sendpage_locked net/ipv4/tcp.c:1140 [inline]
tcp_sendpage+0x89/0xb0 net/ipv4/tcp.c:1150
inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833
kernel_sendpage+0x184/0x300 net/socket.c:3561
sock_sendpage+0x5a/0x70 net/socket.c:1054
pipe_to_sendpage+0x128/0x160 fs/splice.c:361
splice_from_pipe_feed fs/splice.c:415 [inline]
__splice_from_pipe+0x222/0x4d0 fs/splice.c:559
splice_from_pipe fs/splice.c:594 [inline]
generic_splice_sendpage+0x89/0xc0 fs/splice.c:743
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0x80/0xa0 fs/splice.c:931
splice_direct_to_actor+0x305/0x620 fs/splice.c:886
do_splice_direct+0xfb/0x180 fs/splice.c:974
do_sendfile+0x3bf/0x910 fs/read_write.c:1249
__do_sys_sendfile64 fs/read_write.c:1317 [inline]
__se_sys_sendfile64 fs/read_write.c:1303 [inline]
__x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1303
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x0000000000000000 -> 0xffffea0004a1d288
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 18611 Comm: syz-executor.4 Not tainted 6.0.0-rc2-syzkaller-00248-ge022620b5d05-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022