In the Linux kernel, the following vulnerability has been resolved:
cgroup/cpuset: fix panic caused by partcmd_update
We find a bug as below:
BUG: unable to handle page fault for address: 00000003
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 358 Comm: bash Tainted: G W I 6.6.0-10893-g60d6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/4
RIP: 0010:partition_sched_domains_locked+0x483/0x600
Code: 01 48 85 d2 74 0d 48 83 05 29 3f f8 03 01 f3 48 0f bc c2 89 c0 48 9
RSP: 0018:ffffc90000fdbc58 EFLAGS: 00000202
RAX: 0000000100000003 RBX: ffff888100b3dfa0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000002fe80
RBP: ffff888100b3dfb0 R08: 0000000000000001 R09: 0000000000000000
R10: ffffc90000fdbcb0 R11: 0000000000000004 R12: 0000000000000002
R13: ffff888100a92b48 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f44a5425740(0000) GS:ffff888237d80000(0000) knlGS:0000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100030973 CR3: 000000010722c000 CR4: 00000000000006e0
Call Trace:
 <TASK>
 ? show_regs+0x8c/0xa0
 ? __die_body+0x23/0xa0
 ? __die+0x3a/0x50
 ? page_fault_oops+0x1d2/0x5c0
 ? partition_sched_domains_locked+0x483/0x600
 ? search_module_extables+0x2a/0xb0
 ? search_exception_tables+0x67/0x90
 ? kernelmode_fixup_or_oops+0x144/0x1b0
 ? __bad_area_nosemaphore+0x211/0x360
 ? up_read+0x3b/0x50
 ? bad_area_nosemaphore+0x1a/0x30
 ? exc_page_fault+0x890/0xd90
 ? __lock_acquire.constprop.0+0x24f/0x8d0
 ? __lock_acquire.constprop.0+0x24f/0x8d0
 ? asm_exc_page_fault+0x26/0x30
 ? partition_sched_domains_locked+0x483/0x600
 ? partition_sched_domains_locked+0xf0/0x600
 rebuild_sched_domains_locked+0x806/0xdc0
 update_partition_sd_lb+0x118/0x130
 cpuset_write_resmask+0xffc/0x1420
 cgroup_file_write+0xb2/0x290
 kernfs_fop_write_iter+0x194/0x290
 new_sync_write+0xeb/0x160
 vfs_write+0x16f/0x1d0
 ksys_write+0x81/0x180
 __x64_sys_write+0x21/0x30
 x64_sys_call+0x2f25/0x4630
 do_syscall_64+0x44/0xb0
 entry_SYSCALL_64_after_hwframe+0x78/0xe2
RIP: 0033:0x7f44a553c887
It can be reproduced with the following commands:
cd /sys/fs/cgroup/
mkdir test
cd test/
echo +cpuset > ../cgroup.subtree_control
echo root > cpuset.cpus.partition
cat /sys/fs/cgroup/cpuset.cpus.effective
0-3
echo 0-3 > cpuset.cpus // taking away all cpus from root
This issue is caused by the incorrect rebuilding of scheduling domains. In this scenario, test/cpuset.cpus.partition should be an invalid root and should not trigger the rebuilding of scheduling domains. When update_parent_effective_cpumask() is called with partcmd_update and newmask is not NULL, it should recheck newmask to determine whether any CPUs remain available for the parent/cs that has tasks.
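The recheck described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: every name in it (model_cpumask, model_cpuset, model_tasks_nocpu, model_partcmd_update, model_reproducer_case) is hypothetical, CPUs are modelled as bits in a 64-bit integer, and the real kernel logic handles many more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model, one bit per CPU. This is not kernel code. */
typedef uint64_t model_cpumask;

struct model_cpuset {
    model_cpumask cpus; /* CPUs currently available to this cpuset */
    bool has_tasks;     /* true if at least one task is attached */
};

enum model_part_state { MODEL_PART_VALID, MODEL_PART_INVALID };

/*
 * Return true if handing newmask to the child partition would leave
 * a cpuset that has tasks without any CPU to run them on.
 */
bool model_tasks_nocpu(const struct model_cpuset *parent,
                       const struct model_cpuset *child,
                       model_cpumask newmask)
{
    model_cpumask child_gets = newmask & parent->cpus;    /* CPUs the child can take */
    model_cpumask parent_keeps = parent->cpus & ~newmask; /* CPUs left to the parent */

    if (parent->has_tasks && parent_keeps == 0)
        return true;
    if (child->has_tasks && child_gets == 0)
        return true;
    return false;
}

/*
 * Model of the update decision: the partition stays valid only when the
 * recheck of newmask finds no cpuset with tasks but without CPUs. An
 * invalid result means scheduling domains must not be rebuilt.
 */
enum model_part_state model_partcmd_update(const struct model_cpuset *parent,
                                           const struct model_cpuset *child,
                                           model_cpumask newmask)
{
    return model_tasks_nocpu(parent, child, newmask) ? MODEL_PART_INVALID
                                                     : MODEL_PART_VALID;
}

/* The reproducer scenario: root owns CPUs 0-3 and has tasks. */
void model_reproducer_case(void)
{
    struct model_cpuset root = { .cpus = 0xF, .has_tasks = true };
    struct model_cpuset test = { .cpus = 0x0, .has_tasks = false };

    /* Requesting 0-3 takes every CPU away from root: partition is invalid. */
    assert(model_partcmd_update(&root, &test, 0xF) == MODEL_PART_INVALID);
    /* Requesting 0-1 leaves 2-3 for root: partition stays valid. */
    assert(model_partcmd_update(&root, &test, 0x3) == MODEL_PART_VALID);
}
```

In this model, the reproducer's write of 0-3 to test/cpuset.cpus yields an invalid partition, which matches the expectation stated above that test/cpuset.cpus.partition should become an invalid root rather than trigger a scheduling domain rebuild.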