In the Linux kernel, the following vulnerability has been resolved:
workqueue: Fix selection of wake_cpu in kick_pool()
With cpu_possible_mask=0-63 and cpu_online_mask=0-7, the following kernel oops was observed:
smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 8 CPUs
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000803
[..]
Call Trace:
  arch_vcpu_is_preempted+0x12/0x80
  select_idle_sibling+0x42/0x560
  select_task_rq_fair+0x29a/0x3b0
  try_to_wake_up+0x38e/0x6e0
  kick_pool+0xa4/0x198
  __queue_work.part.0+0x2bc/0x3a8
  call_timer_fn+0x36/0x160
  __run_timers+0x1e2/0x328
  __run_timer_base+0x5a/0x88
  run_timer_softirq+0x40/0x78
  __do_softirq+0x118/0x388
  irq_exit_rcu+0xc0/0xd8
  do_ext_irq+0xae/0x168
  ext_int_handler+0xbe/0xf0
  psw_idle_exit+0x0/0xc
  default_idle_call+0x3c/0x110
  do_idle+0xd4/0x158
  cpu_startup_entry+0x40/0x48
  rest_init+0xc6/0xc8
  start_kernel+0x3c4/0x5e0
  startup_continue+0x3c/0x50
The crash is caused by calling arch_vcpu_is_preempted() for an offline CPU. To avoid this, select the CPU with cpumask_any_and_distribute(), which masks __pod_cpumask with cpu_online_mask. If no online CPU is left in the pool, skip the assignment.
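The selection rule described above can be illustrated with a small userspace sketch. This is not the kernel patch: the type cpumask_sim_t and the functions any_and_sim() and pick_wake_cpu() are hypothetical stand-ins, CPU masks are modelled as 64-bit words, and the simplified helper always returns the lowest matching CPU, whereas the kernel's cpumask_any_and_distribute() also spreads successive picks across the mask.

```c
#include <stdint.h>

/* Assumption: 64 possible CPUs, matching cpu_possible_mask=0-63 in the report. */
#define NR_CPU_IDS_SIM 64

/* Hypothetical stand-in for a kernel cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t cpumask_sim_t;

/*
 * Simplified stand-in for cpumask_any_and_distribute(): return a CPU that is
 * set in both masks, or NR_CPU_IDS_SIM if the intersection is empty.
 * This version always returns the lowest such CPU.
 */
int any_and_sim(cpumask_sim_t a, cpumask_sim_t b)
{
    cpumask_sim_t both = a & b;

    for (int cpu = 0; cpu < NR_CPU_IDS_SIM; cpu++) {
        if (both & (UINT64_C(1) << cpu))
            return cpu;
    }
    return NR_CPU_IDS_SIM;
}

/*
 * Pick the CPU to wake a worker on. Only CPUs that are both in the pod mask
 * and online are candidates. If there is none, the assignment is skipped and
 * the current wake CPU is returned unchanged.
 */
int pick_wake_cpu(int cur_wake_cpu, cpumask_sim_t pod_mask,
                  cpumask_sim_t online_mask)
{
    int cpu = any_and_sim(pod_mask, online_mask);

    if (cpu < NR_CPU_IDS_SIM)
        return cpu;          /* an online CPU exists in the pod */
    return cur_wake_cpu;     /* no online CPU in the pod: skip the assignment */
}
```

With an online mask of 0xFF (CPUs 0-7, as in the report) and a pod mask of 0xFF00 (CPUs 8-15), the intersection is empty and pick_wake_cpu() leaves the current wake CPU untouched, so an offline CPU is never selected.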
tj: This doesn't fully fix the bug, as CPUs can still go down between picking the target CPU and the wake call. Fixing that likely requires adding a cpu_online() test to either the sched or s390 arch code. However, regardless of how that is fixed, workqueue shouldn't be picking a CPU which isn't online, as that would result in unpredictable and worse behavior.