In the Linux kernel, the following vulnerability has been resolved:
ARM: 9381/1: kasan: clear stale stack poison
We found the following out-of-bounds (OOB) crash:
[ 33.452494] ==================================================================
[ 33.453513] BUG: KASAN: stack-out-of-bounds in refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec
[ 33.454660] Write of size 164 at addr c1d03d30 by task swapper/0/0
[ 33.455515]
[ 33.455767] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.1.25-mainline #1
[ 33.456880] Hardware name: Generic DT based system
[ 33.457555] unwind_backtrace from show_stack+0x18/0x1c
[ 33.458326] show_stack from dump_stack_lvl+0x40/0x4c
[ 33.459072] dump_stack_lvl from print_report+0x158/0x4a4
[ 33.459863] print_report from kasan_report+0x9c/0x148
[ 33.460616] kasan_report from kasan_check_range+0x94/0x1a0
[ 33.461424] kasan_check_range from memset+0x20/0x3c
[ 33.462157] memset from refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec
[ 33.463064] refresh_cpu_vm_stats.constprop.0 from tick_nohz_idle_stop_tick+0x180/0x53c
[ 33.464181] tick_nohz_idle_stop_tick from do_idle+0x264/0x354
[ 33.465029] do_idle from cpu_startup_entry+0x20/0x24
[ 33.465769] cpu_startup_entry from rest_init+0xf0/0xf4
[ 33.466528] rest_init from arch_post_acpi_subsys_init+0x0/0x18
[ 33.467397]
[ 33.467644] The buggy address belongs to stack of task swapper/0/0
[ 33.468493] and is located at offset 112 in frame:
[ 33.469172] refresh_cpu_vm_stats.constprop.0+0x0/0x2ec
[ 33.469917]
[ 33.470165] This frame has 2 objects:
[ 33.470696] [32, 76) 'global_zone_diff'
[ 33.470729] [112, 276) 'global_node_diff'
[ 33.471294]
[ 33.472095] The buggy address belongs to the physical page:
[ 33.472862] page:3cd72da8 refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x41d03
[ 33.473944] flags: 0x1000(reserved|zone=0)
[ 33.474565] raw: 00001000 ed741470 ed741470 00000000 00000000 00000000 ffffffff 00000001
[ 33.475656] raw: 00000000
[ 33.476050] page dumped because: kasan: bad access detected
[ 33.476816]
[ 33.477061] Memory state around the buggy address:
[ 33.477732] c1d03c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 33.478630] c1d03c80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
[ 33.479526] >c1d03d00: 00 04 f2 f2 f2 f2 00 00 00 00 00 00 f1 f1 f1 f1
[ 33.480415] ^
[ 33.481195] c1d03d80: 00 00 00 00 00 00 00 00 00 00 04 f3 f3 f3 f3 f3
[ 33.482088] c1d03e00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[ 33.482978] ==================================================================
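One way to read the shadow dump is to compute what the shadow of this frame should look like and compare it with what was printed. The sketch below is only an illustration, not kernel code. It assumes 8 bytes of stack per shadow byte, the stack redzone codes f1 (left) and f2 (middle), and that the frame starts at the buggy address minus the reported offset (c1d03d30 - 112 = c1d03cc0), which is an interpretation of the report rather than something it states. The names expected_frame_shadow, first_mismatch, and observed_shadow are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE    8    /* assumed: bytes of stack covered by one shadow byte */
#define STACK_LEFT 0xF1 /* redzone before the first object */
#define STACK_MID  0xF2 /* redzone between objects */

/* One stack object, as printed by KASAN: [start, end) offsets in the frame. */
struct frame_object {
    size_t start;
    size_t end;
};

/*
 * Shadow bytes transcribed from the dump, starting at the assumed frame base
 * c1d03cc0 (row c1d03c80, byte 8) and ending at the last byte of the second
 * object.
 */
const uint8_t observed_shadow[35] = {
    0xF1, 0xF1, 0xF1, 0xF1, 0x00, 0x00, 0x00, 0x00,             /* c1d03c80 */
    0x00, 0x04, 0xF2, 0xF2, 0xF2, 0xF2, 0x00, 0x00,             /* c1d03d00 */
    0x00, 0x00, 0x00, 0x00, 0xF1, 0xF1, 0xF1, 0xF1,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,             /* c1d03d80 */
    0x00, 0x00, 0x04,
};

/*
 * Compute the shadow a freshly entered frame should have, up to the end of
 * the last object. Returns the number of shadow bytes written.
 */
size_t expected_frame_shadow(const struct frame_object *objs, size_t n_objs,
                             uint8_t *shadow, size_t cap)
{
    size_t pos = 0; /* next shadow byte to fill */

    for (size_t i = 0; i < n_objs; i++) {
        size_t len = objs[i].end - objs[i].start;
        size_t first = objs[i].start / GRANULE;
        size_t full = len / GRANULE;
        size_t tail = len % GRANULE;

        assert(objs[i].start % GRANULE == 0 && objs[i].start < objs[i].end);
        assert(first >= pos && first + full + (tail ? 1 : 0) <= cap);

        /* Redzone in front of the object. */
        memset(shadow + pos, i == 0 ? STACK_LEFT : STACK_MID, first - pos);
        /* Fully addressable granules are 00. */
        memset(shadow + first, 0x00, full);
        pos = first + full;
        /* A partial granule stores the count of valid bytes. */
        if (tail)
            shadow[pos++] = (uint8_t)tail;
    }
    return pos;
}

/* Index of the first differing shadow byte, or -1 if the arrays agree. */
int first_mismatch(const uint8_t *expected, const uint8_t *observed, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (expected[i] != observed[i])
            return (int)i;
    return -1;
}
```

Under these assumptions, the objects [32, 76) and [112, 276) give 35 expected shadow bytes, and the first mismatch is at shadow index 20: the dump shows f1 where the second object should be fully addressable (00). That is consistent with a left redzone left behind by an older frame.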
The root cause of this OOB report is that ARM does not clear stale stack poison in the cpuidle case.
This patch follows arch/arm64/kernel/sleep.S to resolve the issue.
Cited commit [1] explains the problem:
Functions which the compiler has instrumented for KASAN place poison on the stack shadow upon entry and remove this poison prior to returning.
In the case of cpuidle, CPUs exit the kernel a number of levels deep in C code. Any instrumented functions on this critical path will leave portions of the stack shadow poisoned.
If CPUs lose context and return to the kernel via a cold path, we restore a prior context saved in __cpu_suspend_enter. The function calls made between that save point and the actual exit from the kernel are forgotten, and the poison they placed in the stack shadow area is never removed.
Thus, (depending on stackframe layout) subsequent calls to instrumented functions may hit this stale poison, resulting in (spurious) KASAN splats to the console.
To avoid this, clear any stale poison from the idle thread for a CPU prior to bringing a CPU online.
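The mechanism described above can be sketched with a toy user-space model. This is a hedged illustration, not the kernel implementation: the stack, its shadow, and every toy_* name are hypothetical, each frame poisons just one granule, and toy_unpoison_below is only loosely analogous to what the kernel does when it clears shadow below a saved stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE      8   /* assumed: bytes of stack per shadow byte */
#define STACK_BYTES  512
#define SHADOW_BYTES (STACK_BYTES / GRANULE)
#define POISON       0xF1

/* Toy stack: grows down from STACK_BYTES towards 0. */
struct toy_stack {
    uint8_t shadow[SHADOW_BYTES];
    size_t sp;
};

void toy_init(struct toy_stack *s)
{
    memset(s->shadow, 0, sizeof(s->shadow));
    s->sp = STACK_BYTES;
}

/* Instrumented function entry: push a frame and poison its redzone. */
void toy_enter(struct toy_stack *s, size_t frame_bytes)
{
    assert(frame_bytes % GRANULE == 0 && frame_bytes >= GRANULE);
    assert(frame_bytes <= s->sp);
    s->sp -= frame_bytes;
    s->shadow[s->sp / GRANULE] = POISON;
}

/* Normal return: remove the poison, then pop the frame. */
void toy_leave(struct toy_stack *s, size_t frame_bytes)
{
    assert(s->sp + frame_bytes <= STACK_BYTES);
    s->shadow[s->sp / GRANULE] = 0;
    s->sp += frame_bytes;
}

/* Cold-path resume: only the saved stack pointer comes back. */
void toy_cold_resume(struct toy_stack *s, size_t saved_sp)
{
    s->sp = saved_sp;
}

/* The fix: clear all shadow that covers stack below the watermark. */
void toy_unpoison_below(struct toy_stack *s, size_t watermark)
{
    memset(s->shadow, 0, watermark / GRANULE);
}

/* Count poisoned granules below the current stack pointer (stale poison). */
size_t toy_stale_poison(const struct toy_stack *s)
{
    size_t count = 0;

    for (size_t i = 0; i < s->sp / GRANULE; i++)
        if (s->shadow[i] != 0)
            count++;
    return count;
}
```

A possible walk-through: enter one frame and record sp as the saved context, enter two more frames, then call toy_cold_resume with the saved value. toy_stale_poison then reports 2, because the two deeper frames never ran their exit path. Calling toy_unpoison_below with the current sp brings it back to 0, so a later frame placed over that region starts from clean shadow.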
Following cited commit [2], the fix is extended to check for CONFIG_KASAN_STACK.
[1] commit 0d97e6d8024c ("arm64: kasan: clear stale stack poison")
[2] commit d56a9ef84bd0 ("kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK")