In the Linux kernel, the following vulnerability has been resolved:
book3s64/radix: Align section vmemmap start address to PAGE_SIZE
A vmemmap altmap is a device-provided region used as backing storage for struct pages. The altmap for each namespace should belong to that namespace. If namespaces are created unaligned, there is a chance that the section vmemmap start address is also unaligned. In that case, an altmap page allocated from the current namespace may also back vmemmap entries of the previous namespace. During the free operation, since the altmap page is shared between the two namespaces, the previous namespace detects that the page does not belong to its altmap and incorrectly assumes it is a normal RAM page. It then attempts to free it as a normal page, which leads to a kernel crash.
Kernel attempted to read user page (18) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000018
Faulting instruction address: 0xc000000000530c7c
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
CPU: 32 PID: 2104 Comm: ndctl Kdump: loaded Tainted: G W
NIP: c000000000530c7c LR: c000000000530e00 CTR: 0000000000007ffe
REGS: c000000015e57040 TRAP: 0300 Tainted: G W
MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 84482404
CFAR: c000000000530dfc DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0
GPR00: c000000000530e00 c000000015e572e0 c000000002c5cb00 c00c000101008040
GPR04: 0000000000000000 0000000000000007 0000000000000001 000000000000001f
GPR08: 0000000000000005 0000000000000000 0000000000000018 0000000000002000
GPR12: c0000000001d2fb0 c0000060de6b0080 0000000000000000 c0000060dbf90020
GPR16: c00c000101008000 0000000000000001 0000000000000000 c000000125b20f00
GPR20: 0000000000000001 0000000000000000 ffffffffffffffff c00c000101007fff
GPR24: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
GPR28: 0000000004040201 0000000000000001 0000000000000000 c00c000101008040
NIP [c000000000530c7c] get_pfnblock_flags_mask+0x7c/0xd0
LR [c000000000530e00] free_unref_page_prepare+0x130/0x4f0
Call Trace:
  free_unref_page+0x50/0x1e0
  free_reserved_page+0x40/0x68
  free_vmemmap_pages+0x98/0xe0
  remove_pte_table+0x164/0x1e8
  remove_pmd_table+0x204/0x2c8
  remove_pud_table+0x1c4/0x288
  remove_pagetable+0x1c8/0x310
  vmemmap_free+0x24/0x50
  section_deactivate+0x28c/0x2a0
  __remove_pages+0x84/0x110
  arch_remove_memory+0x38/0x60
  memunmap_pages+0x18c/0x3d0
  devm_action_release+0x30/0x50
  release_nodes+0x68/0x140
  devres_release_group+0x100/0x190
  dax_pmem_compat_release+0x44/0x80 [dax_pmem_compat]
  device_for_each_child+0x8c/0x100
  dax_pmem_compat_remove+0x2c/0x50 [dax_pmem_compat]
  nvdimm_bus_remove+0x78/0x140 [libnvdimm]
  device_remove+0x70/0xd0
Another issue is that if there is no altmap, a PMD-sized vmemmap page is allocated from RAM regardless of the alignment of the section start address. If the section start address is not aligned to PMD size, a VM_BUG_ON is triggered when the PMD-sized page is installed in the page table.
In this patch, we align the section vmemmap start address to PAGE_SIZE. After alignment, the start address is no longer part of the current namespace, so a normal page is allocated for the vmemmap mapping of the current section; for the remaining sections, altmap pages are allocated as before. During the free operation, the normal page is then freed correctly.
In the same way, a PMD_SIZE vmemmap page is allocated only if the section start address is PMD_SIZE-aligned; otherwise, the allocation falls back to PAGE-sized vmemmap pages.
NS1 start NS2 start
| NS1 | NS2 |
| Altmap| Altmap | .....|Altmap| Altmap | ...........
| NS1 | NS1
---truncated---