In the Linux kernel, the following vulnerability has been resolved:
x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
When trying to use copy_from_kernel_nofault() to read the vsyscall page through a bpf program, the following oops was reported:
  BUG: unable to handle page fault for address: ffffffffff600000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0
  Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
  RIP: 0010:copy_from_kernel_nofault+0x6f/0x110
  ......
  Call Trace:
   <TASK>
   ? copy_from_kernel_nofault+0x6f/0x110
   bpf_probe_read_kernel+0x1d/0x50
   bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d
   trace_call_bpf+0xc5/0x1c0
   perf_call_bpf_enter.isra.0+0x69/0xb0
   perf_syscall_enter+0x13e/0x200
   syscall_trace_enter+0x188/0x1c0
   do_syscall_64+0xb5/0xe0
   entry_SYSCALL_64_after_hwframe+0x6e/0x76
   </TASK>
  ......
  ---[ end trace 0000000000000000 ]---
The oops is triggered when:
1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall page and invokes copy_from_kernel_nofault(), which in turn calls __get_user_asm().
2) Because the vsyscall page address is not readable from kernel space, a page fault exception is triggered accordingly.
3) handle_page_fault() treats the vsyscall page address as a user space address instead of a kernel space address. As a result, the fix-up set up by bpf is not applied and page_fault_oops() is invoked due to SMAP.
Since handle_page_fault() already treats the vsyscall page address as a user space address, fix the problem by disallowing vsyscall page reads in copy_from_kernel_nofault().