In the Linux kernel, the following vulnerability has been resolved:
device-dax: correct pgoff align in dax_set_mapping()
pgoff should be aligned using ALIGN_DOWN() instead of ALIGN(). Otherwise, a vmf->address that is not aligned to fault_size is rounded up to the next alignment boundary, which can cause memory-failure handling to get the wrong address.
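The difference between the two roundings can be illustrated with a small user-space sketch. The EX_ALIGN and EX_ALIGN_DOWN macros below are stand-ins written for this example (they are not the kernel headers), the helper names are hypothetical, and the faulting address is made up; a 2M fault size is assumed.

```c
#include <stdint.h>

/*
 * User-space stand-ins for the kernel's ALIGN() and ALIGN_DOWN() macros.
 * Both assume that the alignment a is a power of two.
 */
#define EX_ALIGN(x, a)      (((x) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))
#define EX_ALIGN_DOWN(x, a) ((x) & ~((uint64_t)(a) - 1))

/* Assumed fault size for this example: one 2M huge page. */
#define EX_FAULT_SIZE_2M (2ULL << 20)

/* Hypothetical helper: start of the huge page that contains addr. */
static inline uint64_t ex_fault_base(uint64_t addr, uint64_t fault_size)
{
	return EX_ALIGN_DOWN(addr, fault_size);
}

/*
 * Hypothetical helper: what rounding up yields instead. For an unaligned
 * addr this is the start of the NEXT huge page, not the one being faulted.
 */
static inline uint64_t ex_fault_base_rounded_up(uint64_t addr, uint64_t fault_size)
{
	return EX_ALIGN(addr, fault_size);
}
```

For the made-up address 0x7f0000201000, rounding down gives 0x7f0000200000 (the huge page that actually contains the address), while rounding up gives 0x7f0000400000, one full 2M page too far. The two only agree when the address is already 2M aligned.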
It's a subtle situation that can only be observed in page_mapped_in_vma() after the page fault has been handled by dev_dax_huge_fault(). Generally, there is little chance of page_mapped_in_vma() being called on a dev-dax page, except when errors are deliberately injected into the dax device to trigger an MCE and memory-failure handling. In that case, page_mapped_in_vma() is called to determine which task is accessing the failing address, and that task is eventually killed.
We used a self-developed dax device (with 2M-aligned mappings) to inject errors at random addresses. It turned out that an error injected at a non-2M-aligned address caused endless MCEs until panic, because page_mapped_in_vma() kept returning the wrong address and the task accessing the failing address was never killed properly:
[ 3783.719419] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3784.049006] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3784.049190] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3784.448042] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3784.448186] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3784.792026] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3784.792179] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3785.162502] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3785.162633] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3785.461116] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3785.461247] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3785.764730] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3785.764859] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3786.042128] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3786.042259] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3786.464293] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3786.464423] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3786.818090] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3786.818217] Memory failure: 0x200c9742: recovery action for dax page: Recovered
[ 3787.085297] mce: Uncorrected hardware memory error in user-access at 200c9742380
[ 3787.085424] Memory failure: 0x200c9742: recovery action for dax page: Recovered
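As a rough cross-check of the log, the physical address in the mce lines and the page frame number in the Memory failure lines can be related with a little arithmetic. The sketch below assumes 4K base pages (a page shift of 12, as on x86-64); the macro and helper names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Assumption for this example: 4K base pages, as on x86-64. */
#define EX_PAGE_SHIFT  12
/* Number of 4K page frames inside one 2M huge page (512). */
#define EX_PFNS_PER_2M (1ULL << (21 - EX_PAGE_SHIFT))

/* Hypothetical helper: physical address to page frame number. */
static inline uint64_t ex_phys_to_pfn(uint64_t phys)
{
	return phys >> EX_PAGE_SHIFT;
}

/* Hypothetical helper: index of a page frame within its 2M huge page. */
static inline uint64_t ex_pfn_offset_in_2m(uint64_t pfn)
{
	return pfn & (EX_PFNS_PER_2M - 1);
}
```

Under that assumption, the address 200c9742380 from the mce lines maps to page frame 0x200c9742, matching the Memory failure lines, and that frame sits at index 0x142 within its 2M page. A non-zero index is consistent with the report that the injected address was not 2M aligned.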
It took us several weeks to pinpoint this problem, but we eventually used bpftrace to trace the page fault and MCE addresses and successfully identified the issue.
Joao added:
: Likely we never reproduce in production because we always pin
: device-dax regions in the region align they provide (Qemu does
: similarly with prealloc in hugetlb/file backed memory). I think this
: bug requires that we touch unpinned device-dax regions unaligned to
: the device-dax selected alignment (page size i.e. 4K/2M/1G)
[
{
"signature_type": "Line",
"deprecated": false,
"signature_version": "v1",
"digest": {
"line_hashes": [
"255523872479182455813402012871739729214",
"131924743838808245142077958612036104399",
"151300951202217743895549581017924323051",
"257134634160848892176806784443413413343"
],
"threshold": 0.9
},
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@e877427d218159ac29c9326100920d24330c9ee6",
"target": {
"file": "drivers/dax/device.c"
},
"id": "CVE-2024-50022-037188c5"
},
{
"signature_type": "Line",
"deprecated": false,
"signature_version": "v1",
"digest": {
"line_hashes": [
"255523872479182455813402012871739729214",
"131924743838808245142077958612036104399",
"151300951202217743895549581017924323051",
"257134634160848892176806784443413413343"
],
"threshold": 0.9
},
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@9c4198dfdca818c5ce19c764d90eabd156bbc6da",
"target": {
"file": "drivers/dax/device.c"
},
"id": "CVE-2024-50022-082f121c"
},
{
"signature_type": "Line",
"deprecated": false,
"signature_version": "v1",
"digest": {
"line_hashes": [
"255523872479182455813402012871739729214",
"131924743838808245142077958612036104399",
"151300951202217743895549581017924323051",
"257134634160848892176806784443413413343"
],
"threshold": 0.9
},
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@b822007e8db341d6f175c645ed79866db501ad86",
"target": {
"file": "drivers/dax/device.c"
},
"id": "CVE-2024-50022-0eb44043"
},
{
"signature_type": "Line",
"deprecated": false,
"signature_version": "v1",
"digest": {
"line_hashes": [
"255523872479182455813402012871739729214",
"131924743838808245142077958612036104399",
"151300951202217743895549581017924323051",
"257134634160848892176806784443413413343"
],
"threshold": 0.9
},
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@7fcbd9785d4c17ea533c42f20a9083a83f301fa6",
"target": {
"file": "drivers/dax/device.c"
},
"id": "CVE-2024-50022-d2dc78ec"
}
]