In the Linux kernel, the following vulnerability has been resolved:
mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
When I did memory failure tests recently, the warning below occurred:
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
FS: 00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 panic+0x326/0x350
 check_panic_on_warn+0x4f/0x50
 __warn+0x98/0x190
 report_bug+0x18e/0x1a0
 handle_bug+0x3d/0x70
 exc_invalid_op+0x18/0x70
 asm_exc_invalid_op+0x1a/0x20
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 </TASK>
After git bisecting and digging into the code, I believe the root cause is that the _deferred_list field of folio is unioned with the _hugetlb_subpool field. In __update_and_free_hugetlb_folio(), folio->_deferred_ ---truncated---
[
{
"signature_type": "Function",
"digest": {
"function_hash": "333025752880396055842890017321120024593",
"length": 712.0
},
"target": {
"file": "mm/hugetlb.c",
"function": "__update_and_free_hugetlb_folio"
},
"signature_version": "v1",
"id": "CVE-2024-36028-53391516",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@52ccdde16b6540abe43b6f8d8e1e1ec90b0983af"
},
{
"signature_type": "Line",
"digest": {
"threshold": 0.9,
"line_hashes": [
"146018048214871711061162910528092636583",
"6607988696898792625685197412827385831",
"89300256452544786786881335850661562399",
"260675871965569803042309147070522461305"
]
},
"target": {
"file": "mm/hugetlb.c"
},
"signature_version": "v1",
"id": "CVE-2024-36028-622b385e",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@52ccdde16b6540abe43b6f8d8e1e1ec90b0983af"
},
{
"signature_type": "Function",
"digest": {
"function_hash": "108603060366271541734070848448006371208",
"length": 957.0
},
"target": {
"file": "mm/hugetlb.c",
"function": "__update_and_free_page"
},
"signature_version": "v1",
"id": "CVE-2024-36028-6fcecc6f",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2effe407f7563add41750fd7e03da4ea44b98099"
},
{
"signature_type": "Line",
"digest": {
"threshold": 0.9,
"line_hashes": [
"325441486145398697534790675620706498282",
"67754028127153476470690840261070525068",
"233667196348677319187630982973628860206",
"215026488886735010636164881449839862839",
"146018048214871711061162910528092636583",
"6607988696898792625685197412827385831",
"89300256452544786786881335850661562399",
"260675871965569803042309147070522461305"
]
},
"target": {
"file": "mm/hugetlb.c"
},
"signature_version": "v1",
"id": "CVE-2024-36028-83f6b7ff",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@7e0a322877416e8c648819a8e441cf8c790b2cce"
},
{
"signature_type": "Line",
"digest": {
"threshold": 0.9,
"line_hashes": [
"107310032164695039228799023936850195519",
"325807769359430528168241852701049719784",
"154122901995912813967294461790983492187",
"67632216492141358830718313462474466296",
"287505325476905360547226762678386779385",
"141888473021907286922153021787905640669",
"110298229464436256543570345242582717788",
"8319857141546755278024379513635229560"
]
},
"target": {
"file": "mm/hugetlb.c"
},
"signature_version": "v1",
"id": "CVE-2024-36028-8a39c199",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2effe407f7563add41750fd7e03da4ea44b98099"
},
{
"signature_type": "Function",
"digest": {
"function_hash": "333025752880396055842890017321120024593",
"length": 712.0
},
"target": {
"file": "mm/hugetlb.c",
"function": "__update_and_free_hugetlb_folio"
},
"signature_version": "v1",
"id": "CVE-2024-36028-b900f9c2",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@9c9b32d46afab2d911897914181c488954012300"
},
{
"signature_type": "Function",
"digest": {
"function_hash": "55237077046568452668053090346120615537",
"length": 715.0
},
"target": {
"file": "mm/hugetlb.c",
"function": "__update_and_free_hugetlb_folio"
},
"signature_version": "v1",
"id": "CVE-2024-36028-d6b84ecc",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@7e0a322877416e8c648819a8e441cf8c790b2cce"
},
{
"signature_type": "Line",
"digest": {
"threshold": 0.9,
"line_hashes": [
"146018048214871711061162910528092636583",
"6607988696898792625685197412827385831",
"89300256452544786786881335850661562399",
"260675871965569803042309147070522461305"
]
},
"target": {
"file": "mm/hugetlb.c"
},
"signature_version": "v1",
"id": "CVE-2024-36028-fcf64f09",
"deprecated": false,
"source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@9c9b32d46afab2d911897914181c488954012300"
}
]