llama.cpp is a C/C++ inference engine for several LLM models. Prior to version b5662, an attacker-supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp's vocabulary-loading code. Specifically, the helper _try_copy in llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() casts a very large size_t token length to an int32_t. The cast can wrap to a negative value, so the length check (if (length < (int32_t)size)) is bypassed. As a result, memcpy is still called with the oversized size, letting a malicious model overwrite memory beyond the intended buffer. This can lead to arbitrary memory corruption and potential code execution. This issue has been patched in version b5662.
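The narrowing problem described above can be illustrated with a minimal sketch. It is not the llama.cpp source and not necessarily the upstream patch: the function names `flawed_check_allows_copy` and `safe_copy` are hypothetical, and the wrap-to-negative behaviour assumes a typical two's-complement platform (the conversion is implementation-defined before C++20).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the flawed check: the size_t token size is
// narrowed to int32_t before the comparison. On typical platforms a value
// of 2^31 or more wraps to a negative number, so the "buffer too small"
// condition is never met and the copy is allowed to proceed.
bool flawed_check_allows_copy(std::size_t token_size, std::int32_t buf_len) {
    const bool too_small = buf_len < static_cast<std::int32_t>(token_size);
    return !too_small;
}

// Hypothetical safer variant: compare in the unsigned domain so that no
// narrowing happens before the bounds check. Returns the number of bytes
// copied, or -1 if the token does not fit in the destination buffer.
std::int32_t safe_copy(char *dst, std::int32_t dst_len,
                       const char *src, std::size_t src_size) {
    assert(dst != nullptr && src != nullptr);
    if (dst_len < 0 || src_size > static_cast<std::size_t>(dst_len)) {
        return -1;  // reject before memcpy is ever reached
    }
    std::memcpy(dst, src, src_size);
    // Safe to narrow here: src_size <= dst_len, which fits in int32_t.
    return static_cast<std::int32_t>(src_size);
}
```

With a 16-byte buffer, the flawed check accepts a claimed token size of 2^31 bytes, while the unsigned comparison rejects it before any memory is touched.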
{
"cwe_ids": [
"CWE-119",
"CWE-195"
],
"osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2025/49xxx/CVE-2025-49847.json",
"cna_assigner": "GitHub_M"
}
"https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2025-49847.json"
"2026-04-12T16:41:32Z"
[
{
"signature_version": "v1",
"deprecated": false,
"digest": {
"threshold": 0.9,
"line_hashes": [
"169678252353221686058327076705983110950",
"111020387473770866180977401242009398056",
"181055122984592006969397469993317110311",
"316824732550221164803388954236044593333",
"321477996013993916131729265106299645296",
"149834009032174801870473138042267605522",
"48736868453122173412669636816078777749",
"25222289046965014859768863099826867804",
"277823017251932695168500914324050378445",
"275023068415373625420142878551355670904",
"107181731931003307835415921280185529094",
"306780858145501498281366326076963419249",
"163037821633632488959166563335604107814",
"334423476380994599270006155106083136215",
"338020793754324996781781734297939932784"
]
},
"source": "https://github.com/ggml-org/llama.cpp/commit/fb85a288d72abbd5e5daa8de96e6f8bfa7b5ab46",
"id": "CVE-2025-49847-ff8b2773",
"signature_type": "Line",
"target": {
"file": "src/llama-vocab.cpp"
}
}
]