vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load model checkpoints downloaded from Hugging Face. It uses the torch.load function, whose weights_only parameter defaults to False; when torch.load deserializes malicious pickle data, arbitrary code is executed during unpickling. This vulnerability is fixed in v0.7.0.
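The sketch below illustrates the unsafe deserialization pattern and the safer alternative. It is not vLLM's actual implementation: the function names and checkpoint path are hypothetical, and only the torch.load / weights_only behavior is taken from the description above.

```python
import torch

def iterate_weights_unsafe(checkpoint_path: str):
    # Unsafe pattern: with the default weights_only=False, torch.load runs the
    # full pickle machinery, so a crafted payload embedded in the checkpoint
    # executes arbitrary code the moment it is unpickled (CWE-502).
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    yield from state_dict.items()

def iterate_weights_safer(checkpoint_path: str):
    # Safer pattern: weights_only=True restricts unpickling to tensors and
    # plain containers, which blocks the arbitrary-code-execution vector.
    state_dict = torch.load(checkpoint_path, map_location="cpu", weights_only=True)
    yield from state_dict.items()
```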
{
"cwe_ids": [
"CWE-502"
]
}