When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP allows attackers to execute arbitrary code remotely on distributed hosts.
Only sender_socket and receiver_ack are allowed to be accessed publicly, while the data actually deserialized by pickle.loads() comes from recv_bytes. The receiving socket is connected with self.receiver_socket.connect(f"tcp://{d_host}:{d_rank_offset + 1}"), where d_host is decode_host, a locally defined address (192.168.0.139) taken from mooncake.json (https://github.com/kvcache-ai/Mooncake/blob/main/doc/en/vllm-integration-v0.2.md?plain=1#L36).
The received bytes are passed to pickle.loads(). Additionally, there do not appear to be any controls (network, authentication, etc.) to prevent arbitrary users from sending a malicious payload to the affected service. This is a remote code execution vulnerability impacting any deployment that uses Mooncake to distribute KV across distributed hosts.
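To illustrate why passing network bytes to pickle.loads() is dangerous, here is a minimal, harmless sketch. It is not vLLM or Mooncake code and it is not the fix referenced in this advisory. The names Payload, NoGlobalsUnpickler, and restricted_loads are hypothetical and exist only for this example. The payload makes the unpickler call the builtin len, standing in for whatever callable an attacker would choose. The restricted unpickler follows the find_class override pattern from the Python pickle documentation.

```python
import io
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call len('abc')".
        # An attacker would name a dangerous callable here instead.
        return (len, ("abc",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but only plain built-in containers and scalars load."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# Bytes as they might arrive from an untrusted peer.
untrusted = pickle.dumps(Payload())

# pickle.loads() runs the callable named in the payload and returns its result.
result = pickle.loads(untrusted)

# The restricted unpickler rejects the same bytes before anything is called.
try:
    restricted_loads(untrusted)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Here result is the return value of len("abc") rather than a Payload instance, which shows that the sender, not the receiver, decided what code ran. Restricting globals narrows the attack surface but is not a complete defense. The Python documentation advises against unpickling untrusted data at all, so a data-only serialization format is the safer design for bytes received over the network.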
This issue is resolved by https://github.com/vllm-project/vllm/pull/14228
{
"nvd_published_at": "2025-03-19T16:15:32Z",
"severity": "CRITICAL",
"github_reviewed_at": "2025-03-19T15:55:58Z",
"github_reviewed": true,
"cwe_ids": [
"CWE-502"
]
}