The <base> tag passes through the default Cleaner configuration. While page_structure=True removes html, head, and title tags, there is no specific handling for <base>, allowing an attacker to inject it and hijack relative links on the page.
The <base> tag is not currently in the page_structure kill set. Even though the specification says <base> must be inside <head>, browsers accept <base> tags outside of the head.
If an attacker injects a <base> tag, it changes the base URL for all relative URLs on the page (links, images, scripts) to a domain controlled by the attacker.
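The effect on URL resolution can be approximated with the standard library. This is a minimal sketch: `page_url` is a hypothetical address for the legitimate page, and `urljoin` follows RFC 3986 rather than the browser's exact algorithm, although the two agree for simple cases like these.

```python
from urllib.parse import urljoin

# Hypothetical URL of the legitimate page (an assumption for illustration).
page_url = "https://example.com/profile/"
# Base URL taken from the injected <base href="..."> tag.
attacker_base = "http://evil.com/"


def resolve_both(relative_url):
    """Return (resolution without <base>, resolution with the injected <base>)."""
    return urljoin(page_url, relative_url), urljoin(attacker_base, relative_url)


for rel in ("/account", "assets/app.js"):
    legit, hijacked = resolve_both(rel)
    print(f"{rel!r}: {legit} -> {hijacked}")
```

Both the root-relative link and the path-relative script end up on the attacker's host once the injected base URL takes effect.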
from lxml_html_clean import clean_html
# The base tag is preserved in the output
result = clean_html('<base href="http://evil.com/"><a href="/account">Account</a>')
print(result)
# Output: <div><base href="http://evil.com/">...<a href="/account">Account</a></div>
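As a defense-in-depth measure, an application could check sanitized output for surviving <base> elements before storing or rendering it. The sketch below uses only the standard library; `BaseTagFinder` and `find_base_hrefs` are hypothetical helper names and are not part of lxml_html_clean.

```python
from html.parser import HTMLParser


class BaseTagFinder(HTMLParser):
    """Collect the href of every <base> element seen while parsing."""

    def __init__(self):
        super().__init__()
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names; self-closing <base/> is also
        # routed here through the default handle_startendtag.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(html):
    """Return the href values of all <base> tags found in an HTML fragment."""
    finder = BaseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.base_hrefs


cleaned = '<div><base href="http://evil.com/"><a href="/account">Account</a></div>'
if find_base_hrefs(cleaned):
    print("Rejecting fragment: <base> tag survived sanitization")
```

A check like this only detects the problem. It does not replace removing the tag in the sanitizer itself.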
The injection of a <base> tag allows an attacker to hijack the resolution of all relative URLs on the page. This results in three critical attack vectors:
- Relative links (e.g., <a href="/login">) and form submissions (e.g., <form action="/auth">) are redirected to an attacker-controlled domain, effectively stealing credentials or sensitive data without the user realizing they have left the legitimate site.
- If the page loads scripts via relative paths (e.g., <script src="assets/app.js">), the browser will attempt to fetch the script from the attacker's domain. This upgrades the vulnerability from HTML injection to full Stored XSS.
- Relative images (<img>) and stylesheets (<link>) will be loaded from the attacker's server, allowing for UI redressing or defacement.
{
"github_reviewed": true,
"github_reviewed_at": "2026-03-02T19:35:52Z",
"severity": "MODERATE",
"nvd_published_at": "2026-03-05T20:16:16Z",
"cwe_ids": [
"CWE-116"
]
}