html-sanitizer is an allowlist-based HTML cleaner. If using keep_typographic_whitespace=False (which is the default), the sanitizer normalizes unicode to the NFKC form at the end. Some unicode characters normalize to chevrons; this allows specially crafted HTML to escape sanitization. The problem has been fixed in 2.4.2.
{
"cwe_ids": [
"CWE-79"
],
"cna_assigner": "GitHub_M",
"osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2024/34xxx/CVE-2024-34078.json"
}