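If keep_typographic_whitespace=False is used (which is the default), the sanitizer normalizes Unicode to the NFKC form at the end of sanitization. Some Unicode characters normalize to angle brackets (< and >) under NFKC, so specially crafted HTML can pass through the sanitizer as harmless text and become live markup only after that final normalization step, escaping sanitization.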
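The normalization behaviour described above can be reproduced with the Python standard library alone. The snippet below is a minimal sketch, not the sanitizer's own code: the payload string is a made-up illustration, and it only shows that NFKC maps fullwidth and small-form chevron lookalikes to ASCII angle brackets.

```python
import unicodedata

# Compatibility characters that are not U+003C / U+003E but normalize to them under NFKC.
lookalikes = {
    "\uff1c": "FULLWIDTH LESS-THAN SIGN",
    "\uff1e": "FULLWIDTH GREATER-THAN SIGN",
    "\ufe64": "SMALL LESS-THAN SIGN",
    "\ufe65": "SMALL GREATER-THAN SIGN",
}

for char, name in lookalikes.items():
    # Each lookalike collapses to a plain ASCII angle bracket.
    print(name, "->", repr(unicodedata.normalize("NFKC", char)))

# Hypothetical payload: contains no ASCII angle brackets before normalization,
# so an HTML parser sees it as plain text.
payload = "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e"
normalized = unicodedata.normalize("NFKC", payload)
print(normalized)
```

Because the payload contains no real angle brackets until it is normalized, any check that runs before normalization has nothing to strip.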
The problem has been fixed in 2.4.2.
As a workaround for versions before 2.4.2, set keep_typographic_whitespace=True explicitly, or normalize the input to NFKC yourself before passing it to the sanitizer.
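The second workaround, normalizing before sanitizing, might look like the sketch below. The helper name sanitize_prenormalized and its sanitize parameter are hypothetical and introduced only for illustration; the sanitizer is passed in as a plain callable so that the snippet does not depend on any particular library API.

```python
import unicodedata
from typing import Callable


def sanitize_prenormalized(html: str, sanitize: Callable[[str], str]) -> str:
    """Normalize to NFKC first, then hand the result to the sanitizer.

    Hypothetical helper: `sanitize` is whatever sanitizing callable you use.
    Lookalike characters become real angle brackets *before* sanitization,
    so the sanitizer sees and handles them as markup.
    """
    normalized = unicodedata.normalize("NFKC", html)
    return sanitize(normalized)


# Usage, assuming the library's Sanitizer class and its sanitize() method:
#   from html_sanitizer import Sanitizer
#   clean = sanitize_prenormalized(untrusted_html, Sanitizer().sanitize)
```

Since NFKC normalization is idempotent, a later normalization pass over already-normalized text should not introduce new angle brackets.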
{ "nvd_published_at": null, "cwe_ids": [], "severity": "HIGH", "github_reviewed": true, "github_reviewed_at": "2024-05-06T14:33:32Z" }