If keep_typographic_whitespace=False is used (which is the default), the sanitizer normalizes Unicode to the NFKC form as its final step. Some Unicode characters normalize to chevrons (< and >); this allows specially crafted HTML to escape sanitization.
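The normalization behaviour itself can be observed with the Python standard library alone. The following is a minimal sketch, not the sanitizer's own code: the payload string and the LOOKALIKES list are illustrative assumptions, chosen because these code points have compatibility decompositions to the ASCII chevrons.

```python
import unicodedata

# Illustrative code points (an assumption for this sketch, not an exhaustive list):
# fullwidth and small-form variants of the less-than and greater-than signs.
LOOKALIKES = ["\uff1c", "\uff1e", "\ufe64", "\ufe65"]


def nfkc(text: str) -> str:
    """Return the NFKC-normalized form of text."""
    return unicodedata.normalize("NFKC", text)


# A hypothetical payload that contains no ASCII chevrons at all,
# so a check for "<" or ">" before normalization finds nothing.
payload = "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e"
normalized = nfkc(payload)

if __name__ == "__main__":
    print("<" in payload)   # no ASCII chevron before normalization
    print(normalized)       # real markup appears after normalization
```

Because the chevrons only appear after normalization, any filtering that ran before the final NFKC step never sees them as markup.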
The problem has been fixed in 2.4.2.
Set keep_typographic_whitespace=True explicitly, or normalize the input to NFKC yourself before passing it to the sanitizer.
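The second workaround might be sketched as follows, assuming the sanitizer is available as a callable that takes and returns a string. The helper name sanitize_prenormalized and the sanitize parameter are hypothetical and not part of the library. Since NFKC normalization is idempotent, text that is already in NFKC form is not changed again by the sanitizer's final normalization step.

```python
import unicodedata
from typing import Callable


def sanitize_prenormalized(html: str, sanitize: Callable[[str], str]) -> str:
    """Normalize html to NFKC first, then hand it to the sanitizer.

    `sanitize` is a hypothetical stand-in for whatever sanitizing callable
    the application uses; it is an assumption of this sketch.
    """
    # Any lookalike characters become real chevrons here, before sanitization,
    # so the sanitizer sees and handles them as markup.
    normalized = unicodedata.normalize("NFKC", html)
    return sanitize(normalized)
```

With this ordering the sanitizer receives the already-normalized markup, rather than producing it after its checks have run.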
{ "severity": "HIGH", "nvd_published_at": null, "cwe_ids": [], "github_reviewed_at": "2024-05-06T14:33:32Z", "github_reviewed": true }