GHSA-wvhx-q427-fgh3

Source
https://github.com/advisories/GHSA-wvhx-q427-fgh3
Import Source
https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2024/05/GHSA-wvhx-q427-fgh3/GHSA-wvhx-q427-fgh3.json
Published
2024-05-06T14:33:32Z
Modified
2024-05-06T15:11:50.090293Z
Summary
Arbitrary HTML present after sanitization because of unicode normalization
Details

Impact

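If keep_typographic_whitespace=False (the default), the sanitizer normalizes Unicode to NFKC form as its final step. Some Unicode characters, such as U+FF1C FULLWIDTH LESS-THAN SIGN and U+FF1E FULLWIDTH GREATER-THAN SIGN, normalize to the ASCII chevrons < and >. Specially crafted input can therefore pass through sanitization as harmless-looking text and only become HTML markup in the final normalization step.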
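The normalization behaviour described above can be reproduced with the Python standard library alone. The following is a minimal sketch, not the sanitizer's own code; the names FULLWIDTH_LT, FULLWIDTH_GT, nfkc and payload are chosen here for illustration only.

```python
import unicodedata

# Two code points whose NFKC form is an ASCII chevron.
FULLWIDTH_LT = "\uFF1C"  # FULLWIDTH LESS-THAN SIGN
FULLWIDTH_GT = "\uFF1E"  # FULLWIDTH GREATER-THAN SIGN


def nfkc(text: str) -> str:
    """Return the NFKC-normalized form of text."""
    return unicodedata.normalize("NFKC", text)


# Illustrative payload: it contains no ASCII "<" or ">" at all, so a
# sanitizer that only looks for real tags treats it as plain text.
payload = (
    f"{FULLWIDTH_LT}script{FULLWIDTH_GT}alert(1)"
    f"{FULLWIDTH_LT}/script{FULLWIDTH_GT}"
)

# Normalizing after sanitization turns that text into markup.
normalized = nfkc(payload)
print(normalized)
```

Because the chevrons only appear after normalization, any check that ran before it never saw a tag.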

Patches

The problem is fixed in html-sanitizer 2.4.2.
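To check whether a given release falls in the affected range, a simple version comparison is enough. This is a minimal sketch that assumes plain dotted numeric version strings like those listed in this advisory; parse_version and is_affected are hypothetical helper names, not part of html-sanitizer.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as "2.4.1" into (2, 4, 1)."""
    return tuple(int(part) for part in version.split("."))


# First fixed release according to this advisory.
FIXED_IN = parse_version("2.4.2")


def is_affected(version: str) -> bool:
    """Return True if the version is older than the first fixed release."""
    return parse_version(version) < FIXED_IN


# The installed version could be looked up with
# importlib.metadata.version("html-sanitizer") and passed to is_affected().
for candidate in ("1.9.3", "2.4.1", "2.4.2"):
    print(candidate, is_affected(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "2.10.0" as older than "2.4.2".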

Workarounds

Set keep_typographic_whitespace=True explicitly, or normalize the input to NFKC yourself before passing it to the sanitizer.
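The second workaround can be sketched with the standard library. The idea is that once the input is already in NFKC form, any character that would turn into a chevron has done so before the sanitizer inspects the markup. The helper name prenormalize is hypothetical, and the sanitizer call is left as a comment because it depends on how the library is configured in your project.

```python
import unicodedata


def prenormalize(html: str) -> str:
    """Normalize untrusted input to NFKC before it reaches the sanitizer."""
    return unicodedata.normalize("NFKC", html)


# Fullwidth chevrons around "b"; no ASCII "<" or ">" in the raw input.
untrusted = "\uFF1Cb\uFF1Ehello\uFF1C/b\uFF1E"
safe_to_sanitize = prenormalize(untrusted)

# The chevrons are real ASCII characters now, so the sanitizer sees
# actual tags and can apply its rules to them.
print(safe_to_sanitize)

# Assumption: the result is then passed to your configured sanitizer, e.g.
#   cleaned = sanitizer.sanitize(safe_to_sanitize)
# For the first workaround, keep_typographic_whitespace=True is assumed to
# be supplied through the sanitizer's settings.
```

NFKC normalization is idempotent, so a second normalization pass inside the sanitizer does not change the already-normalized text.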

Affected packages

PyPI / html-sanitizer

Package

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (the exact introduced commit is unknown)
Fixed
2.4.2

Affected versions

1.*

1.0.0
1.1.0
1.1.1
1.1.2
1.1.3
1.1.4
1.2.0
1.2.1
1.3.0
1.4.0
1.5.0
1.6.0
1.6.1
1.6.2
1.6.3
1.6.4
1.7.0
1.7.1
1.7.2
1.7.3
1.8.0
1.9.0
1.9.1
1.9.2
1.9.3

2.*

2.0.0
2.1.0
2.2.0
2.3.0
2.3.1
2.4.0
2.4.1