ujson.dumps() (as well as ujson.dump() and ujson.encode()) has a reject_bytes=False option. When reject_bytes=False is passed, the encoder may accept malformed or truncated UTF-8 byte sequences and silently rewrite them into different Unicode characters instead of rejecting them. This leads to input validation bypass and data integrity issues.
The expected behavior is that, for any bytes string x, either x == ujson.loads(ujson.dumps(x, reject_bytes=False)).encode(errors="surrogatepass") holds or ujson.dumps() raises an exception. In reality, some inputs that should have been rejected are silently rewritten as other strings:
b'\xcf\x13' -> b'\xcf\x93'
b'\xc3' -> b'\xc3\x80'
b'\xf0\x90\x94' -> b"\xf0\x90\x94\x80inxcontrib'"

An application relying on reject_bytes=False for UTF-8 handling may experience the input validation bypass and data integrity issues described above.
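The round-trip property stated above can be checked mechanically. The following is a minimal sketch, not part of the advisory: `roundtrip_ok`, `strict_dumps`, and `lossy_dumps` are hypothetical names, and the two encoders are stand-ins built on the standard library `json` module so that the example runs without ujson installed. Treating any exception as a rejection is also an assumption.

```python
import json

# Malformed or truncated UTF-8 inputs quoted in the advisory.
SAMPLES = [b"\xcf\x13", b"\xc3", b"\xf0\x90\x94"]


def roundtrip_ok(x, dumps, loads):
    """Return True if dumps(x) raises, or if x survives the round trip unchanged."""
    try:
        encoded = dumps(x)
    except Exception:  # assumption: any exception counts as a rejection
        return True
    return loads(encoded).encode(errors="surrogatepass") == x


def strict_dumps(x):
    # Stand-in encoder: a strict decode raises UnicodeDecodeError on bad input.
    return json.dumps(x.decode("utf-8"))


def lossy_dumps(x):
    # Stand-in for a faulty encoder: bad bytes are silently replaced.
    return json.dumps(x.decode("utf-8", errors="replace"))


strict_results = [roundtrip_ok(s, strict_dumps, json.loads) for s in SAMPLES]
lossy_results = [roundtrip_ok(s, lossy_dumps, json.loads) for s in SAMPLES]
print(strict_results, lossy_results)
```

To exercise an installed ujson instead of the stand-ins, one could pass `lambda b: ujson.dumps(b, reject_bytes=False)` as `dumps` and `ujson.loads` as `loads`; a `False` result for any sample would indicate the silent rewriting described here.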
The missing/broken UTF-8 validation checks were added/fixed in https://github.com/ultrajson/ultrajson/commit/169eaf36b1116fece5034ee79a7a0ef3f6deedcf. We recommend upgrading to UltraJSON 5.13.0.
Decoding bytes to strings in Python before passing them to ujson.dumps() avoids this issue.
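As a rough illustration of this workaround, the sketch below decodes bytes with Python's own strict UTF-8 decoder before handing the value to the encoder. `to_text` is a hypothetical helper name, not a ujson API, and the fallback to the standard library `json` module exists only so the example runs where ujson is not installed.

```python
try:
    import ujson as json_impl
except ImportError:  # fallback so the sketch runs without ujson
    import json as json_impl


def to_text(value, errors="strict"):
    """Hypothetical helper: decode bytes to str before JSON encoding.

    With errors="strict", malformed UTF-8 raises UnicodeDecodeError
    instead of reaching the JSON encoder.
    """
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode("utf-8", errors=errors)
    return value


# Valid UTF-8 is decoded in Python and then encoded as an ordinary str.
payload = json_impl.dumps(to_text(b"caf\xc3\xa9"))

# Truncated input from the advisory is rejected before encoding.
try:
    json_impl.dumps(to_text(b"\xc3"))
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

Because the bytes never reach the encoder, reject_bytes can stay at its default and the choice of error handling (strict, replace, and so on) is made explicitly in application code.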
{
"nvd_published_at": null,
"github_reviewed_at": "2026-06-19T20:47:43Z",
"cwe_ids": [
"CWE-20"
],
"severity": "MODERATE",
"github_reviewed": true
}