GHSA-7fqq-q52p-2jjg

Source
https://github.com/advisories/GHSA-7fqq-q52p-2jjg
Import Source
https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/03/GHSA-7fqq-q52p-2jjg/GHSA-7fqq-q52p-2jjg.json
JSON Data
https://api.osv.dev/v1/vulns/GHSA-7fqq-q52p-2jjg
Published
2026-03-29T15:27:41Z
Modified
2026-03-29T15:46:35.754685Z
Severity
  • 6.5 (Medium) CVSS_V3 - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:L
Summary
OpenCC has an Out-of-bounds read when processing truncated UTF-8 input
Details

Summary

OpenCC versions before 1.2.0 contain two out-of-bounds read issues (CWE-125) caused by missing length validation in UTF-8 processing. When handling malformed or truncated UTF-8 input, OpenCC trusted derived length values without enforcing the invariant that the processed length must not exceed the remaining input buffer. This could result in out-of-bounds reads during segmentation or conversion.

Details

Two independent code paths in OpenCC failed to enforce the invariant:

matchedLength <= remainingLength

Both paths assumed derived length values were valid and within input bounds, but did not validate that assumption against the remaining buffer. This created the following failure chain:

invalid UTF-8 -> incorrect derived length -> incorrect pointer advance -> remaining-length desynchronization -> out-of-bounds read

In MaxMatchSegmentation::Segment, this could desynchronize remaining-length tracking and cause out-of-bounds reads during prefix matching.

In Conversion::Convert(const char*), similar logic could advance processing past the end of the input string and read beyond the null terminator into adjacent memory. In some cases, unintended heap bytes could be propagated into the conversion result.
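
The first steps of the failure chain can be illustrated with a minimal sketch. It is not OpenCC's code: `Utf8LengthFromLeadByte` and `DerivedLengthOverruns` are hypothetical helpers, and the lead-byte table is simplified (it does not reject overlong encodings). It only shows how a length derived from a lead byte can exceed the bytes actually left in the buffer.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OpenCC): sequence length implied by a
// UTF-8 lead byte. Returns 0 for bytes that cannot start a sequence.
std::size_t Utf8LengthFromLeadByte(unsigned char lead) {
  if (lead < 0x80) return 1;            // 0xxxxxxx
  if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 0;                             // continuation or invalid lead byte
}

// Hypothetical check: true when the length implied by the lead byte at
// `offset` is larger than the number of bytes left in `input`, i.e. when
// matchedLength <= remainingLength would be violated by trusting it.
bool DerivedLengthOverruns(const std::string& input, std::size_t offset) {
  if (offset >= input.size()) return false;
  const std::size_t remaining = input.size() - offset;
  const std::size_t derived =
      Utf8LengthFromLeadByte(static_cast<unsigned char>(input[offset]));
  return derived > remaining;
}
```

For the two bytes `E4 B8` (a 3-byte character missing its final byte), the derived length is 3 while only 2 bytes remain, so advancing by the derived length would step past the end of the input.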

PR #1005 fixes both issues by explicitly tracking input boundaries, recomputing remaining length on each iteration, and clamping processed lengths so the buffer-bound invariant is preserved.
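
The general shape of such a fix can be sketched as follows. This is an illustration of the technique described above (explicit end pointer, remaining length recomputed per iteration, clamped advance), not the code from PR #1005; `LeadByteLength` and `SplitClamped` are hypothetical names, and treating an invalid lead byte as a 1-byte chunk is an assumption made here to guarantee forward progress.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length implied by a UTF-8 lead byte. Invalid lead
// bytes are treated as 1-byte chunks so the loop below always advances.
std::size_t LeadByteLength(unsigned char lead) {
  if ((lead & 0xE0) == 0xC0) return 2;
  if ((lead & 0xF0) == 0xE0) return 3;
  if ((lead & 0xF8) == 0xF0) return 4;
  return 1;  // ASCII, continuation byte, or invalid lead byte
}

// Splits `input` into per-character chunks without ever reading or
// advancing past the end of the buffer.
std::vector<std::string> SplitClamped(const std::string& input) {
  std::vector<std::string> chunks;
  const char* p = input.data();
  const char* const end = p + input.size();  // explicit input boundary
  while (p < end) {
    // Recompute the remaining length from the pointers on every iteration
    // instead of maintaining a separate counter that can drift.
    const std::size_t remaining = static_cast<std::size_t>(end - p);
    std::size_t len = LeadByteLength(static_cast<unsigned char>(*p));
    len = std::min(len, remaining);  // clamp: len <= remaining always holds
    chunks.emplace_back(p, len);
    p += len;
  }
  return chunks;
}
```

With the input `a E4 B8`, the loop yields the chunks `a` and `E4 B8`: the truncated sequence is clamped to the two bytes that exist rather than read as three.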

Affected versions:

  • All versions before 1.2.0

Patched version:

  • 1.2.0

PoC

Build a vulnerable version with AddressSanitizer enabled and process input ending with a truncated UTF-8 sequence, such as a missing final byte of a 3-byte character. The original report and ASan reproduction are available in Issue #997.
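
As a rough illustration of the kind of input described, the sketch below builds a string that ends with a 3-byte character missing its final byte. `MakeTruncatedUtf8Input` is a hypothetical helper, and the choice of U+4E2D is arbitrary; how the input is fed to OpenCC is not shown here and should be taken from Issue #997.

```cpp
#include <string>

// Hypothetical helper: returns `prefix` followed by U+4E2D ("中", encoded
// as E4 B8 AD) with its final byte dropped, so the result ends in a
// truncated 3-byte UTF-8 sequence (E4 B8).
std::string MakeTruncatedUtf8Input(const std::string& prefix) {
  const std::string full = "\xE4\xB8\xAD";
  return prefix + full.substr(0, full.size() - 1);
}
```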

Impact

This vulnerability may cause process crashes and limited, non-deterministic information disclosure when OpenCC processes malformed or attacker-controlled UTF-8 input. Nothing in the report indicates arbitrary write or code execution.

OpenCC is distributed through system and language-specific package managers, prebuilt binaries, container images, and downstream software, so affected versions may be present even when it is not listed as a direct dependency. Users should upgrade all installed or bundled copies of OpenCC to 1.2.0 or later.
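
When auditing bundled copies, a version string can be compared against the fixed release. The following is a minimal sketch under stated assumptions: `ParseVersion` and `IsAffected` are hypothetical helpers, only the leading numeric `major.minor.patch` components are considered (so `1.1.0.post1` is read as `1.1.0`), and pre-release suffixes are ignored, which a real ecosystem-aware comparison would need to handle.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: parses up to three leading numeric components,
// e.g. "1.1.0.post1" -> {1, 1, 0} and "0.2" -> {0, 2, 0}.
std::array<unsigned long, 3> ParseVersion(const std::string& v) {
  std::array<unsigned long, 3> parts{0, 0, 0};
  std::size_t pos = 0;
  for (std::size_t i = 0; i < parts.size() && pos < v.size(); ++i) {
    if (v[pos] < '0' || v[pos] > '9') break;  // non-numeric component
    unsigned long value = 0;
    while (pos < v.size() && v[pos] >= '0' && v[pos] <= '9') {
      value = value * 10 + static_cast<unsigned long>(v[pos] - '0');
      ++pos;
    }
    parts[i] = value;
    if (pos < v.size() && v[pos] == '.') {
      ++pos;  // skip the separator and parse the next component
    } else {
      break;
    }
  }
  return parts;
}

// True when `version` sorts before the patched release 1.2.0.
bool IsAffected(const std::string& version) {
  const std::array<unsigned long, 3> fixed{1, 2, 0};
  return ParseVersion(version) < fixed;  // lexicographic comparison
}
```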

Credit

OpenCC thanks @oneafter for reporting the issue.

Database specific
{
    "github_reviewed": true,
    "github_reviewed_at": "2026-03-29T15:27:41Z",
    "severity": "MODERATE",
    "nvd_published_at": null,
    "cwe_ids": [
        "CWE-125"
    ]
}
Affected packages

npm / opencc

Package

Affected ranges

Type
SEMVER
Events
Introduced
0 (unknown introduced version; all previous versions are affected)
Fixed
1.2.0

Database specific

source
"https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/03/GHSA-7fqq-q52p-2jjg/GHSA-7fqq-q52p-2jjg.json"

PyPI / opencc

Package

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (unknown introduced version; all previous versions are affected)
Fixed
1.2.0

Affected versions

0.*
0.1
0.2
1.*
1.1.0
1.1.0.post1
1.1.1
1.1.1.post1
1.1.2
1.1.3
1.1.4
1.1.5
1.1.6
1.1.7
1.1.8
1.1.9

Database specific

source
"https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/03/GHSA-7fqq-q52p-2jjg/GHSA-7fqq-q52p-2jjg.json"