GHSA-7fqq-q52p-2jjg

Source

https://github.com/advisories/GHSA-7fqq-q52p-2jjg

Import Source

https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/03/GHSA-7fqq-q52p-2jjg/GHSA-7fqq-q52p-2jjg.json

JSON Data

https://api.osv.dev/v1/vulns/GHSA-7fqq-q52p-2jjg

Published

2026-03-29T15:27:41Z

Modified

2026-03-29T15:46:35.754685Z

Severity

6.5 (Medium) CVSS_V3 - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:L

Summary

OpenCC has an Out-of-bounds read when processing truncated UTF-8 input

Details

Summary

OpenCC versions before 1.2.0 contain two out-of-bounds read issues (CWE-125) caused by missing length validation in UTF-8 processing. When handling malformed or truncated UTF-8 input, OpenCC trusted derived length values without enforcing the invariant that the processed length must not exceed the remaining input buffer. This could result in out-of-bounds reads during segmentation or conversion.

Details

Two independent code paths in OpenCC failed to enforce the invariant:

matchedLength <= remainingLength

Both paths assumed derived length values were valid and within input bounds, but did not validate that assumption against the remaining buffer. This created the following failure chain:

invalid UTF-8 -> incorrect derived length -> incorrect pointer advance -> remaining-length desynchronization -> out-of-bounds read
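The first link in this chain can be modeled in a few lines. The following is a minimal Python sketch, not OpenCC's C++ code: the helper names `utf8_declared_length` and `invariant_holds` are hypothetical, and the sketch assumes the derived length comes from the UTF-8 lead byte alone. It shows that a truncated 3-byte character declares more bytes than the buffer still holds.

```python
def utf8_declared_length(lead_byte: int) -> int:
    """Sequence length as declared by the lead byte alone (hypothetical helper)."""
    if lead_byte < 0x80:
        return 1  # 0xxxxxxx: ASCII
    if lead_byte >> 5 == 0b110:
        return 2  # 110xxxxx
    if lead_byte >> 4 == 0b1110:
        return 3  # 1110xxxx
    if lead_byte >> 3 == 0b11110:
        return 4  # 11110xxx
    return 1      # stray continuation or invalid byte: treat as one byte


def invariant_holds(buffer: bytes, offset: int) -> bool:
    """Check matchedLength <= remainingLength at the given offset."""
    remaining_length = len(buffer) - offset
    matched_length = utf8_declared_length(buffer[offset])
    return matched_length <= remaining_length


complete = "漢".encode("utf-8")  # three bytes: e6 bc a2
truncated = complete[:-1]        # final byte missing: e6 bc

print(invariant_holds(complete, 0))   # invariant holds for the full character
print(invariant_holds(truncated, 0))  # invariant is violated for the truncated one
```

In Python the violation is only a failed check. In C++ code that advances a pointer by the declared length, the same mismatch moves the pointer past the end of the buffer.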

In MaxMatchSegmentation::Segment, this could desynchronize remaining-length tracking and cause out-of-bounds reads during prefix matching.

In Conversion::Convert(const char*), similar logic could advance processing past the end of the input string and read beyond the null terminator into adjacent memory. In some cases, unintended heap bytes could be propagated into the conversion result.

PR #1005 fixes both issues by explicitly tracking input boundaries, recomputing remaining length on each iteration, and clamping processed lengths so the buffer-bound invariant is preserved.
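The general shape of such a fix can be sketched as a loop that recomputes the remaining length on every iteration and clamps the processed length to it. This is an illustrative Python model under the same assumptions as the advisory's description, not the code from PR #1005; `split_characters_clamped` and `utf8_declared_length` are hypothetical names.

```python
def utf8_declared_length(lead_byte: int) -> int:
    """Sequence length as declared by the lead byte alone (hypothetical helper)."""
    if lead_byte >> 5 == 0b110:
        return 2
    if lead_byte >> 4 == 0b1110:
        return 3
    if lead_byte >> 3 == 0b11110:
        return 4
    return 1  # ASCII, stray continuation byte, or invalid lead byte


def split_characters_clamped(buffer: bytes) -> list:
    """Walk the buffer one declared character at a time without leaving it."""
    end = len(buffer)  # explicit input boundary
    offset = 0
    pieces = []
    while offset < end:
        remaining_length = end - offset  # recomputed on each iteration
        declared = utf8_declared_length(buffer[offset])
        matched_length = min(declared, remaining_length)  # clamp to the buffer
        pieces.append(buffer[offset:offset + matched_length])
        offset += matched_length  # never advances past `end`
    return pieces


data = "a漢".encode("utf-8")[:-1]  # 'a' followed by a truncated 3-byte character
print(split_characters_clamped(data))
```

Because `matched_length` is at least 1 and at most `remaining_length`, the loop always terminates and the lengths of the pieces sum to exactly the input length.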

Affected versions:

All versions before 1.2.0

Patched version:

1.2.0

PoC

Build a vulnerable version with AddressSanitizer enabled and process input ending with a truncated UTF-8 sequence, such as a missing final byte of a 3-byte character. The original report and ASan reproduction are available in Issue #997.
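Inputs of the kind described above can be generated as follows. This is a hedged sketch for building test data only: it is not the reproduction from Issue #997, it calls no OpenCC API, and `truncated_variants` is a hypothetical helper name.

```python
def truncated_variants(text: str) -> list:
    """Return UTF-8 encodings of `text` whose final character is cut short.

    For a final character of n bytes, this yields the encodings with
    1 to n-1 trailing bytes removed. An ASCII final character has no
    truncated form, so the result is then empty.
    """
    if not text:
        return []
    encoded = text.encode("utf-8")
    last_char_length = len(text[-1].encode("utf-8"))
    return [encoded[:-cut] for cut in range(1, last_char_length)]


for variant in truncated_variants("漢"):
    print(variant.hex(" "))
```

Each variant is invalid UTF-8 and can be written to a file or passed as raw bytes to the component under test in an AddressSanitizer build.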

Impact

This vulnerability may cause process crashes and limited, non-deterministic information disclosure when OpenCC processes malformed or attacker-controlled UTF-8 input. Nothing in the report indicates an arbitrary write or code execution.

OpenCC is distributed through system and language-specific package managers, prebuilt binaries, container images, and downstream software, so affected versions may be present even when it is not listed as a direct dependency. Users should upgrade all installed or bundled copies of OpenCC to 1.2.0 or later.
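For Python environments, a quick check against the fixed version might look like the sketch below. It assumes the distribution is installed under the name `opencc` (taken from the purl `pkg:pypi/opencc`) and uses a simplified version parser written for this example. The helper names are hypothetical. The parser ignores pre-release suffixes, so a string such as `1.2.0rc1` would be treated as `1.2.0`. Copies bundled through other channels are not detected by this check.

```python
import re
from importlib import metadata

FIXED_VERSION = (1, 2, 0)  # first patched release according to this advisory


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric components, e.g. '1.1.0.post1' -> (1, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    return parts + (0,) * (3 - len(parts))  # pad '0.2' to (0, 2, 0)


def is_affected(version: str) -> bool:
    """True if the version is older than the fixed release."""
    return release_tuple(version) < FIXED_VERSION


def installed_opencc_version():
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version("opencc")  # assumed distribution name
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_opencc_version()
    if found is None:
        print("opencc is not installed in this environment")
    else:
        print(found, "affected" if is_affected(found) else "not affected")
```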

Credit

OpenCC thanks @oneafter for reporting the issue.

Database specific

{
    "github_reviewed": true,
    "github_reviewed_at": "2026-03-29T15:27:41Z",
    "severity": "MODERATE",
    "nvd_published_at": null,
    "cwe_ids": [
        "CWE-125"
    ]
}

References

Affected packages

npm / opencc

Package

Name: opencc
Purl: pkg:npm/opencc

Affected ranges

Type: SEMVER
Events:
Introduced: 0 (unknown introduced version; all previous versions are affected)
Fixed: 1.2.0

Database specific

source: "https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/03/GHSA-7fqq-q52p-2jjg/GHSA-7fqq-q52p-2jjg.json"

PyPI / opencc

Package

Name: opencc
Purl: pkg:pypi/opencc

Affected ranges

Type: ECOSYSTEM
Events:
Introduced: 0 (unknown introduced version; all previous versions are affected)
Fixed: 1.2.0

Affected versions

0.*

0.1

0.2

1.*

1.1.0

1.1.0.post1

1.1.1

1.1.1.post1

1.1.2

1.1.3

1.1.4

1.1.5

1.1.6

1.1.7

1.1.8

1.1.9

Database specific

source: "https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2026/03/GHSA-7fqq-q52p-2jjg/GHSA-7fqq-q52p-2jjg.json"