to_markdown() does not sufficiently escape text content that looks like HTML. As a result, untrusted input that is safe in to_html() can become raw HTML in Markdown output.
This is not specific to tokenizer raw-text states like <title>, <noscript>, or <plaintext>, although those states can trigger the behavior. The root cause is broader: Markdown text serialization leaves angle brackets unescaped in text nodes.
When converting a parsed document to Markdown, text nodes are escaped for a small set of Markdown metacharacters, but HTML-significant characters such as < and > are preserved. That means content parsed as text, including entity-decoded text or text produced by RCDATA/RAWTEXT-style parsing, can be emitted into Markdown as raw HTML.
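The kind of escaping that closes this gap can be sketched as follows. This is a minimal illustration, not justhtml's actual code: the helper name `escape_markdown_text` and the exact set of Markdown metacharacters are assumptions made for the example.

```python
import re

# Hypothetical helper for illustration; not part of justhtml's API.
_MD_META = re.compile(r"([\\`*_\[\]])")  # assumed small set of Markdown metacharacters
_HTML_ENTITIES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def escape_markdown_text(text: str) -> str:
    """Escape a text node for Markdown output, including HTML-significant characters."""
    # Replace HTML-significant characters first so the text cannot open a tag
    # or be read as an entity by a downstream Markdown renderer.
    text = re.sub(r"[&<>]", lambda m: _HTML_ENTITIES[m.group(0)], text)
    # Then backslash-escape the Markdown metacharacters.
    return _MD_META.sub(r"\\\1", text)


print(escape_markdown_text("<img src=x onerror=alert(1)>"))
```

Escaping `&` along with the angle brackets keeps text that already looks like an entity from being decoded a second time when the Markdown is rendered.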
Examples of affected input include:
- <script>...</script> appearing as text rather than as an element
- text inside <title>, <textarea>, <noscript> (when parsed as raw text), and <plaintext>

This is distinct from actual <script> or <style> elements in the DOM. Those are already dropped by default in to_markdown() unless html_passthrough=True.
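How entity-encoded input turns into a text node containing literal angle brackets can be seen with any HTML parser that decodes character references. The sketch below uses Python's standard-library `html.parser` purely as a stand-in; it is not justhtml, and the `TextCollector` class and `text_of` function are hypothetical names for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text content of a document, with character references decoded."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Text arrives here already entity-decoded.
        self.chunks.append(data)


def text_of(source: str) -> str:
    parser = TextCollector()
    parser.feed(source)
    parser.close()
    return "".join(parser.chunks)


# The source contains no <script> element, only encoded text.
print(text_of("<p>&lt;script&gt;alert(1)&lt;/script&gt;</p>"))
```

The collected text is the literal string `<script>alert(1)</script>`. A serializer that writes such text into Markdown without re-escaping it hands a downstream renderer what looks like a real tag.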
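```python
from justhtml import JustHTML

# The payload is entity-encoded so that it parses as a text node, not an <img> element.
doc = JustHTML("<p>&lt;img src=x onerror=alert(1)&gt;</p>", fragment=True)
print(doc.to_html())      # HTML output: the text is escaped again
print()
print(doc.to_markdown())  # Markdown output: the angle brackets are left unescaped
```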
{
"github_reviewed_at": "2026-03-18T20:19:56Z",
"github_reviewed": true,
"cwe_ids": [
"CWE-79"
],
"nvd_published_at": null,
"severity": "MODERATE"
}