Error Codes

Parse errors that JustHTML can detect and report.

Collecting Errors

By default, JustHTML silently recovers from parse errors, as browsers do. To collect them instead, pass collect_errors=True:

from justhtml import JustHTML

doc = JustHTML("<p>Hello", collect_errors=True)
for error in doc.errors:
    print(f"{error.line}:{error.column} - {error.category}:{error.code}")

doc.errors is ordered by source position (line, column), with unknown positions (if any) appearing last.
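
Once errors are collected, a per-code tally is often more useful than the raw list. The following is a minimal sketch, not part of JustHTML: summarize is a hypothetical helper, and the stand-in objects only mimic the category and code fields used in the loop above. The category string "tokenizer" is a placeholder, not a confirmed value.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(errors):
    """Count errors per (category, code) pair."""
    return Counter((e.category, e.code) for e in errors)


# Stand-in objects for illustration; in real use, pass doc.errors instead.
# "tokenizer" is a placeholder category value, not a confirmed one.
sample = [
    SimpleNamespace(line=1, column=4, category="tokenizer", code="duplicate-attribute"),
    SimpleNamespace(line=2, column=9, category="tokenizer", code="duplicate-attribute"),
    SimpleNamespace(line=3, column=1, category="tokenizer", code="eof-in-tag"),
]

# Print the most frequent codes first.
for (category, code), count in summarize(sample).most_common():
    print(f"{count:>3}  {category}:{code}")
```

Because the helper only reads attributes, it should work with any objects that expose category and code.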

Error Categories

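Each error has a category field that identifies which stage reported it. The code tables below are grouped the same way: tokenizer errors, tree builder errors, and security errors.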

Strict Mode

To reject malformed HTML entirely:

from justhtml import JustHTML, StrictModeError

try:
    doc = JustHTML("<p>Hello", strict=True)
except StrictModeError as e:
    print(e)  # Shows source location

In strict mode, JustHTML raises on the earliest error by source position.
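
If you collect errors instead of raising, the same "earliest by source position" choice can be approximated by hand. This is a sketch under assumptions: position_key and earliest are hypothetical helpers, and unknown positions are assumed to be represented as None, which this page does not confirm.

```python
from types import SimpleNamespace


def position_key(error):
    """Sort key: known positions first, ordered by (line, column)."""
    # Assumption: an unknown position shows up as None on line or column.
    unknown = error.line is None or error.column is None
    return (unknown, error.line or 0, error.column or 0)


def earliest(errors):
    """Return the error with the smallest source position, or None if empty."""
    return min(errors, key=position_key, default=None)


# Stand-in objects for illustration; in real use, pass doc.errors instead.
sample = [
    SimpleNamespace(line=3, column=1, code="eof-in-tag"),
    SimpleNamespace(line=None, column=None, code="unsafe-html"),
    SimpleNamespace(line=1, column=7, code="duplicate-attribute"),
]

first = earliest(sample)
print(f"{first.line}:{first.column} - {first.code}")
```

Sorting with the same key (sorted(errors, key=position_key)) places errors without a position last.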

Error Locations (Line/Column)

JustHTML reports a source location for each parse error as a best-effort pointer to where the parser detected the problem in the input stream.

This means error locations are not universally “at the beginning” or “at the end” of a token: character-level errors point at the character, while token-level (tree builder) errors generally point at the triggering token’s start.
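
A line and column pair is easier to act on when shown next to the source text. The sketch below renders a caret under the reported column. format_pointer is a hypothetical helper, and it assumes 1-indexed lines and columns and LF-normalized line counting. Neither is confirmed for error locations on this page.

```python
def format_pointer(source, line, column):
    """Return the source line with a caret under the given column."""
    # Assumption: lines are counted after CRLF and CR are normalized to LF.
    lines = source.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    if not 1 <= line <= len(lines):
        return f"(line {line} is outside the input)"
    text = lines[line - 1]
    # Clamp so a position just past the end of the line still renders.
    caret_at = max(0, min(column - 1, len(text)))
    return f"{text}\n{' ' * caret_at}^"


print(format_pointer("<p>Hello\n<div", 2, 1))
```

Tabs and wide characters in the line will shift the caret, so treat the output as a visual hint rather than an exact marker.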

Node Locations (Optional)

Sometimes you want a source location for a node, not just for parse errors.

For performance reasons, node locations are disabled by default. To enable them, pass track_node_locations=True when parsing:

from justhtml import JustHTML

doc = JustHTML("<p>hi</p>", track_node_locations=True)
p = doc.query("p")[0]

print(p.origin_location)  # (1, 1)
print(p.origin_line)      # 1
print(p.origin_col)       # 1
print(p.origin_offset)    # 0 (0-indexed)
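
The output above suggests how the fields relate: offset 0 corresponds to line 1, column 1. The following sketch derives a (line, column) pair from a 0-indexed offset. offset_to_line_col is a hypothetical helper, and it assumes that columns count characters and that lines are separated by "\n", which may differ from how JustHTML counts internally.

```python
def offset_to_line_col(source, offset):
    """Convert a 0-indexed character offset to a 1-indexed (line, column)."""
    if not 0 <= offset <= len(source):
        raise ValueError(f"offset {offset} is outside the input")
    # Each newline before the offset starts a new line.
    line = source.count("\n", 0, offset) + 1
    # rfind returns -1 when there is no earlier newline, so the first
    # character of the input maps to column 1.
    column = offset - source.rfind("\n", 0, offset)
    return line, column


print(offset_to_line_col("<p>hi</p>", 0))
```

This can be handy for cross-checking node locations against positions reported by other tools that only provide offsets.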

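Each node exposes best-effort origin metadata: origin_location (a (line, column) tuple, both 1-indexed), origin_line, origin_col, and origin_offset (0-indexed).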

Notes:

Example: Reporting missing includes

import sys
from pathlib import Path

from justhtml import JustHTML


with open(sys.argv[1]) as f:
    html = f.read()

doc = JustHTML(html, track_node_locations=True)
for include_node in doc.query("x-include"):
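    # Treat a missing or valueless src attribute as an empty string.
    src = include_node.attrs.get("src") or ""
    # An empty src must count as missing: Path("") resolves to "." and exists.
    if not src or not Path(src).exists():
        line, col = include_node.origin_location or (0, 0)
        print(f"Missing include source: {src} ({sys.argv[1]}:{line}.{col})")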

Tokenizer Errors

Errors detected during tokenization (lexical analysis).

DOCTYPE Errors

Code Description
eof-in-doctype Unexpected end of file in DOCTYPE declaration
eof-in-doctype-name Unexpected end of file while reading DOCTYPE name
eof-in-doctype-public-identifier Unexpected end of file in DOCTYPE public identifier
eof-in-doctype-system-identifier Unexpected end of file in DOCTYPE system identifier
expected-doctype-name-but-got-right-bracket Expected DOCTYPE name but got >
missing-whitespace-before-doctype-name Missing whitespace after <!DOCTYPE
abrupt-doctype-public-identifier DOCTYPE public identifier ended abruptly
abrupt-doctype-system-identifier DOCTYPE system identifier ended abruptly
missing-quote-before-doctype-public-identifier Missing quote before DOCTYPE public identifier
missing-quote-before-doctype-system-identifier Missing quote before DOCTYPE system identifier
missing-doctype-public-identifier Missing DOCTYPE public identifier
missing-doctype-system-identifier Missing DOCTYPE system identifier
missing-whitespace-before-doctype-public-identifier Missing whitespace before DOCTYPE public identifier
missing-whitespace-after-doctype-public-identifier Missing whitespace after DOCTYPE public identifier
missing-whitespace-between-doctype-public-and-system-identifiers Missing whitespace between DOCTYPE identifiers
missing-whitespace-after-doctype-name Missing whitespace after DOCTYPE name
unexpected-character-after-doctype-public-keyword Unexpected character after PUBLIC keyword
unexpected-character-after-doctype-system-keyword Unexpected character after SYSTEM keyword
unexpected-character-after-doctype-public-identifier Unexpected character after public identifier
unexpected-character-after-doctype-system-identifier Unexpected character after system identifier

Comment Errors

Code Description
eof-in-comment Unexpected end of file in comment
abrupt-closing-of-empty-comment Empty comment closed abruptly by > (<!--> or <!--->)
incorrectly-closed-comment Comment ended with --!> instead of -->
incorrectly-opened-comment Incorrectly opened comment (<! not followed by --, DOCTYPE, or [CDATA[)

Tag Errors

Code Description
eof-in-tag Unexpected end of file in tag
eof-before-tag-name Unexpected end of file before tag name
empty-end-tag Empty end tag </> is not allowed
invalid-first-character-of-tag-name Invalid first character of tag name
unexpected-question-mark-instead-of-tag-name Unexpected ? instead of tag name
unexpected-character-after-solidus-in-tag Unexpected character after / in tag

Attribute Errors

Code Description
duplicate-attribute Duplicate attribute name
missing-attribute-value Missing attribute value
unexpected-character-in-attribute-name Unexpected character in attribute name
unexpected-character-in-unquoted-attribute-value Unexpected character in unquoted attribute value
missing-whitespace-between-attributes Missing whitespace between attributes
unexpected-equals-sign-before-attribute-name Unexpected = before attribute name

Script Errors

Code Description
eof-in-script-html-comment-like-text Unexpected end of file in script with HTML-like comment
eof-in-script-in-script Unexpected end of file in nested script tag

CDATA Errors

Code Description
eof-in-cdata Unexpected end of file in CDATA section
cdata-in-html-content CDATA section only allowed in SVG/MathML content

Character Reference Errors

Code Description
control-character-reference Invalid control character in character reference
illegal-codepoint-for-numeric-entity Invalid codepoint in numeric character reference
missing-semicolon-after-character-reference Missing semicolon after character reference
named-entity-without-semicolon Named entity used without semicolon
noncharacter-character-reference Noncharacter in character reference

Other Tokenizer Errors

Code Description
unexpected-null-character Unexpected NULL character (U+0000)
noncharacter-in-input-stream Noncharacter in input stream

Tree Builder Errors

Errors detected during tree construction.

DOCTYPE Errors

Code Description
unexpected-doctype Unexpected DOCTYPE declaration
unknown-doctype Unknown DOCTYPE (expected <!DOCTYPE html>)
expected-doctype-but-got-chars Expected DOCTYPE but got text content
expected-doctype-but-got-eof Expected DOCTYPE but reached end of file
expected-doctype-but-got-start-tag Expected DOCTYPE but got start tag
expected-doctype-but-got-end-tag Expected DOCTYPE but got end tag

Unexpected Tag Errors

Code Description
unexpected-start-tag Unexpected start tag in current context
unexpected-end-tag Unexpected end tag in current context
unexpected-end-tag-before-html Unexpected end tag before <html>
unexpected-end-tag-before-head Unexpected end tag before <head>
unexpected-end-tag-after-head Unexpected end tag after <head>
unexpected-start-tag-ignored Start tag ignored in current context
unexpected-start-tag-implies-end-tag Start tag implicitly closes previous element

EOF Errors

Code Description
expected-closing-tag-but-got-eof Expected closing tag but reached end of file
expected-named-closing-tag-but-got-eof Expected specific closing tag but reached end of file
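
EOF-related codes appear in both the tokenizer and tree builder tables, and together they are a reasonable signal that the input was cut off. As a rough sketch, the naming pattern on this page (codes starting with "eof-" or ending with "-eof") can be used to flag likely truncation. looks_truncated is a hypothetical helper, and the pattern is inferred from the tables here rather than guaranteed by the library.

```python
def looks_truncated(codes):
    """Return True if any error code signals an unexpected end of file."""
    return any(c.startswith("eof-") or c.endswith("-eof") for c in codes)


# In real use, pass the codes of collected errors:
#   looks_truncated(e.code for e in doc.errors)
print(looks_truncated(["duplicate-attribute", "expected-closing-tag-but-got-eof"]))
print(looks_truncated(["duplicate-attribute"]))
```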

Invalid Character Errors

Code Description
invalid-codepoint Invalid character (U+0000 NULL or U+000C FORM FEED)
invalid-codepoint-before-head Invalid character before <head>
invalid-codepoint-in-body Invalid character in <body>
invalid-codepoint-in-table-text Invalid character in table text
invalid-codepoint-in-select Invalid character in <select>
invalid-codepoint-in-foreign-content Invalid character in SVG/MathML content

Table Errors

Code Description
foster-parenting-character Text content in table requires foster parenting
foster-parenting-start-tag Start tag in table requires foster parenting
unexpected-character-implies-table-voodoo Unexpected character in table triggers foster parenting
unexpected-start-tag-implies-table-voodoo Start tag in table triggers foster parenting
unexpected-end-tag-implies-table-voodoo End tag in table triggers foster parenting
unexpected-implied-end-tag-in-table-view Unexpected implied end tag while closing table
eof-in-table Unexpected end of file in table
unexpected-cell-in-table-body Unexpected table cell outside of table row
unexpected-form-in-table Form element not allowed in table context
unexpected-hidden-input-in-table Hidden input in table triggers foster parenting

Frameset Errors

Code Description
unexpected-token-in-frameset Unexpected content in <frameset>
unexpected-token-after-frameset Unexpected content after <frameset>
unexpected-token-after-after-frameset Unexpected content after frameset closed

After-Body Errors

Code Description
unexpected-token-after-body Unexpected content after </body>
unexpected-char-after-body Unexpected character after </body>

Column Group / Template Table Context Errors

Code Description
unexpected-characters-in-column-group Text not allowed in <colgroup>
unexpected-characters-in-template-column-group Text not allowed in template column group
unexpected-start-tag-in-column-group Start tag not allowed in <colgroup>
unexpected-start-tag-in-template-column-group Start tag not allowed in template column group
unexpected-start-tag-in-template-table-context Start tag not allowed in template table context

Fragment Context Errors

Code Description
unexpected-start-tag-in-cell-fragment Start tag not allowed in cell fragment context
unexpected-end-tag-in-fragment-context End tag not allowed in fragment parsing context

Head/Body Context Errors

Code Description
unexpected-hidden-input-after-head Unexpected hidden input after <head>

Foreign Content Errors

Code Description
unexpected-doctype-in-foreign-content Unexpected DOCTYPE in SVG/MathML content
unexpected-html-element-in-foreign-content HTML element breaks out of SVG/MathML content
unexpected-end-tag-in-foreign-content Mismatched end tag in SVG/MathML content

Select Errors

Code Description
unexpected-start-tag-in-select Unexpected start tag in <select>
unexpected-end-tag-in-select Unexpected end tag in <select>
unexpected-select-in-select Unexpected nested <select> in <select>

Miscellaneous Errors

Code Description
end-tag-too-early End tag closed early (unclosed children)
adoption-agency-1.3 Misnested tags require adoption agency algorithm
non-void-html-element-start-tag-with-trailing-solidus Self-closing syntax on non-void element (e.g., <div/>)
image-start-tag Deprecated <image> tag (use <img> instead)

Security Errors

Errors reported by the sanitizer when you opt in via unsafe_handling="collect".

Code Description
unsafe-html Unsafe HTML detected by sanitization policy (see error.message for details)
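
When collected errors feed a lint step or CI check, it can help to tolerate a few benign codes while never ignoring security findings. The sketch below is one possible policy, not a JustHTML feature: partition and ALWAYS_BLOCKING are hypothetical names, and the stand-in objects only mimic the code field.

```python
from types import SimpleNamespace

# Policy choice for this sketch: security findings can never be ignored.
ALWAYS_BLOCKING = frozenset({"unsafe-html"})


def partition(errors, ignored_codes):
    """Split errors into (blocking, ignored) lists by error code."""
    blocking, ignored = [], []
    for error in errors:
        if error.code in ignored_codes and error.code not in ALWAYS_BLOCKING:
            ignored.append(error)
        else:
            blocking.append(error)
    return blocking, ignored


# Stand-in objects for illustration; in real use, pass doc.errors instead.
sample = [
    SimpleNamespace(code="missing-semicolon-after-character-reference"),
    SimpleNamespace(code="unsafe-html"),
    SimpleNamespace(code="eof-in-tag"),
]

blocking, ignored = partition(
    sample, {"missing-semicolon-after-character-reference", "unsafe-html"}
)
print([e.code for e in blocking])
print([e.code for e in ignored])
```

Keeping the ignore list explicit makes it easy to review which parse errors a project has chosen to accept.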