Correctness Testing

JustHTML is the only pure-Python HTML5 parser that passes 100% of the official html5lib test suite. This page explains how we verify and maintain that compliance.

The html5lib Test Suite

The html5lib-tests repository is the gold standard for HTML5 parsing compliance. It’s used by browser vendors to verify their implementations against the WHATWG HTML5 specification.

Our checked-in test inputs contain:

What the Tests Cover

The tests verify correct handling of:

Example Test Case

Here’s what a test case looks like (from tests1.dat):

#data
<b><p></b></i>

#errors
(1:9) Unexpected end tag </i>

#document
| <html>
|   <head>
|   <body>
|     <b>
|     <p>
|       <b>

This tests the adoption agency algorithm: when </b> is encountered inside <p>, the parser doesn’t just close <b>. Instead, it splits the formatting across the block element boundary, so the expected tree has an empty outer <b> followed by a <p> that contains a new <b>.
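The section layout of a test case is simple enough to read with a few lines of Python. The following is a minimal sketch, not the loader that JustHTML's test runner uses: `parse_dat` and `SECTION_NAMES` are hypothetical names, and the sketch assumes that section markers always stand alone on a line and that blank lines around a section are not significant.

```python
# Section markers used by html5lib tree-construction .dat files.
SECTION_NAMES = {
    "data", "errors", "new-errors", "document-fragment",
    "script-on", "script-off", "document",
}


def parse_dat(text):
    """Split .dat text into a list of {section name: section text} dicts."""
    cases = []
    section = None
    for line in text.split("\n"):
        name = line[1:] if line.startswith("#") else None
        if name == "data":
            cases.append({})  # every "#data" marker starts a new test case
        if name in SECTION_NAMES and cases:
            section = name
            cases[-1][section] = []
        elif cases:
            cases[-1][section].append(line)
    # Join each section and drop the blank lines that separate sections.
    return [
        {key: "\n".join(lines).strip("\n") for key, lines in case.items()}
        for case in cases
    ]


sample = """#data
<b><p></b></i>
#errors
(1:9) Unexpected end tag </i>
#document
| <html>
|   <head>
|   <body>
|     <b>
|     <p>
|       <b>
"""

case = parse_dat(sample)[0]
print(case["data"])      # the markup handed to the parser
print(case["document"])  # the expected tree dump
```

A real loader would also need to handle input lines that happen to look like section markers, which this sketch does not attempt.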

Compliance Comparison

We run the same test suite against other Python parsers to compare compliance. The cross-parser snapshot below used the 1,743 cases available when it was recorded; the current JustHTML gate covers 1,791 enabled cases.

| Parser | Tests Passed | Compliance | Notes |
|---|---|---|---|
| JustHTML | 1743/1743 | 100% | Full spec compliance in this comparison snapshot; current gate: 1791/1791 |
| selectolax | 1743/1743 | 100% | C-based (Lexbor), fast and spec-compliant with dev html5test output API |
| markupever | 1545/1743 | 89% | Rust-based (html5ever), mostly correct |
| html5lib | 1496/1743 | 86% | Reference implementation, but incomplete |
| html5_parser | 862/1743 | 49% | C-based (Gumbo), fast but loses exposed tree information |
| BeautifulSoup | 6/1743 | <1% | Uses html.parser, not HTML5 compliant |
| html.parser | 6/1743 | <1% | Python stdlib, basic error recovery only |
| lxml | 5/1743 | <1% | XML-based, not HTML5 compliant |

Run python benchmarks/correctness.py to reproduce these results. The selectolax score requires its dev html5test output and fragment-context APIs. These scores were refreshed against html5lib-tests e446320.

These numbers come from a strict tree comparison against the expected output in the html5lib-tests tree-construction fixtures (excluding #script-on / #script-off cases). Unsupported parser capabilities count as failures for this compliance table. The numbers will not match the html5lib project’s own reported totals, because html5lib runs the suite in multiple configurations and also has its own skip/xfail lists.
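The scoring rule described here (exact tree match, with unsupported capabilities counted as failures) can be illustrated with a short sketch. This is not the code in benchmarks/correctness.py: `score`, `compliance_label`, and `toy_parser` are hypothetical names, the test cases are assumed to be dicts with `data` and `document` keys, and the rounding rule is an assumption chosen so that it reproduces the percentages in the table.

```python
def score(parse_to_dump, cases):
    """Return (passed, total); a case passes only on an exact tree-dump match."""
    passed = 0
    for case in cases:
        try:
            actual = parse_to_dump(case["data"])
        except Exception:
            # Crashes and unsupported capabilities count as failures.
            continue
        if actual.rstrip("\n") == case["document"].rstrip("\n"):
            passed += 1
    return passed, len(cases)


def compliance_label(passed, total):
    """Format a pass rate the way the comparison table does (assumed rule)."""
    percent = 100 * passed / total
    return "<1%" if 0 < percent < 1 else f"{round(percent)}%"


def toy_parser(html):
    """Stand-in parser: always returns the same empty-document dump."""
    if html == "<unsupported>":
        raise NotImplementedError("capability not supported by this parser")
    return "| <html>\n|   <head>\n|   <body>\n"


cases = [
    {"data": "", "document": "| <html>\n|   <head>\n|   <body>"},
    {"data": "<p>", "document": "| <html>\n|   <head>\n|   <body>\n|     <p>"},
    {"data": "<unsupported>", "document": "| <html>\n|   <head>\n|   <body>"},
]

passed, total = score(toy_parser, cases)
print(f"{passed}/{total}", compliance_label(passed, total))
print(compliance_label(1545, 1743))  # the markupever row of the table
```

Because the comparison is on the whole tree dump, a single misplaced node or attribute fails the case; there is no partial credit.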

Our Testing Strategy

1. Official and project test suite

We run the complete html5lib test suite on every commit:

python run_tests.py

To run only a single suite (useful for faster iteration), use --suite:

python run_tests.py --suite tree
python run_tests.py --suite justhtml
python run_tests.py --suite serializer
python run_tests.py --suite encoding
python run_tests.py --suite unit

Output:

PASSED: 3464/3464 passed (100.0%)

There are also 6 expected skips, including scripted (#script-on) cases that require JavaScript execution during parsing.

Per-file results are also written to test-summary.txt, with suite prefixes like html5lib-tests-tree/..., html5lib-tests-serializer/..., html5lib-tests-encoding/..., and justhtml-tests/....

The encoding coverage comes from both:

2. Coverage and parser differential checks

The test suite enforces 100% combined line and branch coverage, including the parser engine:

coverage run run_tests.py && coverage report --fail-under=100

The parser engine is additionally checked behaviorally:

PYTHONPATH=src python benchmarks/html5lib_engine_diff.py \
  --fail-under-rate 1.0 \
  --fail-on-current-exceptions

This requires exact agreement with the reference parser path across every scored html5lib tree-construction case.
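As a rough illustration of what such a differential gate checks, the sketch below compares two parser callables over the same inputs. It is not the implementation of benchmarks/html5lib_engine_diff.py: `differential_check` is a hypothetical function whose parameter names merely echo the two command-line flags, and both parsers are assumed to be plain callables that return comparable values.

```python
def differential_check(current, reference, inputs,
                       fail_under_rate=1.0, fail_on_current_exceptions=True):
    """Return (ok, agreement_rate, exception_count) for two parser callables."""
    agreed = 0
    exceptions = 0
    for html in inputs:
        expected = reference(html)
        try:
            actual = current(html)
        except Exception:
            exceptions += 1  # an exception can never count as agreement
            continue
        if actual == expected:
            agreed += 1
    rate = agreed / len(inputs) if inputs else 1.0
    ok = rate >= fail_under_rate and not (fail_on_current_exceptions and exceptions)
    return ok, rate, exceptions


def reference_parser(html):
    """Stand-in for the reference path: upper-cases its input."""
    return html.upper()


def current_parser(html):
    """Stand-in for the engine under test, with a deliberate bug on 'x'."""
    return html if "x" in html else html.upper()


print(differential_check(reference_parser, reference_parser, ["a", "b", "x"]))
print(differential_check(current_parser, reference_parser, ["a", "b", "x"]))
```

With a required rate of 1.0, a single disagreement is enough to fail the gate, which is the behavior the command above asks for.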

3. Fuzz Testing (millions of cases)

We generate random malformed HTML to find crashes and hangs:

python benchmarks/fuzz.py -n 3000000

Output:

============================================================
FUZZING RESULTS: justhtml
============================================================
Total tests:    3000000
Successes:      3000000
Crashes:        0
Hangs (>5s):    0
Total time:     928s
Tests/second:   3232
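The general shape of a fuzzing loop that produces a report like this can be sketched in a few lines. The code below is an illustration only, not benchmarks/fuzz.py: `FRAGMENTS`, `random_html`, `fuzz`, and `fragile_parse` are hypothetical names, the fragment list is arbitrary, and hang detection is left out because it would need a per-case timeout (for example, running each case in a subprocess).

```python
import random
import time

# Arbitrary building blocks that tend to produce malformed markup.
FRAGMENTS = [
    "<", ">", "</", "<!--", "-->", "<!doctype", "<table>", "<b>", "</p>",
    "<script>", "<svg>", "<template>", "&amp", "&#x", "\x00", '"', "=", " ", "a",
]


def random_html(rng, max_parts=20):
    """Concatenate a random number of fragments into one input string."""
    return "".join(rng.choice(FRAGMENTS) for _ in range(rng.randint(1, max_parts)))


def fuzz(parse, n, seed=0):
    """Feed n random inputs to parse() and record every input that raises."""
    rng = random.Random(seed)  # seeded so that failures are reproducible
    failing_inputs = []
    start = time.perf_counter()
    for _ in range(n):
        doc = random_html(rng)
        try:
            parse(doc)
        except Exception as exc:
            failing_inputs.append((doc, repr(exc)))
    return {
        "total": n,
        "successes": n - len(failing_inputs),
        "crashes": len(failing_inputs),
        "seconds": time.perf_counter() - start,
        "failing_inputs": failing_inputs,
    }


def fragile_parse(doc):
    """Stand-in parser with a deliberate bug: it rejects NUL characters."""
    if "\x00" in doc:
        raise ValueError("unexpected NUL character")


report = fuzz(fragile_parse, 1000, seed=42)
print(f"Total tests: {report['total']}")
print(f"Successes:   {report['successes']}")
print(f"Crashes:     {report['crashes']}")
```

Passing the real parser's entry point as `parse` would exercise it instead of the stand-in; keeping the seed fixed makes any crashing input easy to regenerate.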

The fuzzer generates truly nasty edge cases:

4. Custom Edge Case Tests

We maintain additional tests in tests/justhtml-tests/ for:

Running the Tests

Quick Start

# Clone the test suite (one-time setup)
cd ..
git clone https://github.com/html5lib/html5lib-tests.git
cd justhtml

# Create symlinks
cd tests
ln -s ../../html5lib-tests/tree-construction html5lib-tests-tree
ln -s ../../html5lib-tests/serializer html5lib-tests-serializer
ln -s ../../html5lib-tests/encoding html5lib-tests-encoding
cd ..

# Run all tests
python run_tests.py

Test Runner Options

# Verbose output with diffs
python run_tests.py -v

# Run specific test cases from one file (file:indices)
python run_tests.py --test-specs test2.test:5,10

# Stop on first failure
python run_tests.py -x

# Check for regressions against baseline
python run_tests.py --regressions

Correctness Benchmark

Compare against other parsers:

python benchmarks/correctness.py

Why 100% Matters

HTML5 parsing is notoriously complex. The spec describes intricate parsing behavior with:

Getting 99% compliance means you’re still breaking on real-world edge cases. Browsers pass 100% because they have to, and now JustHTML does too.

Error Diagnostics

The html5lib suite verifies tree output, not a standardized diagnostic stream. JustHTML therefore reports a small set of high-value errors instead of duplicating the parser to reproduce every detailed recovery diagnostic:

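from justhtml import JustHTML

# An unterminated comment: parsing still succeeds, and the error is recorded
doc = JustHTML("<!doctype html><!--", collect_errors=True)
for error in doc.errors:
    print(f"{error.line}:{error.column} {error.code}")
# Output: 1:19 eof-in-comment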

Error collection is optional and adds work. Strict mode raises on the earliest supported diagnostic, but is not a complete HTML conformance validator.

See Error Codes for the supported set and stability contract.