
Linkify

JustHTML’s Linkify transform scans text nodes and wraps detected URLs/emails in <a> elements.

This is a DOM transform: it operates on the parsed tree rather than on raw HTML strings, so it only touches text nodes and never rewrites existing tags or breaks markup.
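To illustrate why working on text nodes is safer than running a regex over the raw string, here is a minimal standard-library sketch. It is not JustHTML's implementation: the class name `TextOnlyLinkifier`, the helper `linkify_text_nodes`, and the deliberately naive URL pattern are all assumptions made for this example, and it uses a streaming parser rather than a real DOM.

```python
import re
from html import escape
from html.parser import HTMLParser

# Deliberately naive pattern, for illustration only (real detection is stricter).
URL_RE = re.compile(r"https?://[^\s<>\"']+")


class TextOnlyLinkifier(HTMLParser):
    """Hypothetical sketch: rewrite URLs in text, copy tags through unchanged."""

    def __init__(self, skip_tags=frozenset({"a", "pre", "code"})):
        super().__init__(convert_charrefs=True)
        self.skip_tags = skip_tags
        self.skip_depth = 0  # > 0 while inside an element we must not touch
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.skip_depth += 1
        # Emit the tag exactly as written; attribute values are never scanned.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.skip_depth:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth:
            self.out.append(escape(data, quote=False))
            return
        pos = 0
        for match in URL_RE.finditer(data):
            url = match.group(0)
            self.out.append(escape(data[pos:match.start()], quote=False))
            self.out.append(
                f'<a href="{escape(url)}">{escape(url, quote=False)}</a>'
            )
            pos = match.end()
        self.out.append(escape(data[pos:], quote=False))


def linkify_text_nodes(html):
    parser = TextOnlyLinkifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(linkify_text_nodes('<p title="http://x.test">See http://example.com now</p>'))
```

The URL inside the `title` attribute is left alone because only text content is ever scanned; a regex applied to the raw string would have corrupted that attribute.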

Quickstart

from justhtml import JustHTML, Linkify

doc = JustHTML("<p>See example.com</p>", fragment=True, transforms=[Linkify()])
print(doc.to_html(pretty=False))
# => <p>See <a href="http://example.com">example.com</a></p>

Behavior

Unicode and punycode (IDNA)

Linkify can detect domains containing Unicode characters.

When it generates a link, it normalizes the hostname portion of href using IDNA (punycode). This keeps the visible link text readable while ensuring the href is ASCII-only.

Example:

from justhtml import JustHTML, Linkify

doc = JustHTML("<p>See bücher.de</p>", fragment=True, transforms=[Linkify()])
print(doc.to_html(pretty=False))
# => <p>See <a href="http://xn--bcher-kva.de">bücher.de</a></p>
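The hostname normalization can be approximated with Python's built-in `idna` codec. This is only a sketch of the idea, not JustHTML's code: the helper name `ascii_href` is an assumption, the standard-library codec implements the older IDNA 2003 rules, and the sketch ignores any userinfo part of the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def ascii_href(url):
    """Hypothetical helper: punycode-encode only the hostname of a URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # The stdlib "idna" codec converts each Unicode label to its xn-- form.
    ascii_host = host.encode("idna").decode("ascii")
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(ascii_href("http://bücher.de/katalog?q=1"))
# => http://xn--bcher-kva.de/katalog?q=1
```

Only the host is converted; the path and query are passed through as they were.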

Notes:

Configuration

from justhtml import JustHTML, Linkify

doc = JustHTML(
    "<p>See 127.0.0.1 and example.dev</p>",
    transforms=[
        Linkify(
            fuzzy_ip=True,
            extra_tlds={"dev"},
            skip_tags={"a", "pre", "textarea", "code", "script", "style"},
        )
    ],
)

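Options used in this example: fuzzy_ip enables detection of bare IP addresses such as 127.0.0.1, extra_tlds adds TLDs to the allowlist used for fuzzy matching, and skip_tags names the elements whose text content is left unlinked.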

Fuzzy domains and TLD allowlist

For protocol-less “fuzzy” detection (like example.com or test@example.com), Linkify uses a TLD allowlist to reduce false positives.

This allowlist is not used for links that already include an explicit scheme like http://... (those are accepted regardless of TLD). Similarly, mailto: links are accepted even when the domain doesn’t have a recognized TLD.
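The decision described above can be sketched roughly as follows. The function `should_link` and the three-entry `EXAMPLE_TLDS` set are hypothetical stand-ins chosen for this example; they are not JustHTML's actual defaults or internals.

```python
import re

# Hypothetical allowlist for illustration only; NOT JustHTML's default list.
EXAMPLE_TLDS = frozenset({"com", "org", "net"})

# Candidates with an explicit scheme (or mailto:) bypass the allowlist.
EXPLICIT_RE = re.compile(r"^(?:[a-z][a-z0-9+.-]*://|mailto:)", re.IGNORECASE)


def should_link(candidate, allowed_tlds=EXAMPLE_TLDS):
    if EXPLICIT_RE.match(candidate):
        return True
    # Fuzzy case: isolate the host from "user@host", "host/path" or "host:port".
    host = candidate.rsplit("@", 1)[-1]
    host = host.split("/", 1)[0].split(":", 1)[0]
    if "." not in host:
        return False
    tld = host.rsplit(".", 1)[-1].lower()
    return tld in allowed_tlds


for text in ["example.com", "test@example.com", "example.dev",
             "http://example.dev", "mailto:me@localhost"]:
    print(text, should_link(text))
```

With this illustrative allowlist, `example.dev` is rejected as a fuzzy match but accepted once it carries an explicit `http://` scheme.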

Default accepted TLDs

By default, Linkify accepts:

Adding extra TLDs

If you want fuzzy matching for newer gTLDs (like .dev, .app, .email, …), pass them via extra_tlds:

from justhtml import JustHTML, Linkify

doc = JustHTML(
    "<p>See example.dev and mail me@company.app</p>",
    transforms=[Linkify(extra_tlds={"dev", "app"})],
)

extra_tlds values are compared case-insensitively and should be provided without a leading dot.
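If your TLD list comes from user configuration, you may want to tidy it before passing it along. The helper below is a hypothetical caller-side convenience (the name `normalize_tlds` is an assumption, not part of JustHTML); it strips a leading dot and lowercases each entry so the values match the documented form.

```python
def normalize_tlds(raw_tlds):
    """Hypothetical helper: return TLDs lowercased and without a leading dot."""
    cleaned = set()
    for tld in raw_tlds:
        tld = tld.strip().lstrip(".").lower()
        if tld:  # drop entries that were empty or only dots
            cleaned.add(tld)
    return cleaned


print(sorted(normalize_tlds([".Dev", "APP", " email ", "."])))
# => ['app', 'dev', 'email']
```

The resulting set can then be passed as `extra_tlds`.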

Composing with other transforms

To add attributes to generated links, compose with SetAttrs:

from justhtml import JustHTML, Linkify, SetAttrs

doc = JustHTML(
    "<p>See example.com</p>",
    transforms=[
        Linkify(),
        SetAttrs("a", rel="nofollow", target="_blank"),
    ],
)

Interaction with sanitization

Transforms mutate the in-memory DOM. JustHTML(..., sanitize=True) appends a final Sanitize(...) step unless you include one yourself.
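The "append a final sanitizer unless one is already present" rule can be pictured with a small sketch. `FakeLinkify`, `FakeSanitize`, and `effective_pipeline` are stand-ins invented for this example; they only model the ordering described above and say nothing about how JustHTML implements it.

```python
class FakeLinkify:
    """Stand-in for a transform; not the real justhtml class."""


class FakeSanitize:
    """Stand-in for the sanitizing transform; not the real justhtml class."""


def effective_pipeline(transforms, sanitize=True):
    """Return the transforms that would run, in order (illustrative only)."""
    pipeline = list(transforms)
    has_sanitizer = any(isinstance(t, FakeSanitize) for t in pipeline)
    # A sanitizer is appended last only when requested and not supplied.
    if sanitize and not has_sanitizer:
        pipeline.append(FakeSanitize())
    return pipeline


names = [type(t).__name__ for t in effective_pipeline([FakeLinkify()])]
print(names)
# => ['FakeLinkify', 'FakeSanitize']
```

Because the sanitizer runs after Linkify in this model, it sees the generated `<a>` elements like any other content.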

This matters for Linkify because the final sanitizer runs after links have been generated, so the sanitization policy can remove or rewrite attributes on the generated <a> elements.

If you want Linkify output without any sanitization changes (trusted input only), use sanitize=False and avoid adding Sanitize(...) in transforms.

Provenance

JustHTML’s Linkify behavior is validated against the upstream linkify-it fixture suite (MIT licensed).