Module

parsing.inline.links

Link and image parsing for the Patitas parser.

Handles inline links, reference links, images, and footnote references.

CommonMark 0.31.2 compliance:

  • Link destinations can be angle-bracket delimited or raw
  • Angle-bracket destinations: no newlines, can have spaces
  • Raw destinations: no spaces, no control chars, balanced parens
  • Backslash escapes work in destinations and titles

Classes

LinkParsingMixin

Mixin for link and image parsing.

Required Host Attributes:

  • _link_refs: dict[str, tuple[str, str]]

Required Host Methods:

  • _parse_inline(text, location) -> tuple[Inline, ...]

Attributes

Name Type Description
_link_refs dict[str, tuple[str, str]]

Methods

Internal Methods
_try_parse_footnote_ref
def _try_parse_footnote_ref(self, text: str, pos: int, location: SourceLocation) -> tuple[FootnoteRef, int] | None

Try to parse a footnote reference at position.

Format: [^identifier]

Returns (FootnoteRef, new_position), or None if not a footnote ref.

Parameters
Name Type Description
text str
pos int
location SourceLocation
Returns
tuple[FootnoteRef, int] | None
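
As a rough illustration of the [^identifier] format, the sketch below matches a footnote reference at a given position and returns the identifier with the new position. It is not the library's implementation: `sketch_footnote_ref` is a hypothetical name, it returns a plain string instead of a FootnoteRef node, and the rule that identifiers contain no whitespace or `]` is an assumption.

```python
from __future__ import annotations

import re

# Assumption: an identifier is one or more characters that are
# neither whitespace nor ']'.
_FOOTNOTE_RE = re.compile(r"\[\^([^\]\s]+)\]")


def sketch_footnote_ref(text: str, pos: int) -> tuple[str, int] | None:
    """Return (identifier, new_position), or None if no footnote ref starts at pos."""
    match = _FOOTNOTE_RE.match(text, pos)
    if match is None:
        return None
    return match.group(1), match.end()
```

Returning the position just past the closing `]` mirrors the (node, new_position) convention described above, so a caller can resume scanning from there.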
_try_parse_link
def _try_parse_link(self, text: str, pos: int, location: SourceLocation) -> tuple[Link, int] | None

Try to parse a link at position.

Handles:

  • [text](url) - inline link
  • [text][ref] - full reference link
  • [text][] - collapsed reference link
  • [ref] - shortcut reference link

Returns (Link, new_position) or None if not a link.

Parameters
Name Type Description
text str
pos int
location SourceLocation
Returns
tuple[Link, int] | None
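
The three reference forms differ only in which label is looked up. The sketch below shows one way that lookup could work against a mapping shaped like `_link_refs`. It assumes the mapping is keyed by normalized label and stores (url, title) pairs; neither detail is confirmed by this page, and `sketch_resolve_reference` is a hypothetical helper.

```python
from __future__ import annotations


def sketch_resolve_reference(
    link_text: str,
    ref_label: str | None,
    link_refs: dict[str, tuple[str, str]],
) -> tuple[str, str] | None:
    """Look up a reference link definition.

    ref_label is the text of the second bracket pair:
      - non-empty string: full reference,      [text][ref]
      - empty string:     collapsed reference, [text][]
      - None:             shortcut reference,  [ref]
    """
    # Collapsed and shortcut forms reuse the link text as the label.
    label = ref_label if ref_label else link_text
    # Assumption: keys are case-folded with whitespace runs collapsed.
    key = " ".join(label.split()).casefold()
    return link_refs.get(key)
```

A miss returns None, which corresponds to the "not a link" outcome described above.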
_try_parse_image
def _try_parse_image(self, text: str, pos: int, location: SourceLocation) -> tuple[Image, int] | None

Try to parse an image at position.

Handles:

  • ![alt](url) - inline image
  • ![alt][ref] - full reference image
  • ![alt][] - collapsed reference image
  • ![alt] - shortcut reference image

Returns (Image, new_position) or None if not an image.

Parameters
Name Type Description
text str
pos int
location SourceLocation
Returns
tuple[Image, int] | None

Functions

_process_escapes
def _process_escapes(text: str) -> str

Process backslash escapes in link URLs and titles.

CommonMark: Backslash escapes work in link destinations and titles. A backslash followed by ASCII punctuation is replaced with the literal char.

Parameters
Name Type Description
text str

Raw text that may contain backslash escapes

Returns
str
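
The escape rule can be sketched in a few lines. The function below is an illustration under a hypothetical name (`sketch_process_escapes`), not the module's code. It relies on `string.punctuation`, which lists the 32 ASCII punctuation characters.

```python
import string


def sketch_process_escapes(text: str) -> str:
    """Replace backslash + ASCII punctuation with the punctuation character."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        # Only a backslash followed by ASCII punctuation counts as an escape.
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in string.punctuation:
            out.append(text[i + 1])
            i += 2
        else:
            # Any other backslash (e.g. before a letter) stays literal.
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, `foo\(bar\)` becomes `foo(bar)`, while `a\b` is left unchanged because `b` is not punctuation.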
_unescape_label
def _unescape_label(label: str) -> str

Unescape label-specific escapes (backslash, [, ]).

CommonMark: Backslash escapes are allowed in labels but only matter for backslash and bracket characters. Other escapes remain literal so that labels like [foo\!] do not match [foo!] (spec example 545).

Parameters
Name Type Description
label str
Returns
str
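
A minimal sketch of this narrower rule, assuming a single regular-expression substitution is enough (`sketch_unescape_label` is a hypothetical name and may differ from the real function in edge cases):

```python
import re

# Matches a backslash followed by a backslash, '[' or ']'.
_LABEL_ESCAPE_RE = re.compile(r"\\([\\\[\]])")


def sketch_unescape_label(label: str) -> str:
    """Drop the backslash before \\, [ and ]; leave every other escape alone."""
    return _LABEL_ESCAPE_RE.sub(r"\1", label)
```

Because `\!` is left intact, the label `foo\!` stays distinct from `foo!`, which is the behavior the description above calls for.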
_normalize_label
def _normalize_label(label: str) -> str

Normalize a link reference label for matching.

CommonMark 4.7: "Label matching is case-insensitive and Unicode case fold equivalent. Spaces, tabs, and line endings are normalized to single space."

Parameters
Name Type Description
label str

Raw label text

Returns
str
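
The normalization described here maps onto two standard string operations. The sketch below uses a hypothetical name and makes one simplifying assumption: `str.split()` collapses all Unicode whitespace, not only spaces, tabs, and line endings.

```python
def sketch_normalize_label(label: str) -> str:
    """Collapse whitespace runs to single spaces, trim, and case-fold."""
    # split() with no arguments drops leading/trailing whitespace and
    # treats any run of whitespace as one separator.
    collapsed = " ".join(label.split())
    # casefold() performs Unicode case folding, which is stronger than lower().
    return collapsed.casefold()
```

Case folding rather than lowercasing is what makes labels such as "Straße" and "STRASSE" compare equal.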
_parse_link_destination
def _parse_link_destination(text: str, pos: int) -> tuple[str, int] | None

Parse a link destination starting at pos.

CommonMark 6.5: Link destination is either:

  1. Angle-bracket delimited: <url> (can contain spaces, but no newlines or unescaped < or >)
  2. Raw URL: sequence of non-space chars with balanced parens

Parameters
Name Type Description
text str

The full text being parsed

pos int

Position after the opening (

Returns
tuple[str, int] | None
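
The two destination forms can be sketched as a small scanner. This is an illustration under stated assumptions, not the module's code: `sketch_parse_destination` is a hypothetical name, the returned destination still contains its raw backslash escapes, and an empty raw destination is rejected.

```python
from __future__ import annotations

import string


def sketch_parse_destination(text: str, pos: int) -> tuple[str, int] | None:
    """Return (raw_destination, new_position), or None if none is found."""
    n = len(text)

    def is_escape(i: int) -> bool:
        # A backslash only escapes when followed by ASCII punctuation.
        return text[i] == "\\" and i + 1 < n and text[i + 1] in string.punctuation

    if pos < n and text[pos] == "<":
        # Form 1: <...> may contain spaces, but no newline or bare '<'.
        i = pos + 1
        while i < n:
            if is_escape(i):
                i += 2
                continue
            if text[i] in "\n<":
                return None
            if text[i] == ">":
                return text[pos + 1:i], i + 1
            i += 1
        return None

    # Form 2: raw run of non-space, non-control chars with balanced parens.
    depth = 0
    i = pos
    while i < n:
        if is_escape(i):
            i += 2
            continue
        ch = text[i]
        if ch.isspace() or ord(ch) < 0x20 or ch == "\x7f":
            break
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                break  # this ')' belongs to the enclosing link syntax
            depth -= 1
        i += 1
    if depth != 0 or i == pos:
        return None
    return text[pos:i], i
```

Stopping at an unmatched `)` is what lets a destination like `foo(bar)` sit inside the outer parentheses of an inline link.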
_parse_link_title
def _parse_link_title(text: str, pos: int) -> tuple[str | None, int]

Parse an optional link title starting at pos.

CommonMark: Title is enclosed in ", ', or (). It can span lines, but the opening and closing delimiters must match.

Parameters
Name Type Description
text str

The full text being parsed

pos int

Position to start looking for title

Returns
tuple[str | None, int]
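
The delimiter-matching rule can be sketched as follows. The function name is hypothetical, the sketch assumes `pos` already points at the opening delimiter (any whitespace skipping is left to the caller), and the returned title keeps its raw backslash escapes.

```python
from __future__ import annotations

# Each opening delimiter and the closer it must be paired with.
_CLOSERS = {'"': '"', "'": "'", "(": ")"}


def sketch_parse_title(text: str, pos: int) -> tuple[str | None, int]:
    """Return (raw_title, new_position), or (None, pos) if there is no title."""
    if pos >= len(text) or text[pos] not in _CLOSERS:
        return None, pos
    opener = text[pos]
    closer = _CLOSERS[opener]
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            i += 2  # skip the escaped character, e.g. \" inside "..."
            continue
        if ch == closer:
            return text[pos + 1:i], i + 1
        if opener == "(" and ch == "(":
            return None, pos  # unescaped '(' is not allowed in a (...) title
        i += 1
    return None, pos  # never closed: treat as "no title"
```

Returning the unchanged `pos` on failure matches the non-optional return type shown above: the caller always gets a position back and checks whether the title is None.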
_parse_inline_link
def _parse_inline_link(text: str, pos: int) -> tuple[str, str | None, int] | None

Parse an inline link destination and optional title.

Format: (url) or (url "title") or (<url> 'title')

Parameters
Name Type Description
text str

Full text being parsed

pos int

Position at the opening (

Returns
tuple[str, str | None, int] | None
_skip_html_tag
def _skip_html_tag(text: str, pos: int) -> int

Skip over an HTML tag starting at pos.

Handles open tags, close tags, and self-closing tags. Properly handles quoted attribute values that may contain special chars.

Parameters
Name Type Description
text str

Full text to search

pos int

Position at the opening <

Returns
int
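
The point about quoted attribute values can be illustrated with a small scanner that tracks whether it is inside quotes. This is a sketch under a hypothetical name. The return convention (index just past the closing `>`, or the original `pos` when the tag never closes) is an assumption, since this page does not state what the integer means.

```python
def sketch_skip_html_tag(text: str, pos: int) -> int:
    """Return the index just past the tag's closing '>', or pos if it never closes."""
    i = pos + 1  # step over the opening '<'
    quote = None  # the quote character currently open, if any
    while i < len(text):
        ch = text[i]
        if quote is not None:
            # Inside a quoted value: only the matching quote ends it.
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == ">":
            return i + 1
        i += 1
    return pos
```

With `<a title="x>y">z`, the `>` inside the quoted value is ignored and the scan ends after the second `>`.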
_find_closing_bracket
def _find_closing_bracket(text: str, start: int) -> int

Find closing bracket ] while respecting code spans, HTML tags, and nested brackets.

CommonMark: Code spans have higher precedence than link text brackets. A code span inside link text means the ] inside the code span doesn't count. HTML tags protect their contents - ] inside HTML attribute values doesn't count. Nested brackets [ ] are allowed inside link text.

Parameters
Name Type Description
text str

Full text to search

start int

Position to start searching (should be after opening [)

Returns
int
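
A simplified sketch of the bracket search is shown below. It covers backslash escapes, code spans, and nested brackets, but omits HTML tag handling for brevity. The name `sketch_find_closing_bracket` and the -1 "not found" return value are assumptions, not details taken from the module.

```python
def sketch_find_closing_bracket(text: str, start: int) -> int:
    """Return the index of the matching ']', or -1 if there is none."""
    n = len(text)
    depth = 0
    i = start
    while i < n:
        ch = text[i]
        if ch == "\\" and i + 1 < n:
            i += 2  # escaped character never opens or closes anything
        elif ch == "`":
            # Measure the opening backtick run.
            run_end = i
            while run_end < n and text[run_end] == "`":
                run_end += 1
            run_len = run_end - i
            # Look for a closing run of exactly the same length.
            j, close_end = run_end, -1
            while j < n:
                if text[j] == "`":
                    k = j
                    while k < n and text[k] == "`":
                        k += 1
                    if k - j == run_len:
                        close_end = k
                        break
                    j = k
                else:
                    j += 1
            # Skip the whole code span, or just the run if it never closes.
            i = close_end if close_end != -1 else run_end
        elif ch == "[":
            depth += 1
            i += 1
        elif ch == "]":
            if depth == 0:
                return i
            depth -= 1
            i += 1
        else:
            i += 1
    return -1
```

Skipping a complete code span in one step is what gives code spans precedence: a `]` inside the span is never examined.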
_extract_plain_text
def _extract_plain_text(text: str) -> str

Extract plain text from inline content for image alt text.

CommonMark: Image alt text is the plain text content with formatting stripped. E.g., "foo *bar*" becomes "foo bar".

Parameters
Name Type Description
text str

Raw inline content that may contain formatting

Returns
str
_contains_link
def _contains_link(children: tuple) -> bool

Check if children contain a Link node at any nesting level.

CommonMark: Links may not contain other links, at any level of nesting. If parsing link text produces a link, the outer link is invalid.

Parameters
Name Type Description
children tuple

Tuple of inline nodes

Returns
bool
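
The nesting check amounts to a recursive walk over the node tree. The sketch below uses stand-in node classes (`Text`, `Emphasis`, `Link`), which are hypothetical and only mimic the shape of inline nodes. It also assumes that container nodes expose a `children` tuple.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the parser's inline node types.
@dataclass(frozen=True)
class Text:
    content: str


@dataclass(frozen=True)
class Emphasis:
    children: tuple = ()


@dataclass(frozen=True)
class Link:
    url: str
    children: tuple = ()


def sketch_contains_link(children: tuple) -> bool:
    """Return True if any node, at any depth, is a Link."""
    for node in children:
        if isinstance(node, Link):
            return True
        # Leaf nodes have no 'children' attribute; default to an empty tuple.
        if sketch_contains_link(getattr(node, "children", ())):
            return True
    return False
```

A caller that has just parsed candidate link text could run this check on the result and reject the outer link when it returns True.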