Module

parsing.blocks.core

Core block parsing for the Patitas parser.

Provides block dispatch and basic block parsing (headings, code, quotes, paragraphs).

Classes

BlockParsingCoreMixin 16

Core block parsing methods.

Required Host Attributes:

  • _source: str
  • _tokens: list[Token]
  • _pos: int
  • _current: Token | None
  • _tables_enabled: bool

Required Host Methods:

  • _at_end() -> bool
  • _advance() -> Token | None
  • _parse_inline(text, location) -> tuple[Inline, ...]
  • _parse_list(parent_indent) -> List
  • _parse_directive() -> Directive
  • _parse_footnote_def() -> FootnoteDef
  • _try_parse_table(lines, location) -> Table | None
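The host contract above can be written down as a structural type. This is a minimal sketch, not code from the module: `BlockParserHost` and `MinimalCursor` are hypothetical names, `Token` and the node types are replaced by `Any` placeholders, and the cursor semantics (that `_advance()` returns the new current token) are an assumption.

```python
from __future__ import annotations

from typing import Any, Protocol

Token = Any  # placeholder; the real Token type is defined elsewhere in Patitas


class BlockParserHost(Protocol):
    """Hypothetical structural type listing what the mixin expects of its host."""

    _source: str
    _tokens: list[Token]
    _pos: int
    _current: Token | None
    _tables_enabled: bool

    # Parameter types below are placeholders; the reference page does not state them.
    def _at_end(self) -> bool: ...
    def _advance(self) -> Token | None: ...
    def _parse_inline(self, text: Any, location: Any) -> tuple[Any, ...]: ...
    def _parse_list(self, parent_indent: Any) -> Any: ...
    def _parse_directive(self) -> Any: ...
    def _parse_footnote_def(self) -> Any: ...
    def _try_parse_table(self, lines: Any, location: Any) -> Any | None: ...


class MinimalCursor:
    """Toy host covering only the token cursor; the semantics are assumed."""

    def __init__(self, tokens: list[Token]) -> None:
        self._tokens = tokens
        self._pos = 0
        self._current = tokens[0] if tokens else None

    def _at_end(self) -> bool:
        return self._current is None

    def _advance(self) -> Token | None:
        # Assumption: move one token forward and return the new current token.
        self._pos += 1
        in_range = self._pos < len(self._tokens)
        self._current = self._tokens[self._pos] if in_range else None
        return self._current
```

A real host would also supply the parsing methods; the toy class only shows how `_pos`, `_current`, `_at_end()`, and `_advance()` might fit together.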

Attributes

Name Type Description
_source str
_tokens list[Token]
_pos int
_current Token | None
_tables_enabled bool

Methods

Internal Methods 11
_parse_block 0 Block | None
Parse a single block element.
def _parse_block(self) -> Block | None
Returns
Block | None
_parse_atx_heading 0 Heading
def _parse_atx_heading(self) -> Heading

Parse ATX heading (# Heading).

Supports MyST-compatible explicit anchor syntax: ## Title {#custom-id}

Returns
Heading
_parse_fenced_code 1 FencedCode
Parse fenced code block with zero-copy coordinates.
def _parse_fenced_code(self, override_fence_indent: int | None = None) -> FencedCode
Parameters
Name Type Description
override_fence_indent int | None

If provided, use this value instead of the token's indent. Used for fenced code blocks in list items.

Default: None
Returns
FencedCode
_parse_orphaned_fence_content 0 Paragraph
def _parse_orphaned_fence_content(self) -> Paragraph

Parse orphaned FENCED_CODE_CONTENT as paragraph.

This happens when a fenced code block is interrupted (e.g., when the enclosing block quote ends on a line without >), leaving its content tokens orphaned. The orphaned content is treated as paragraph text.

Returns
Paragraph
_parse_orphaned_fence_end 0 FencedCode
def _parse_orphaned_fence_end(self) -> FencedCode

Parse orphaned FENCED_CODE_END as new unclosed fenced code block.

This happens when a fenced code block is interrupted and its closing fence is left orphaned. In CommonMark, the orphaned fence opens a new, unclosed fenced code block.

Returns
FencedCode
_parse_thematic_break 0 ThematicBreak
Parse thematic break (---, ***, ___).
def _parse_thematic_break(self) -> ThematicBreak
Returns
ThematicBreak
_parse_html_block 0 HtmlBlock
Parse HTML block (raw HTML content passed through unchanged).
def _parse_html_block(self) -> HtmlBlock
Returns
HtmlBlock
_parse_block_quote 0 BlockQuote
def _parse_block_quote(self) -> BlockQuote

Parse block quote (> quoted).

CommonMark 5.1: Block quotes can contain any block-level content, including headings, code blocks, lists, and nested block quotes.

Algorithm:

  1. Consume the first BLOCK_QUOTE_MARKER
  2. Collect content, preserving nested > markers as content
  3. Handle lazy continuation (lines without > that continue paragraphs)
  4. Sub-parse the content for nested blocks
Returns
BlockQuote
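Steps 1 to 3 of the algorithm above can be illustrated on plain lines rather than tokens. This is a simplified sketch under stated assumptions, not the mixin's implementation: `collect_block_quote` is a hypothetical name, the real parser works on a token stream, and the paragraph tracking here treats any non-blank quoted line as paragraph text.

```python
import re

# One block quote marker: up to 3 spaces of indent, '>', one optional space.
_MARKER = re.compile(r"^ {0,3}> ?")

# Rough subset of constructs that may interrupt a paragraph (assumption:
# ATX headings, code fences, and thematic breaks only).
_INTERRUPTS = re.compile(
    r"^ {0,3}(?:#{1,6}(?:\s|$)|```|~~~|(?P<c>[-*_])(?: *(?P=c)){2,} *$)"
)


def collect_block_quote(lines: list[str], start: int = 0) -> tuple[list[str], int]:
    """Return (inner_lines, next_index) for the quote starting at lines[start]."""
    inner: list[str] = []
    in_paragraph = False
    i = start
    while i < len(lines):
        line = lines[i]
        marker = _MARKER.match(line)
        if marker:
            # Strip exactly one marker; nested '>' markers stay in the content.
            content = line[marker.end():]
            inner.append(content)
            in_paragraph = bool(content.strip())
        elif in_paragraph and line.strip() and not _INTERRUPTS.match(line):
            # Lazy continuation: an unmarked line that continues a paragraph.
            inner.append(line)
        else:
            break
        i += 1
    return inner, i
```

For step 4, the collected `inner` lines would be handed back to the block parser, which is how nested quotes, headings, and lists inside the quote get parsed.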
_parse_indented_code 0 IndentedCode
Parse indented code block.
def _parse_indented_code(self) -> IndentedCode
Returns
IndentedCode
_parse_paragraph 0 Paragraph | Table | Head…
def _parse_paragraph(self) -> Paragraph | Table | Heading

Parse paragraph (consecutive text lines), table, or setext heading.

If the second line is a setext underline (=== or ---), returns Heading. If tables are enabled and lines form a valid GFM table, returns Table. Otherwise returns Paragraph.

CommonMark: An ordered list can interrupt a paragraph only if it starts with 1.

Returns
Paragraph | Table | Heading
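The ordered-list rule mentioned above can be expressed as a small predicate. This is a hedged sketch, not the parser's code: `ordered_list_can_interrupt` is a hypothetical name, and the regular expression only approximates CommonMark's list marker grammar (it requires spaces after the marker and a non-empty item).

```python
import re

# Up to 3 spaces of indent, 1-9 digits, '.' or ')', spaces, then some content.
_ORDERED_ITEM = re.compile(r"^ {0,3}(\d{1,9})[.)] +\S")


def ordered_list_can_interrupt(line: str) -> bool:
    """True if `line` looks like an ordered list item that starts at 1."""
    match = _ORDERED_ITEM.match(line)
    return match is not None and int(match.group(1)) == 1
```

Under this rule, a paragraph line followed by `2. item` stays a single paragraph, while `1. item` ends the paragraph and starts a list.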
_is_setext_underline 1 bool
def _is_setext_underline(self, line: str) -> bool

Check if line is a setext heading underline.

The line must consist of at least one = or - character, optionally followed by trailing spaces. CommonMark allows up to 3 leading spaces.

Parameters
Name Type Description
line str
Returns
bool
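The check described above fits in one regular expression. A minimal sketch, assuming the line is passed without its trailing newline; `is_setext_underline` here is a standalone illustration, not the method's actual body.

```python
import re

# Up to 3 leading spaces, a run of only '=' or only '-', optional trailing
# spaces or tabs. Mixed runs such as '=-' do not qualify.
_SETEXT_UNDERLINE = re.compile(r" {0,3}(?:=+|-+)[ \t]*")


def is_setext_underline(line: str) -> bool:
    """True if the whole line is a setext heading underline."""
    return _SETEXT_UNDERLINE.fullmatch(line) is not None
```

Using `fullmatch` keeps the pattern free of anchors and rejects lines with anything after the trailing whitespace.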

Functions

_process_escapes 1 str
def _process_escapes(text: str) -> str

Process backslash escapes in info strings.

CommonMark: Backslash escapes work in code fence info strings.

Parameters
Name Type Description
text str
Returns
str
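In CommonMark, a backslash escapes only ASCII punctuation; before any other character it stays a literal backslash. The sketch below shows one way to apply that rule to an info string. It is an illustration under that assumption, with `process_escapes` as a hypothetical stand-in; the module's own function may differ in details.

```python
import re
import string

# string.punctuation holds the 32 ASCII punctuation characters, which is
# the set CommonMark allows after an escaping backslash.
_ESCAPE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")


def process_escapes(text: str) -> str:
    """Drop the backslash from each backslash-escaped punctuation character."""
    return _ESCAPE.sub(r"\1", text)
```

For example, an info string written as `py\`thon` would yield `` py`thon ``, while `a\b` is left unchanged because `b` is not punctuation.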
_extract_explicit_id 1 tuple[str, str | None]
def _extract_explicit_id(content: str) -> tuple[str, str | None]

Extract MyST-compatible explicit anchor ID from heading content.

Syntax: ## Title {#custom-id}

The {#id} must be at the end of the content, preceded by whitespace. The ID must start with a letter and contain only letters, numbers, hyphens, and underscores.

Parameters
Name Type Description
content str

Heading content (already stripped)

Returns
tuple[str, str | None]
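The rules above map naturally onto a regular expression. This is a sketch under stated assumptions rather than the function's real body: `extract_explicit_id` is a hypothetical stand-in, "letters" is read as ASCII letters, and the returned tuple is assumed to be (content without the anchor, ID or None).

```python
from __future__ import annotations

import re

# Whitespace, then {#id} at the very end; the ID starts with a letter and
# continues with letters, digits, hyphens, or underscores.
_EXPLICIT_ID = re.compile(r"\s+\{#([A-Za-z][A-Za-z0-9_-]*)\}\Z")


def extract_explicit_id(content: str) -> tuple[str, str | None]:
    """Split 'Title {#custom-id}' into ('Title', 'custom-id')."""
    match = _EXPLICIT_ID.search(content)
    if match is None:
        return content, None
    return content[: match.start()], match.group(1)
```

Content without a valid trailing anchor comes back unchanged with `None` as the ID, so a caller can fall back to an automatically generated slug.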