Module

`delegate`

LexerDelegate implementation using rosettes.

Enables Zero-Copy Lexer Handoff (ZCLH) by bridging Patitas coordinate handoff to Rosettes state-machine lexers.

Design Philosophy:

Zero-Copy Lexer Handoff (ZCLH) is a performance pattern where:

Coordinate Handoff: The markdown parser (Patitas) identifies fenced code blocks and records (start, end) positions
Zero Copy: Instead of extracting substrings, we pass the entire source string with start/end indices
Delegate Pattern: RosettesDelegate bridges the parser to the syntax highlighter without tight coupling

This eliminates string allocation for code content:

Traditional: code_block = source[start:end] → allocates a new string
ZCLH: tokenize(source, start=start, end=end) → no allocation
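The contrast above can be sketched with the standard library alone. This is a minimal illustration, not the Rosettes lexer: `scan_range` and the `_TOKEN` pattern are hypothetical names. It relies on the `pos` and `endpos` arguments of a compiled pattern's `match()` to scan inside the full source string, and it reports tokens as coordinates rather than substrings.

```python
import re
from typing import Iterator, Tuple

# Hypothetical toy token pattern for illustration; not the Rosettes grammar.
_TOKEN = re.compile(r"(?P<word>\w+)|(?P<space>\s+)|(?P<punct>[^\w\s])")


def scan_range(source: str, start: int, end: int) -> Iterator[Tuple[str, int, int]]:
    """Yield (kind, token_start, token_end) for the range [start, end) without slicing."""
    pos = start
    while pos < end:
        # pos/endpos confine matching to the range; no substring of source is built.
        match = _TOKEN.match(source, pos, end)
        if match is None:
            break
        yield match.lastgroup, match.start(), match.end()
        pos = match.end()


source = "# Header\n```python\ndef foo(): pass\n```"
spans = list(scan_range(source, 19, 34))
```

Each yielded tuple indexes into the original `source`, so a consumer can defer creating any substring until it actually needs the text.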

Performance Impact:

For a 10KB markdown file with 50 code blocks:

Traditional extraction: ~5ms (50 allocations, GC pressure)
ZCLH: ~3ms (0 allocations for content)

Thread-Safety:

RosettesDelegate is stateless: all methods use only their arguments. It is safe for concurrent use from multiple threads on free-threaded Python (3.14t).
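The following sketch shows why a stateless delegate can be shared across threads. `ToyDelegate` is a hypothetical stand-in with the same `tokenize_range` signature as RosettesDelegate; it only splits on whitespace and returns coordinate pairs, so it says nothing about how Rosettes tokenizes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, Optional, Tuple


class ToyDelegate:
    """Hypothetical stand-in for RosettesDelegate: it has no instance attributes."""

    def tokenize_range(
        self, source: str, start: int, end: int, language: str
    ) -> Iterator[Tuple[int, int]]:
        # All working state (pos, word_start) is local to this call.
        word_start: Optional[int] = None
        for pos in range(start, end):
            if source[pos].isspace():
                if word_start is not None:
                    yield word_start, pos
                    word_start = None
            elif word_start is None:
                word_start = pos
        if word_start is not None:
            yield word_start, end


source = "alpha beta\ngamma delta epsilon\nzeta"
blocks = [(0, 10), (11, 30), (31, 35)]
delegate = ToyDelegate()  # one shared instance


def run(block: Tuple[int, int]):
    start, end = block
    return list(delegate.tokenize_range(source, start, end, "text"))


sequential = [run(block) for block in blocks]
with ThreadPoolExecutor(max_workers=3) as pool:
    concurrent = list(pool.map(run, blocks))
```

Because nothing is written to `self` or to module-level state, the threaded results match the sequential ones without any locking.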

Integration:

This delegate is used by Patitas (the markdown parser) and Bengal (the static site generator) to highlight fenced code blocks.

Example:

>>> delegate = RosettesDelegate()
>>> source = "# Header\n```python\ndef foo(): pass\n```"
>>> # Parser identifies code block at positions 19-34
>>> if delegate.supports_language("python"):
...     tokens = list(delegate.tokenize_range(source, 19, 34, "python"))
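The offsets 19 and 34 used in the example can be checked with plain string searches. This is only a sketch of how such coordinates might be derived for this one input; it is not how Patitas locates fences.

```python
source = "# Header\n```python\ndef foo(): pass\n```"

# Find the opening fence, then the first character after its line.
fence_open = source.find("```")
start = source.find("\n", fence_open) + 1

# The code content ends at the newline that precedes the closing fence.
end = source.find("\n```", start)

# Slicing here is only for verification; ZCLH itself passes (source, start, end).
content = source[start:end]
```

Here `start` is inclusive and `end` is exclusive, matching the `tokenize_range` parameter descriptions.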

See Also:

rosettes.get_lexer: Lexer lookup used internally
rosettes.lexers._state_machine: How start/end are handled

Classes

RosettesDelegate

LexerDelegate implementation using rosettes.

Thread-safe: All state is local to method calls. Designed for Python 3.14t free-threading.

This class bridges markdown parsers (like Patitas) to Rosettes syntax highlighting, enabling zero-copy lexer handoff.

Methods

tokenize_range

def tokenize_range(self, source: str, start: int, end: int, language: str) -> Iterator[Token]

Tokenize a range of source code using rosettes state-machine lexer.

This is the core ZCLH method: the source string is passed by reference, and only the (start, end) range is tokenized. No substring allocation.

Performance: O(end - start) guaranteed. Zero allocations for code content. The lexer reads characters directly from source between index start (inclusive) and index end (exclusive), without slicing.

Parameters

Name	Type	Description
`source`	`str`	The complete source string (not just the code block).
`start`	`int`	Starting index (inclusive) of the code block in source.
`end`	`int`	Ending index (exclusive) of the code block.
`language`	`str`	Language name or alias (e.g., 'python', 'js').

Returns

Iterator[Token]

supports_language

def supports_language(self, language: str) -> bool

Check if rosettes supports the given language.

Use this before calling tokenize_range() to handle unsupported languages gracefully (e.g., fall back to plain text).

Parameters

Name	Type	Description
`language`	`str`	Language name or alias to check.

Returns

bool: True if the language is supported, False otherwise.
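The check-then-tokenize pattern described for supports_language might look like the sketch below. The `LexerDelegate` protocol is inferred from the two signatures documented on this page, and `highlight_or_plain`, `FakeDelegate`, and the tuple token shapes are hypothetical names introduced only for illustration.

```python
from typing import Any, Iterator, List, Protocol


class LexerDelegate(Protocol):
    """Assumed shape, based on the two method signatures documented above."""

    def supports_language(self, language: str) -> bool: ...

    def tokenize_range(
        self, source: str, start: int, end: int, language: str
    ) -> Iterator[Any]: ...


def highlight_or_plain(
    delegate: LexerDelegate, source: str, start: int, end: int, language: str
) -> List[Any]:
    """Tokenize the range if the language is supported, else emit one plain-text token."""
    if delegate.supports_language(language):
        return list(delegate.tokenize_range(source, start, end, language))
    # Fallback path: the slice is created only when highlighting is unavailable.
    return [("text", source[start:end])]


class FakeDelegate:
    """Hypothetical stub used only to exercise the helper; not RosettesDelegate."""

    def supports_language(self, language: str) -> bool:
        return language in {"python", "py"}

    def tokenize_range(self, source: str, start: int, end: int, language: str):
        yield ("code", start, end)


source = "# Header\n```python\ndef foo(): pass\n```"
highlighted = highlight_or_plain(FakeDelegate(), source, 19, 34, "python")
plain = highlight_or_plain(FakeDelegate(), source, 19, 34, "klingon")
```

Typing the helper against a protocol rather than a concrete class keeps the parser side decoupled from the highlighter, which is the role the delegate pattern plays in this module.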