Classes
HighlightItem
Item for highlight_many() with optional line highlighting.
Use for blocks that need hl_lines or show_linenos. Simple (code, lang) tuples remain supported for backward compatibility.
Attributes
| Name | Type | Description |
|---|---|---|
| code | `str` | — |
| language | `str` | — |
| hl_lines | `frozenset[int] \| None` | — |
| show_linenos | `bool` | — |
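
The mix of plain (code, lang) tuples and richer items can be pictured with a small stand-in. This is a sketch only: `ItemSketch`, `ItemInput`, and `normalize` are hypothetical names, and the field defaults are assumptions, since the reference above lists the four attributes without defaults or descriptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Union


@dataclass(frozen=True)
class ItemSketch:
    """Hypothetical stand-in mirroring the four documented attributes."""
    code: str
    language: str
    hl_lines: Optional[frozenset[int]] = None  # assumed default
    show_linenos: bool = False                 # assumed default


# Either a bare (code, language) tuple or a full item.
ItemInput = Union[tuple[str, str], ItemSketch]


def normalize(items: Iterable[ItemInput]) -> list[ItemSketch]:
    """Convert plain (code, language) tuples into ItemSketch instances."""
    out = []
    for item in items:
        if isinstance(item, ItemSketch):
            out.append(item)
        else:
            code, language = item
            out.append(ItemSketch(code, language))
    return out


batch = normalize([
    ("print('hi')", "python"),
    ItemSketch("let x = 1;", "js", hl_lines=frozenset({1}), show_linenos=True),
])
```

A frozen dataclass keeps each item hashable and immutable, which is convenient when items are shared across threads or used as part of a cache key.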
Functions
content_hash
def content_hash(code: str, language: str, hl_lines: frozenset[int] | set[int] | list[int] | None = None, show_linenos: bool = False) -> str
Compute deterministic hash for (code, language, hl_lines, show_linenos) for cache keys.
Use for block-level caching: the same inputs always yield the same hash. Inputs are not normalized, so whitespace changes produce different hashes, which is the correct behavior for cache invalidation.
Parameters
| Name | Type | Description |
|---|---|---|
| code | `str` | Source code string. |
| language | `str` | Language identifier. |
| hl_lines | `frozenset[int] \| set[int] \| list[int] \| None` | Optional set of 1-based line numbers to highlight. Default: `None` |
| show_linenos | `bool` | Whether line numbers are shown. Default: `False` |
Returns
str
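To make the cache-key behavior concrete, here is a minimal sketch of one way such a deterministic hash could be computed. It is not the library's algorithm: `content_hash_sketch` is a hypothetical name, the choice of SHA-256 over a JSON payload is an assumption, and so is the decision to sort `hl_lines` so that a set and a list with the same numbers give the same key.

```python
import hashlib
import json
from typing import Iterable, Optional


def content_hash_sketch(
    code: str,
    language: str,
    hl_lines: Optional[Iterable[int]] = None,
    show_linenos: bool = False,
) -> str:
    """Illustrative cache key; NOT the library's actual algorithm."""
    # Sorting is an assumption of this sketch: it makes set, frozenset, and
    # list inputs with the same line numbers produce the same key.
    lines = None if hl_lines is None else sorted(set(hl_lines))
    # json.dumps gives an unambiguous serialization of the four fields.
    payload = json.dumps([code, language, lines, show_linenos])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = content_hash_sketch("x = 1", "python", {2, 1})
key_b = content_hash_sketch("x = 1", "python", [1, 2])
key_c = content_hash_sketch("x  = 1", "python", [1, 2])  # one extra space
```

Here `key_a` and `key_b` match, while `key_c` differs because the code string is hashed exactly as given, with no whitespace normalization.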
highlight
def highlight(code: str, language: str, formatter: str | Formatter = 'html', *, hl_lines: set[int] | frozenset[int] | None = None, show_linenos: bool = False, css_class: str | None = None, css_class_style: Literal['semantic', 'pygments', 'semantic-hybrid'] = 'semantic', start: int = 0, end: int | None = None) -> str
Highlight source code and return formatted output.
This is the primary high-level API for syntax highlighting. Thread-safe and suitable for concurrent use.
All lexers are hand-written state machines with guaranteed O(n) performance and no ReDoS vulnerability.
Parameters
| Name | Type | Description |
|---|---|---|
| code | `str` | The source code to highlight. |
| language | `str` | Language name or alias (e.g., 'python', 'py', 'js'). |
| formatter | `str \| Formatter` | Formatter name ('html', 'terminal', 'null') or instance. Default: `'html'` |
| hl_lines | `set[int] \| frozenset[int] \| None` | Optional set of 1-based line numbers to highlight. Default: `None` |
| show_linenos | `bool` | If True, include line numbers in output. Default: `False` |
| css_class | `str \| None` | Base CSS class for the code container (HTML only). Defaults to "rosettes" for semantic style, "highlight" for pygments. Default: `None` |
| css_class_style | `Literal['semantic', 'pygments', 'semantic-hybrid']` | Class naming style (HTML only). "semantic" (default) uses readable classes like .syntax-function; "semantic-hybrid" uses role plus token-type classes (e.g. .syntax-function .syntax-name-builtin); "pygments" uses Pygments-compatible classes like .nf. Default: `'semantic'` |
| start | `int` | Starting index in the source string. Default: `0` |
| end | `int \| None` | Optional ending index in the source string. Default: `None` |
Returns
str
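The 1-based convention for `hl_lines` and the effect of `show_linenos` can be illustrated with a toy formatter. This is a sketch under stated assumptions, not the library's HTML output: `render_lines_sketch` and the `hll` and `lineno` class names are hypothetical, and no syntax highlighting is performed.

```python
from html import escape
from typing import Optional


def render_lines_sketch(
    code: str,
    hl_lines: Optional[set[int]] = None,
    show_linenos: bool = False,
) -> str:
    """Toy formatter: escapes each line and marks the highlighted ones."""
    marked = hl_lines or set()
    out = []
    # enumerate(..., start=1) matches the documented 1-based numbering.
    for number, line in enumerate(code.splitlines(), start=1):
        text = escape(line)
        if show_linenos:
            text = f'<span class="lineno">{number}</span> {text}'
        if number in marked:
            # "hll" is a hypothetical class name chosen for this sketch.
            text = f'<span class="hll">{text}</span>'
        out.append(text)
    return "\n".join(out)


rendered = render_lines_sketch("a = 1\nb = a < 2", hl_lines={2}, show_linenos=True)
```

With `hl_lines={2}`, only the second source line is wrapped in the highlight span; passing `{0}` would mark nothing, because numbering starts at 1.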
tokenize
def tokenize(code: str, language: str, start: int = 0, end: int | None = None) -> list[Token]
Tokenize source code without formatting.
Useful for analysis, custom formatting, or testing. Thread-safe.
All lexers are hand-written state machines with guaranteed O(n) performance and no ReDoS vulnerability.
Parameters
| Name | Type | Description |
|---|---|---|
| code | `str` | The source code to tokenize. |
| language | `str` | Language name or alias. |
| start | `int` | Starting index in the source string. Default: `0` |
| end | `int \| None` | Optional ending index in the source string. Default: `None` |
Returns
list[Token]
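As a rough illustration of why a hand-written scanner runs in linear time and how `start` and `end` bound the scanned range, consider the toy lexer below. It is a sketch only: `Tok`, `tokenize_sketch`, and the four token kinds are hypothetical and far simpler than a real language lexer, and the library's `Token` type is not shown in this reference.

```python
from typing import NamedTuple, Optional


class Tok(NamedTuple):
    kind: str   # hypothetical kinds: "name", "number", "space", "other"
    value: str


def tokenize_sketch(code: str, start: int = 0, end: Optional[int] = None) -> list[Tok]:
    """Single forward pass over code[start:end] with no backtracking."""
    stop = len(code) if end is None else min(end, len(code))
    tokens = []
    pos = start
    while pos < stop:
        ch = code[pos]
        # Pick a state from the first character of the run.
        if ch.isalpha() or ch == "_":
            kind, accept = "name", lambda c: c.isalnum() or c == "_"
        elif ch.isdigit():
            kind, accept = "number", str.isdigit
        elif ch.isspace():
            kind, accept = "space", str.isspace
        else:
            tokens.append(Tok("other", ch))
            pos += 1
            continue
        # Consume the run; pos only ever moves forward, so work is O(n).
        begin = pos
        while pos < stop and accept(code[pos]):
            pos += 1
        tokens.append(Tok(kind, code[begin:pos]))
    return tokens


tokens = tokenize_sketch("x1 = 42")
tail = tokenize_sketch("x1 = 42", start=5)
head = tokenize_sketch("x1 = 42", end=2)
```

Because the position never moves backward and no regular expressions are involved, there is no input that can trigger catastrophic backtracking in a scanner of this shape.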
highlight_many
def highlight_many(items: Iterable[HighlightItemInput], *, formatter: str | Formatter = 'html', max_workers: int | None = None, css_class_style: Literal['semantic', 'pygments', 'semantic-hybrid'] = 'semantic') -> list[str]
Highlight multiple code blocks in parallel.
This is the recommended way to highlight many code blocks concurrently. On Python 3.14t (free-threaded), this provides true parallelism. On GIL Python, it still provides benefits via I/O overlapping.
Thread-safe by design: each lexer uses only local variables.
Parameters
| Name | Type | Description |
|---|---|---|
| items | `Iterable[HighlightItemInput]` | Iterable of (code, language) tuples or HighlightItem instances. HighlightItem supports hl_lines and show_linenos. |
| formatter | `str \| Formatter` | Formatter name or instance. Default: `'html'` |
| max_workers | `int \| None` | Maximum number of threads. Defaults to min(4, CPU count), which benchmarking shows to be optimal. Default: `None` |
| css_class_style | `Literal['semantic', 'pygments', 'semantic-hybrid']` | Class naming style (HTML only). Default: `'semantic'` |
Returns
list[str]
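The thread-pool pattern described above can be sketched with the standard library. This is an assumption-laden illustration, not the library's implementation: `fake_highlight` and `highlight_many_sketch` are hypothetical stand-ins, and whether the real function returns results in input order is not stated in this reference.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, Optional


def fake_highlight(code: str, language: str) -> str:
    """Stand-in for a real highlighter; touches only local state."""
    return f'<pre class="{language}">{code}</pre>'


def highlight_many_sketch(
    items: Iterable[tuple[str, str]],
    max_workers: Optional[int] = None,
) -> list[str]:
    if max_workers is None:
        # Mirrors the documented default of min(4, CPU count).
        max_workers = min(4, os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, whatever order
        # the worker threads happen to finish in.
        return list(pool.map(lambda item: fake_highlight(*item), items))


results = highlight_many_sketch([("a", "python"), ("b", "js"), ("c", "go")])
```

Since each call works only on its own arguments and local variables, no locks are needed, which is the property the thread-safety note above relies on.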
tokenize_many
def tokenize_many(items: Iterable[tuple[str, str]], *, max_workers: int | None = None) -> list[list[Token]]
Tokenize multiple code blocks in parallel.
Similar to highlight_many() but returns raw tokens instead of HTML. Useful for analysis, custom formatting, or when you need token data.
Thread-safe by design: each lexer uses only local variables.
Parameters
| Name | Type | Description |
|---|---|---|
| items | `Iterable[tuple[str, str]]` | Iterable of (code, language) tuples. |
| max_workers | `int \| None` | Maximum number of threads. Defaults to min(4, CPU count). Default: `None` |
Returns
list[list[Token]]