Basic Usage - Rosettes

Rosettes exposes two primary functions for syntax highlighting: `highlight()` and `tokenize()`.

`highlight()`

Generate syntax-highlighted output. HTML is the default; pass `formatter="terminal"` for ANSI terminal output.

from rosettes import highlight

html = highlight("def hello(): pass", "python")

# Use terminal output
ansi = highlight("def hello(): pass", "python", formatter="terminal")

Parameters

Parameter	Type	Default	Description
`code`	`str`	required	Source code to highlight
`language`	`str`	required	Language name or alias
`formatter`	`str \| Formatter`	`"html"`	Formatter name or instance
`hl_lines`	`set[int]`	`None`	1-based line numbers to highlight
`show_linenos`	`bool`	`False`	Include line numbers
`css_class`	`str`	`None`	Container CSS class (HTML only)
`css_class_style`	`str`	`"semantic"`	`"semantic"` or `"pygments"` (HTML only)
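
Because `hl_lines` takes a set of 1-based line numbers, a small helper can build that set from a compact spec such as `"1,3-5"`. The sketch below is only an illustration: `parse_line_spec` is a hypothetical helper, not part of Rosettes, and the spec syntax is an assumption.

```python
def parse_line_spec(spec: str) -> set[int]:
    """Parse a spec such as "1,3-5" into a set of 1-based line numbers.

    Hypothetical helper for illustration; not part of Rosettes.
    """
    lines: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty segments such as a trailing comma
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start  # "4" means the single line 4
        if start < 1 or end < start:
            raise ValueError(f"invalid line range: {part!r}")
        lines.update(range(start, end + 1))
    return lines


hl = parse_line_spec("1,3-5")
```

The resulting set could then be passed along as `highlight(code, "python", hl_lines=hl)`.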

Language Aliases

Languages accept multiple aliases:

# These are equivalent
highlight(code, "python")
highlight(code, "py")
highlight(code, "python3")

# JavaScript aliases
highlight(code, "javascript")
highlight(code, "js")
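
To make the idea of aliases concrete, here is a minimal sketch of alias resolution. It is not Rosettes's actual implementation: the `ALIASES` table and `resolve_language` are hypothetical, only the aliases listed above are included, and the case-insensitive matching is an assumption.

```python
# Hypothetical alias table for illustration only; not Rosettes's internal data.
ALIASES = {
    "python": "python",
    "py": "python",
    "python3": "python",
    "javascript": "javascript",
    "js": "javascript",
}


def resolve_language(name: str) -> str:
    """Return the canonical language name, or raise LookupError if unknown."""
    key = name.strip().lower()  # assumption: lookups ignore case and padding
    try:
        return ALIASES[key]
    except KeyError:
        raise LookupError(f"unsupported language: {name!r}") from None
```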

CSS Class Styles

Semantic (default) — readable class names:

html = highlight(code, "python")  # css_class_style="semantic"
# <span class="syntax-keyword">def</span>
# <span class="syntax-function">hello</span>

Pygments — compatible with Pygments themes:

html = highlight(code, "python", css_class_style="pygments")
# <span class="k">def</span>
# <span class="nf">hello</span>

Container Class

The output is wrapped in a container `<div>`:

# Default: "rosettes" for semantic, "highlight" for pygments
html = highlight(code, "python")
# <div class="rosettes" data-language="python">...

html = highlight(code, "python", css_class_style="pygments")
# <div class="highlight" data-language="python">...

# Custom class
html = highlight(code, "python", css_class="my-code")
# <div class="my-code" data-language="python">...

`tokenize()`

Get raw tokens without formatting. Useful for custom output formats or analysis.

from rosettes import tokenize

tokens = tokenize("x = 42", "python")
for token in tokens:
    print(f"{token.type.name}: {token.value!r}")

Output:

NAME: 'x'
WHITESPACE: ' '
OPERATOR: '='
WHITESPACE: ' '
NUMBER_INTEGER: '42'

Token Structure

Each token is a `NamedTuple` with:

Attribute	Type	Description
`type`	`TokenType`	Semantic token type
`value`	`str`	The actual text
`line`	`int`	1-based line number
`column`	`int`	1-based column number

token = tokens[0]
print(token.type)    # TokenType.NAME
print(token.value)   # 'x'
print(token.line)    # 1
print(token.column)  # 1
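
The `line` attribute makes it straightforward to regroup a token stream by source line. The sketch below uses stand-in `Token` and `TokenType` definitions that mirror the documented attributes so it runs without Rosettes installed. The stand-ins, the hand-built token list, and `group_by_line` are assumptions for illustration; real code would call `tokenize()` instead.

```python
from collections import defaultdict
from enum import Enum, auto
from typing import NamedTuple


class TokenType(Enum):  # stand-in with only the members used here
    NAME = auto()
    WHITESPACE = auto()
    OPERATOR = auto()
    NUMBER_INTEGER = auto()


class Token(NamedTuple):  # stand-in mirroring the documented attributes
    type: TokenType
    value: str
    line: int
    column: int


def group_by_line(tokens):
    """Map each 1-based line number to the tokens that start on it."""
    lines = defaultdict(list)
    for token in tokens:
        lines[token.line].append(token)
    return dict(lines)


# Hand-built tokens for "x = 42"; real code would use tokenize("x = 42", "python").
tokens = [
    Token(TokenType.NAME, "x", 1, 1),
    Token(TokenType.WHITESPACE, " ", 1, 2),
    Token(TokenType.OPERATOR, "=", 1, 3),
    Token(TokenType.WHITESPACE, " ", 1, 4),
    Token(TokenType.NUMBER_INTEGER, "42", 1, 5),
]
by_line = group_by_line(tokens)
text = "".join(t.value for t in by_line[1])  # joins back to the source line
```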

Use Cases

Custom formatters: Build terminal, LaTeX, or other output formats
Analysis: Count tokens, find patterns, compute metrics
Testing: Verify lexer behavior
Transformations: Modify code based on token structure
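
As a rough illustration of the custom-formatter use case, the sketch below renders a token stream as HTML spans. It relies only on the documented `type.name` and `value` attributes. The `tok-` class prefix, `render_spans`, and the `fake_token` stand-in are hypothetical and are not the class names or API that Rosettes itself uses.

```python
import html
from types import SimpleNamespace


def render_spans(tokens) -> str:
    """Wrap each non-whitespace token in a <span> named after its type."""
    parts = []
    for token in tokens:
        text = html.escape(token.value)  # escape <, >, & and quotes
        if token.type.name == "WHITESPACE":
            parts.append(text)  # leave whitespace unwrapped
        else:
            css = "tok-" + token.type.name.lower().replace("_", "-")
            parts.append(f'<span class="{css}">{text}</span>')
    return "".join(parts)


def fake_token(type_name: str, value: str):
    """Stand-in exposing .type.name and .value so the sketch runs without Rosettes."""
    return SimpleNamespace(type=SimpleNamespace(name=type_name), value=value)


demo = [
    fake_token("NAME", "x"),
    fake_token("WHITESPACE", " "),
    fake_token("OPERATOR", "="),
    fake_token("WHITESPACE", " "),
    fake_token("NUMBER_INTEGER", "42"),
]
rendered = render_spans(demo)
```

With real tokens, the same function could be called as `render_spans(tokenize(code, "python"))`.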

Error Handling

Both functions raise `LookupError` for unsupported languages:

from rosettes import highlight, supports_language

# Check before highlighting
if supports_language("python"):
    html = highlight(code, "python")

# Or handle the exception
try:
    html = highlight(code, "unknown")
except LookupError as e:
    print(f"Unsupported language: {e}")
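
When the language name comes from user content, one option is to fall back to plain escaped markup instead of failing. The sketch below is one way to do that under stated assumptions: `highlight_or_plain` is a hypothetical wrapper, the `<pre><code>` fallback markup is an arbitrary choice, and `stub_highlight` is a stand-in so the example runs without Rosettes installed.

```python
import html
from typing import Callable


def highlight_or_plain(
    code: str,
    language: str,
    highlighter: Callable[[str, str], str],
) -> str:
    """Highlight code, falling back to an escaped <pre> block on LookupError."""
    try:
        return highlighter(code, language)
    except LookupError:
        return f"<pre><code>{html.escape(code)}</code></pre>"


def stub_highlight(code: str, language: str) -> str:
    """Stand-in highlighter: accepts only "python", raises LookupError otherwise."""
    if language != "python":
        raise LookupError(language)
    return f'<div data-language="python">{html.escape(code)}</div>'


ok = highlight_or_plain("x = 1", "python", stub_highlight)
fallback = highlight_or_plain("a < b", "unknown", stub_highlight)
```

In real use, `rosettes.highlight` would be passed as the `highlighter` argument in place of the stub.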

Next Steps

Parallel Processing: `highlight_many()` for multiple blocks
Line Highlighting: Highlight specific lines
CSS Classes: Style your output