Rosettes is thread-safe by design, with explicit support for Python's free-threading mode (PEP 703, available in 3.13t+).
Thread-Safe Guarantees
All public APIs are safe for concurrent use:
| Component | Thread Safety Mechanism |
|---|---|
| highlight() | Uses only local variables |
| tokenize() | Uses only local variables |
| highlight_many() | Thread pool with isolated workers |
| Token | Immutable NamedTuple |
| get_lexer() | functools.cache memoization |
How It Works
1. Immutable tokens

   The Token type is a NamedTuple, which is immutable:

       class Token(NamedTuple):
           type: TokenType
           value: str
           line: int = 1
           column: int = 1

   Tokens cannot be modified after creation, eliminating data races.
2. Local-only lexer state

   Lexers use only local variables during tokenization:

       def tokenize(self, code: str) -> Iterator[Token]:
           # All state is local
           state = State.INITIAL
           pos = 0
           buffer = []
           while pos < len(code):
               # Process character
               ...

   No instance variables or global state are modified during tokenization.
3. Cached registry

   The lexer registry uses a two-layer design with functools.cache for thread-safe memoization:

       def get_lexer(name: str) -> StateMachineLexer:
           """Public API - normalizes name, delegates to cached loader."""
           canonical = _normalize_name(name)
           return _get_lexer_by_canonical(canonical)

       @cache
       def _get_lexer_by_canonical(canonical: str) -> StateMachineLexer:
           """Internal cached loader - lazily imports and instantiates."""
           spec = _LEXER_SPECS[canonical]
           module = import_module(spec.module)
           return getattr(module, spec.class_name)()

   Lexers are loaded lazily on first access. Once a lexer is cached, every thread receives the same instance for the same name. functools.cache keeps its cache coherent under concurrent calls, but threads racing on the very first lookup may each run the loader once; because lexers hold no mutable state, such a duplicate instance is harmless.
4. Immutable configuration

   All configuration classes are frozen dataclasses with slots for memory efficiency:

       @dataclass(frozen=True, slots=True)
       class FormatConfig:
           css_class: str = "highlight"
           wrap_code: bool = True
           class_prefix: str = ""
           data_language: str | None = None
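The properties above can be exercised together in a small stdlib-only sketch. WordLexer, this TokenType, and the whitespace-splitting rule are hypothetical stand-ins, not Rosettes classes; the sketch only illustrates that a lexer keeping all scanning state in local variables and emitting NamedTuple tokens can be shared by several threads and still match a serial run.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum
from typing import Iterator, NamedTuple


class TokenType(Enum):  # hypothetical stand-in, not Rosettes' TokenType
    WORD = "word"
    SPACE = "space"


class Token(NamedTuple):
    type: TokenType
    value: str


class WordLexer:
    """Toy lexer: every piece of scanning state lives in local variables."""

    def tokenize(self, code: str) -> Iterator[Token]:
        pos = 0  # local, so each call (and each thread) has its own cursor
        while pos < len(code):
            start = pos
            is_space = code[pos].isspace()
            # Consume a run of characters of the same kind.
            while pos < len(code) and code[pos].isspace() == is_space:
                pos += 1
            kind = TokenType.SPACE if is_space else TokenType.WORD
            yield Token(kind, code[start:pos])


shared_lexer = WordLexer()  # one instance used by every worker thread
inputs = ["def f(): pass", "x  =  1", "", "a b c"] * 25


def run(code: str) -> list[Token]:
    return list(shared_lexer.tokenize(code))


serial = [run(code) for code in inputs]
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(run, inputs))

print(threaded == serial)
```

Because the lexer instance carries no attributes that change during tokenization, no lock is needed around the shared object.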
Free-Threading Support (PEP 703)
Rosettes declares itself safe for free-threaded Python via the _Py_mod_gil attribute:

    def __getattr__(name: str) -> object:
        if name == "_Py_mod_gil":
            return 0  # Py_MOD_GIL_NOT_USED
        raise AttributeError(f"module 'rosettes' has no attribute {name!r}")

This tells free-threaded Python (3.13t+) that Rosettes:
- Does not require the GIL
- Can run with true parallelism
- Is safe for concurrent access without locks
Concurrent Usage Patterns
Safe: Multiple Threads Highlighting

    from concurrent.futures import ThreadPoolExecutor
    from rosettes import highlight

    def highlight_page(content: str) -> str:
        # Extract and highlight all code blocks
        return highlight(content, "python")

    with ThreadPoolExecutor(max_workers=4) as executor:
        pages = ["code1", "code2", "code3", "code4"]
        results = list(executor.map(highlight_page, pages))

Safe: Shared Lexer Instance

    from rosettes import get_lexer

    # Same instance returned (cached)
    lexer = get_lexer("python")

    # Safe to use from multiple threads
    def process(code: str) -> list:
        return list(lexer.tokenize(code))

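The "same instance" behavior this pattern relies on can be reproduced with functools.cache alone. In the sketch below, DummyLexer and load_lexer are hypothetical stand-ins, not Rosettes internals. The cache is warmed once on the main thread because functools.cache may run the wrapped function more than once if several threads race on the very first call.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import cache


class DummyLexer:  # hypothetical stand-in for a stateless lexer class
    def __init__(self, name: str) -> None:
        self.name = name


@cache
def load_lexer(name: str) -> DummyLexer:
    """Construct a lexer on first use; later calls return the cached object."""
    return DummyLexer(name)


# Warm the cache before fanning out, so no two threads race on the first call.
first = load_lexer("python")

with ThreadPoolExecutor(max_workers=8) as pool:
    seen = list(pool.map(lambda _: load_lexer("python"), range(100)))

# Every worker should have received the object created during warm-up.
all_same = all(lexer is first for lexer in seen)
print(all_same)
print(load_lexer.cache_info().misses)
```

Sharing one cached object is only safe when that object is never mutated during use, which is the property the shared-lexer pattern depends on.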
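Safe: highlight_many()

    from rosettes import highlight_many

    # code_blocks: (code, language) pairs collected elsewhere; sample values shown
    code_blocks = [("x = 1", "python"), ("print(x)", "python")]

    # Designed for parallel execution
    blocks = [(code, lang) for code, lang in code_blocks]
    results = highlight_many(blocks)  # Thread pool internally
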
What NOT to Do
Don't: Modify Tokens

    # Tokens are immutable - this fails
    token = Token(TokenType.KEYWORD, "def")
    token.value = "class"  # ❌ AttributeError

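When a changed token is needed, NamedTuple offers _replace(), which returns a new tuple and leaves the original untouched. A minimal sketch follows; it uses a local Token definition that mirrors the fields shown earlier (with a plain string in place of TokenType, as a simplification) rather than importing from Rosettes.

```python
from typing import NamedTuple


class Token(NamedTuple):  # local stand-in mirroring the documented fields
    type: str  # simplified: the documented field is a TokenType
    value: str
    line: int = 1
    column: int = 1


original = Token("keyword", "def")

try:
    original.value = "class"  # NamedTuple fields are read-only
except AttributeError:
    mutated = False
else:
    mutated = True

# _replace builds a new Token; `original` is unchanged.
renamed = original._replace(value="class")
print(mutated, original.value, renamed.value)
```

Since the original object never changes, other threads holding a reference to it are unaffected.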
Don't: Rely on Global State

    # Don't do this - Rosettes has no global mutable state
    import rosettes
    rosettes.SOME_SETTING = True  # ❌ No effect, not supported

Performance on Free-Threaded Python
On free-threaded Python (3.13t+), highlight_many() provides true parallelism:
| Scenario | GIL Python | Free-Threading | Speedup |
|---|---|---|---|
| 10 blocks | 15ms | 12ms | 1.25x |
| 50 blocks | 75ms | 42ms | 1.78x |
| 100 blocks | 150ms | 78ms | 1.92x |
Numbers are illustrative. Actual performance varies by hardware, code complexity, and Python version. See Performance for benchmarking details.
The speedup comes from true parallel execution without GIL contention.
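To measure this on your own machine, a small timing harness is enough. The sketch below is stdlib-only and uses fake_highlight, a hypothetical CPU-bound stand-in for a real highlighting call, so it shows the measurement method rather than Rosettes' actual numbers. Swap in your own workload to get meaningful figures.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_highlight(code: str) -> int:
    """Hypothetical CPU-bound stand-in for highlighting one block."""
    total = 0
    for _ in range(200):
        for ch in code:
            total = (total * 31 + ord(ch)) % 1_000_003
    return total


def timed(fn):
    """Return (result, elapsed seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


blocks = [f"def f{i}(): return {i}" for i in range(100)]

# Baseline: process every block on the calling thread.
serial, serial_s = timed(lambda: [fake_highlight(b) for b in blocks])

# Same work spread over a thread pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded, threaded_s = timed(lambda: list(pool.map(fake_highlight, blocks)))

print(f"serial {serial_s * 1000:.1f} ms, threaded {threaded_s * 1000:.1f} ms")
print(f"speedup {serial_s / threaded_s:.2f}x")
```

With the GIL enabled, threads cannot execute this kind of pure-Python, CPU-bound work in parallel, so a real gain should only be expected on a free-threaded build.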
Verifying Free-Threading
Check if you're running free-threaded Python:

    import sys

    if hasattr(sys, "_is_gil_enabled"):
        if sys._is_gil_enabled():
            print("GIL is enabled")
        else:
            print("Free-threading active!")
    else:
        print("Python < 3.13 (always has GIL)")

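The check above reports the runtime GIL state. A free-threaded build can still run with the GIL re-enabled (for example via PYTHON_GIL=1, or after importing an extension that requires it), so it can help to report the build flag separately. The sketch below uses sysconfig's Py_GIL_DISABLED config variable; the describe_gil helper name is made up for this example.

```python
import sys
import sysconfig


def describe_gil() -> dict[str, bool]:
    """Report whether this is a free-threaded build and whether the GIL is on."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on 3.13+; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


status = describe_gil()
print(status)
```

A result with free_threaded_build set and gil_enabled also set suggests the GIL was switched back on at startup or by an imported extension.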
Next Steps
- Parallel Processing: using highlight_many()
- Performance: benchmarks and optimization