Rosettes is thread-safe by design, with explicit support for Python's free-threading mode (PEP 703, available in 3.13t+).
Thread-Safe Guarantees
All public APIs are safe for concurrent use:
| Component | Thread Safety Mechanism |
|---|---|
| `highlight()` | Uses only local variables |
| `tokenize()` | Uses only local variables |
| `highlight_many()` | Thread pool with isolated workers |
| `Token` | Immutable `NamedTuple` |
| `get_lexer()` | `functools.cache` memoization |
How It Works
1. Immutable Tokens
The `Token` type is a `NamedTuple`, which is immutable:
class Token(NamedTuple):
    type: TokenType
    value: str
    line: int = 1
    column: int = 1
Tokens cannot be modified after creation, eliminating data races.
2. Local-Only Lexer State
Lexers use only local variables during tokenization:
def tokenize(self, code: str) -> Iterator[Token]:
    # All state is local
    state = State.INITIAL
    pos = 0
    buffer = []
    while pos < len(code):
        # Process character
        ...
No instance variables or global state are modified during tokenization.
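As a rough illustration of why local-only state is safe to share, the sketch below runs one lexer instance from several threads and compares the output with a sequential run. `WordLexer` and its whitespace-splitting rule are hypothetical stand-ins, not part of Rosettes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator


class WordLexer:
    """Hypothetical toy lexer: yields (word, column) pairs, 1-based columns."""

    def tokenize(self, code: str) -> Iterator[tuple[str, int]]:
        # All state lives in locals, so each call is independent.
        pos = 0
        start = 0
        buffer: list[str] = []
        while pos < len(code):
            ch = code[pos]
            if ch.isspace():
                if buffer:
                    yield "".join(buffer), start + 1
                    buffer = []
            else:
                if not buffer:
                    start = pos
                buffer.append(ch)
            pos += 1
        if buffer:
            yield "".join(buffer), start + 1


lexer = WordLexer()  # one instance shared by every thread
sources = ["def f(): pass", "x = 1", "return x + y"] * 50

with ThreadPoolExecutor(max_workers=4) as executor:
    threaded = list(executor.map(lambda s: list(lexer.tokenize(s)), sources))

sequential = [list(lexer.tokenize(s)) for s in sources]
```

Because `tokenize()` never writes to `self`, the threaded and sequential results should be identical regardless of how the calls interleave.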
3. Cached Registry
The lexer registry uses a two-layer design with `functools.cache` for thread-safe memoization:
def get_lexer(name: str) -> StateMachineLexer:
    """Public API - normalizes name, delegates to cached loader."""
    canonical = _normalize_name(name)
    return _get_lexer_by_canonical(canonical)


@cache
def _get_lexer_by_canonical(canonical: str) -> StateMachineLexer:
    """Internal cached loader - lazily imports and instantiates."""
    spec = _LEXER_SPECS[canonical]
    module = import_module(spec.module)
    return getattr(module, spec.class_name)()
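This provides thread-safe memoization: once a lexer is cached, the same instance is returned for the same name across all threads. Lexers are loaded lazily on first access. If several threads request the same uncached name at the same moment, `functools.cache` may run the loader more than once before one result is stored; the cache itself stays consistent.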
4. Immutable Configuration
All configuration classes are frozen dataclasses with slots for memory efficiency:
@dataclass(frozen=True, slots=True)
class FormatConfig:
    css_class: str = "highlight"
    wrap_code: bool = True
    class_prefix: str = ""
    data_language: str | None = None
Free-Threading Support (PEP 703)
Rosettes declares itself safe for free-threaded Python via the `_Py_mod_gil` attribute:
def __getattr__(name: str) -> object:
    if name == "_Py_mod_gil":
        return 0  # Py_MOD_GIL_NOT_USED
    raise AttributeError(f"module 'rosettes' has no attribute {name!r}")
This tells free-threaded Python (3.13t+) that Rosettes:
- Does not require the GIL
- Can run with true parallelism
- Is safe for concurrent access without locks
Concurrent Usage Patterns
Safe: Multiple Threads Highlighting
from concurrent.futures import ThreadPoolExecutor
from rosettes import highlight


def highlight_page(content: str) -> str:
    # Extract and highlight all code blocks
    return highlight(content, "python")


with ThreadPoolExecutor(max_workers=4) as executor:
    pages = ["code1", "code2", "code3", "code4"]
    results = list(executor.map(highlight_page, pages))
Safe: Shared Lexer Instance
from rosettes import get_lexer

# Same instance returned (cached)
lexer = get_lexer("python")


# Safe to use from multiple threads
def process(code: str) -> list:
    return list(lexer.tokenize(code))
Safe: highlight_many()
from rosettes import highlight_many
# Designed for parallel execution
blocks = [(code, lang) for code, lang in code_blocks]
results = highlight_many(blocks) # Thread pool internally
What NOT to Do
Don't: Modify Tokens
# Tokens are immutable - this fails
token = Token(TokenType.KEYWORD, "def")
token.value = "class" # ❌ AttributeError
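When a changed token is needed, the usual `NamedTuple` approach is to derive a new one with `_replace()` rather than mutate the original. The sketch below redefines `Token` as shown earlier; the two-member `TokenType` enum is a hypothetical subset used only to make the example runnable.

```python
from enum import Enum, auto
from typing import NamedTuple


class TokenType(Enum):
    """Hypothetical subset of token types, for illustration only."""

    KEYWORD = auto()
    NAME = auto()


class Token(NamedTuple):
    type: TokenType
    value: str
    line: int = 1
    column: int = 1


token = Token(TokenType.KEYWORD, "def")

# _replace() returns a new tuple; the original token is unchanged.
renamed = token._replace(value="class")
```

Other threads holding a reference to `token` continue to see `"def"`, so no synchronization is required.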
Don't: Rely on Global State
# Don't do this - Rosettes has no global mutable state
import rosettes
rosettes.SOME_SETTING = True # ❌ No effect, not supported
Performance on Free-Threaded Python
On free-threaded Python (3.13t+), `highlight_many()` provides true parallelism:
| Scenario | GIL Python | Free-Threading | Speedup |
|---|---|---|---|
| 10 blocks | 15ms | 12ms | 1.25x |
| 50 blocks | 75ms | 42ms | 1.78x |
| 100 blocks | 150ms | 78ms | 1.92x |
Numbers are illustrative. Actual performance varies by hardware, code complexity, and Python version. See Performance for benchmarking details.
The speedup comes from true parallel execution without GIL contention.
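To measure the effect on your own machine, a small timing harness along these lines may help. It is a sketch only: `fake_highlight` is a hypothetical CPU-bound stand-in rather than the Rosettes highlighter, and the workload size is arbitrary. Substitute the real call and representative code blocks before drawing conclusions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_highlight(code: str) -> str:
    """Hypothetical CPU-bound stand-in for a real highlighter."""
    return "".join(ch.upper() if ch.isalpha() else ch for ch in code)


def run_sequential(blocks: list[str]) -> list[str]:
    return [fake_highlight(b) for b in blocks]


def run_threaded(blocks: list[str], workers: int = 4) -> list[str]:
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(fake_highlight, blocks))


def timed(fn, *args) -> tuple[float, list[str]]:
    # Wall-clock timing of a single run; repeat runs for stable numbers.
    start = time.perf_counter()
    result = fn(*args)
    return time.perf_counter() - start, result


blocks = ["def f(x):\n    return x + 1\n" * 200] * 50

seq_time, seq_result = timed(run_sequential, blocks)
thr_time, thr_result = timed(run_threaded, blocks)

print(f"sequential: {seq_time * 1000:.1f} ms")
print(f"threaded:   {thr_time * 1000:.1f} ms")
```

On a build with the GIL, the threaded run of pure-Python CPU-bound work is generally not expected to beat the sequential one; any gain should appear only when the GIL is disabled.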
Verifying Free-Threading
Check if you're running free-threaded Python:
import sys

if hasattr(sys, "_is_gil_enabled"):
    if sys._is_gil_enabled():
        print("GIL is enabled")
    else:
        print("Free-threading active!")
else:
    print("Python < 3.13 (always has GIL)")
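The runtime check can be combined with the build-time flag, since a free-threaded build may still run with the GIL enabled (for example via `PYTHON_GIL=1` or `-X gil=1`). The sketch below uses `sysconfig.get_config_var("Py_GIL_DISABLED")` for the build flag; `describe_gil` is a hypothetical helper name.

```python
import sys
import sysconfig


def describe_gil() -> str:
    # Py_GIL_DISABLED is 1 on free-threaded builds; 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists only on Python 3.13+.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True

    if not free_threaded_build:
        return "standard build (GIL always enabled)"
    if gil_enabled:
        return "free-threaded build, but the GIL is enabled at runtime"
    return "free-threaded build with the GIL disabled"


status = describe_gil()
print(status)
```

Distinguishing the two cases helps when a benchmark shows no speedup: the interpreter may be the free-threaded build while still running with the GIL turned back on.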
Next Steps
- Parallel Processing: using `highlight_many()`
- Performance: benchmarks and optimization