# Performance

URL: /docs/about/performance/
Section: about
Tags: performance, benchmarks

--------------------------------------------------------------------------------

# Performance

Rosettes is designed for predictable, high performance. State machine lexers provide O(n) time complexity with no worst-case surprises.

## Benchmarks vs Pygments

Tested on a 10,000-line Python file:

| Operation | Rosettes | Pygments | Speedup |
|-----------|----------|----------|---------|
| Tokenize | 12ms | 45ms | **3.75x** |
| Highlight | 18ms | 52ms | **2.89x** |
| Parallel (8 blocks) | 22ms | 48ms | **2.18x** |

*Benchmarked on Apple M1 Pro, Python 3.14. Results vary by hardware—run `python -m benchmarks.benchmark_vs_pygments` to measure on your system.*

---

## Time Complexity

### O(n) Guaranteed

Rosettes processes each character exactly once:

| Input Size | Time |
|------------|------|
| 1,000 chars | ~0.1ms |
| 10,000 chars | ~1ms |
| 100,000 chars | ~10ms |
| 1,000,000 chars | ~100ms |

Linear scaling—no exponential blowup.

### Comparison with Regex

Regex-based highlighters can exhibit catastrophic backtracking:

```
Pattern: (a+)+$
Input:   "aaaaaaaaaaaaaaaaaaaaaaaaaaaa!"

Regex:   Exponential time (2^n attempts)
Rosettes: Linear time (n character reads)
```

---

## Memory Usage

Rosettes uses minimal memory:

| Component | Memory |
|-----------|--------|
| Lexer instance | ~1 KB |
| Token | 72 bytes |
| 10,000 tokens | ~720 KB |

Tokens are `NamedTuple`s—lightweight and cache-friendly.

---

## Optimization Tips

### Use `highlight_many()` for Multiple Blocks

For 8+ code blocks, parallel processing is faster:

```python
# Slow: sequential
results = [highlight(code, lang) for code, lang in blocks]

# Fast: parallel (for 8+ blocks)
results = highlight_many(blocks)
```

| Blocks | Sequential | Parallel | Speedup |
|--------|------------|----------|---------|
| 4 | 10ms | 12ms | 0.83x (overhead) |
| 8 | 20ms | 15ms | 1.33x |
| 50 | 125ms | 70ms | 1.79x |
| 100 | 250ms | 130ms | 1.92x |

### Skip Line Features When Not Needed

Line numbers and line highlighting use the slower code path:

```python
# Fast path (no line features)
html = highlight(code, "python")

# Slow path (line features enabled)
html = highlight(code, "python", show_linenos=True)
html = highlight(code, "python", hl_lines={1, 2, 3})
```

The difference is ~15% for typical code blocks.

### Reuse Lexer Instances

Lexers are cached automatically:

```python
from rosettes import get_lexer

# Same instance returned (cached)
lexer1 = get_lexer("python")
lexer2 = get_lexer("python")
assert lexer1 is lexer2  # True
```

No need to manually cache lexers.

---

## Parallel Scaling

### GIL Python (3.13 and earlier)

With the GIL, parallel highlighting provides limited benefit:

| Workers | Speedup |
|---------|---------|
| 1 | 1.0x |
| 2 | 1.1x |
| 4 | 1.15x |
| 8 | 1.2x |

The GIL prevents true parallelism, but I/O overlapping provides some benefit.

### Free-Threading (3.14t)

With free-threading enabled, true parallelism is achieved:

| Workers | Speedup |
|---------|---------|
| 1 | 1.0x |
| 2 | 1.8x |
| 4 | 3.2x |
| 8 | 4.5x |

Near-linear scaling up to 4 workers, then diminishing returns due to memory bandwidth.

---

## Profiling

Profile your highlighting with `cProfile`:

```python
import cProfile
from rosettes import highlight

code = open("large_file.py").read()

cProfile.run('highlight(code, "python")', sort="cumtime")
```

Or use `timeit` for quick benchmarks:

```python
import timeit
from rosettes import highlight

code = "def foo(): pass\n" * 10000

time = timeit.timeit(
    lambda: highlight(code, "python"),
    number=100,
)
print(f"Average: {time/100*1000:.2f}ms")
```

---

## Comparison Table

| Feature | Rosettes | Pygments |
|---------|----------|----------|
| Time complexity | O(n) | O(n) typical, O(2^n) worst |
| ReDoS vulnerable | No | Yes (some lexers) |
| Parallel support | Native | Manual only |
| Free-threading | Optimized | Not tested |
| Memory per token | 72 bytes | ~200 bytes |
| Dependencies | None | None |

---

## Next Steps

- [[docs/about/comparison|Comparison]] — Detailed Rosettes vs Pygments comparison
- [[docs/highlighting/parallel|Parallel Processing]] — Using `highlight_many()`

--------------------------------------------------------------------------------

Metadata:
- Word Count: 625
- Reading Time: 3 minutes