Bengal implements an intelligent caching system that enables sub-second incremental rebuilds.
How It Works
The build cache (`.bengal/cache.json.zst`) tracks the state of your project to determine exactly what needs to be rebuilt. Cache files are compressed with Zstandard, which shrinks them by roughly 92-94%.
Caching Strategies
Change Detection
We use SHA256 hashing to detect file changes.
- Content files (`.md`)
- Templates (`.html`, `.jinja2`)
- Config files (`.toml`)
- Assets (`.css`, `.js`)
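Hash-based change detection can be sketched as follows. This is a minimal illustration, not Bengal's actual code: the function names `file_hash` and `changed_files`, and the shape of the cached hash mapping, are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths: list[Path], cached_hashes: dict[str, str]) -> list[Path]:
    """Return the paths whose current hash differs from the cached one.

    Files that are missing from the cache count as changed.
    """
    return [p for p in paths if cached_hashes.get(str(p)) != file_hash(p)]
```

Comparing digests rather than modification times means a file that is touched but not edited does not trigger a rebuild.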
Impact Analysis
We track relationships to know what to rebuild.
- Page → Template: If `post.html` changes, rebuild all blog posts.
- Tag → Pages: If the `python` tag changes, rebuild the `tags/python/` page.
- Page → Partial: If `header.html` changes, rebuild everything.
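One way to picture the relationships above is a reverse dependency map from each template or partial to the pages that use it. The sketch below assumes a simple page-to-dependencies dictionary; the names `build_reverse_deps`, `pages_to_rebuild`, and the sample project are hypothetical.

```python
from collections import defaultdict


def build_reverse_deps(page_deps: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert page -> dependencies into dependency -> pages that use it."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for page, deps in page_deps.items():
        for dep in deps:
            reverse[dep].add(page)
    return reverse


def pages_to_rebuild(changed: set[str], page_deps: dict[str, list[str]]) -> set[str]:
    """Return changed pages plus every page that depends on a changed file."""
    reverse = build_reverse_deps(page_deps)
    affected = {path for path in changed if path in page_deps}
    for path in changed:
        affected |= reverse.get(path, set())
    return affected


# Hypothetical project: two posts share post.html, every page includes header.html.
page_deps = {
    "blog/post1.md": ["post.html", "header.html"],
    "blog/post2.md": ["post.html", "header.html"],
    "about.md": ["page.html", "header.html"],
}
```

With this data, a change to `post.html` selects only the two blog posts, while a change to `header.html` selects every page.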
Taxonomy Lookup
We store an inverted index of tags to avoid parsing all pages.
- Stored: `tag_to_pages['python'] = ['post1.md', 'post2.md']`
- Benefit: O(1) lookup for taxonomy page generation.
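A small sketch of how such an inverted index could be built from per-page tag lists. The helper name `build_tag_index` and the input shape are assumptions; only the `tag_to_pages` layout comes from the description above.

```python
from collections import defaultdict


def build_tag_index(page_tags: dict[str, list[str]]) -> dict[str, list[str]]:
    """Build tag -> sorted list of page paths from page -> tags."""
    index: dict[str, set[str]] = defaultdict(set)
    for page, tags in page_tags.items():
        for tag in tags:
            index[tag].add(page)
    # Sorted lists keep the serialized index stable between builds.
    return {tag: sorted(pages) for tag, pages in index.items()}


tag_to_pages = build_tag_index({
    "post1.md": ["python", "web"],
    "post2.md": ["python"],
})
print(tag_to_pages["python"])  # ['post1.md', 'post2.md']
```

Generating the `python` tag page then needs a single dictionary lookup instead of a scan over every page's front matter.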
Zstandard Compression
Bengal uses Zstandard (zstd) compression for all cache files, leveraging Python 3.14's new `compression.zstd` module (PEP 784).
Performance Benefits
| Metric | Before | After | Improvement |
|---|---|---|---|
| Cache size (773 pages) | 1.64 MB | 99 KB | 94% smaller |
| Compression ratio | 1x | 12-14x | 12-14x |
| Cache load time | ~5ms | ~0.5ms | 10x faster |
| Cache save time | ~3ms | ~1ms | 3x faster |
How It Works
File Format
Cache files use the `.json.zst` extension:
.bengal/
├── cache.json.zst # Main build cache (compressed)
├── taxonomy_index.json.zst # Tag/category index (compressed)
├── asset_deps.json.zst # Asset dependencies (compressed)
└── page_metadata.json.zst # Page metadata (compressed)
Backward Compatibility
Bengal automatically handles migration:
- Read: Tries `.json.zst` first, falls back to `.json`
- Write: Always writes compressed `.json.zst`
- Migration: Old uncompressed caches are read and re-saved as compressed
This means existing projects upgrade seamlessly, with no manual migration needed.
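The read, write, and migration rules above could look roughly like this. The function names `save_cache` and `load_cache` are hypothetical, and the `zlib` fallback is only a stand-in so the sketch runs on interpreters older than Python 3.14; it is not something Bengal is stated to do.

```python
import json
from pathlib import Path

try:
    from compression import zstd as codec  # Python 3.14+ (PEP 784)
except ImportError:
    import zlib as codec  # stand-in only, so the sketch runs on older Pythons


def save_cache(base: Path, data: dict) -> Path:
    """Always write the compressed `.json.zst` file."""
    target = base.with_name(base.name + ".json.zst")
    target.write_bytes(codec.compress(json.dumps(data).encode("utf-8")))
    return target


def load_cache(base: Path) -> dict:
    """Try `.json.zst` first, fall back to plain `.json`, else start empty."""
    compressed = base.with_name(base.name + ".json.zst")
    if compressed.exists():
        return json.loads(codec.decompress(compressed.read_bytes()))
    legacy = base.with_name(base.name + ".json")
    if legacy.exists():
        data = json.loads(legacy.read_text(encoding="utf-8"))
        save_cache(base, data)  # migrate: re-save as compressed
        return data
    return {}
```

Calling `load_cache(Path(".bengal/cache"))` on a project that only has `cache.json` returns its contents and leaves a `cache.json.zst` next to it for later builds.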
CI/CD Benefits
Compressed caches significantly improve CI/CD workflows:
- Faster cache upload/download (about 100 KB vs 1.6 MB)
- Lower storage costs
- Faster build times in CI pipelines
The "No Object References" Rule
Never persist object references across builds.
The cache only stores:
- File paths (strings)
- Hashes (strings)
- Simple metadata (dicts/lists)
This ensures cache stability. When a build starts, we load the cache and reconstruct the relationships with fresh live objects.
Cacheable Protocol
Bengal uses a `Cacheable` protocol to enforce type-safe cache contracts across all cacheable types. This ensures consistent serialization, prevents cache bugs, and enables static validation by a type checker.
Protocol Definition
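A minimal sketch of what such a protocol could look like, built only from the names this page mentions (`Cacheable`, `to_cache_dict()`, `from_cache_dict()`, `@runtime_checkable`). The exact signatures and type variables are assumptions; the authoritative definition lives in `bengal/cache/cacheable.py`.

```python
from typing import Any, Protocol, TypeVar, runtime_checkable

T = TypeVar("T", bound="Cacheable")


@runtime_checkable
class Cacheable(Protocol):
    """Structural contract for types that can be stored in the build cache."""

    def to_cache_dict(self) -> dict[str, Any]:
        """Return a dict containing only JSON-serializable values."""
        ...

    @classmethod
    def from_cache_dict(cls: type[T], data: dict[str, Any]) -> T:
        """Rebuild an instance from a dict produced by `to_cache_dict()`."""
        ...
```

Because the protocol is structural, a class satisfies it by defining both methods; it does not need to inherit from `Cacheable`.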
Contract Requirements
- JSON Primitives Only: `to_cache_dict()` must return only JSON-serializable types (str, int, float, bool, None, list, dict)
- Type Conversion: Complex types must be converted:
  - `datetime` → ISO-8601 string (via `datetime.isoformat()`)
  - `Path` → str (via `str(path)`)
  - `set` → sorted list (for stability)
- No Object References: Never serialize live objects (Page, Section, Asset). Use stable identifiers (usually string paths) instead.
- Round-trip Invariant: `T.from_cache_dict(obj.to_cache_dict())` must reconstruct an equivalent object (== by fields)
- Stable Keys: Field names in `to_cache_dict()` are the contract. Adding or removing fields requires a version bump in the cache file.
Types Implementing Cacheable
| Type | Location | Purpose |
|---|---|---|
| `PageCore` | `bengal/core/page/page_core.py` | Cacheable page metadata (title, date, tags, etc.) |
| `TagEntry` | `bengal/cache/taxonomy_index.py` | Taxonomy index entries |
| `IndexEntry` | `bengal/cache/query_index.py` | Query index entries |
| `AssetDependencyEntry` | `bengal/cache/asset_dependency_map.py` | Asset dependency tracking |
Example Implementation
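The following sketch shows a type that follows the contract requirements. `ExampleEntry` and its fields are hypothetical and are not one of Bengal's real types; the point is the conversions (`Path` to str, `datetime` to ISO-8601, `set` to sorted list) and the round trip.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any


@dataclass
class ExampleEntry:
    """Hypothetical cache entry; not one of Bengal's real types."""

    source: Path
    modified: datetime
    tags: set[str] = field(default_factory=set)

    def to_cache_dict(self) -> dict[str, Any]:
        return {
            "source": str(self.source),             # Path -> str
            "modified": self.modified.isoformat(),  # datetime -> ISO-8601 string
            "tags": sorted(self.tags),              # set -> sorted list
        }

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> "ExampleEntry":
        return cls(
            source=Path(data["source"]),
            modified=datetime.fromisoformat(data["modified"]),
            tags=set(data["tags"]),
        )


entry = ExampleEntry(Path("content/post1.md"), datetime(2025, 1, 15, 9, 30), {"web", "python"})
payload = json.dumps(entry.to_cache_dict())  # works because only JSON primitives remain
restored = ExampleEntry.from_cache_dict(json.loads(payload))
assert restored == entry  # round-trip invariant (dataclass equality compares fields)
```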
Generic CacheStore Helper
Bengal provides a generic `CacheStore` helper for type-safe cache operations.
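To illustrate the idea of a generic, versioned store, here is a sketch under stated assumptions. `SketchStore`, its constructor arguments, and its `save`/`load` methods are invented for this example and should not be read as the real `CacheStore` API; it also writes plain JSON to keep the example short.

```python
import json
from pathlib import Path
from typing import Any, Callable, Generic, TypeVar

T = TypeVar("T")


class SketchStore(Generic[T]):
    """Hypothetical helper; Bengal's real CacheStore API may differ."""

    def __init__(self, path: Path, version: int,
                 load_entry: Callable[[dict[str, Any]], T]) -> None:
        self.path = path
        self.version = version
        self.load_entry = load_entry  # e.g. SomeType.from_cache_dict

    def save(self, entries: list[Any]) -> None:
        """Write entries via their to_cache_dict() plus a version tag."""
        payload = {"version": self.version,
                   "entries": [e.to_cache_dict() for e in entries]}
        self.path.write_text(json.dumps(payload), encoding="utf-8")

    def load(self) -> list[T]:
        """Return cached entries, or [] if the file is missing or stale."""
        if not self.path.exists():
            return []
        payload = json.loads(self.path.read_text(encoding="utf-8"))
        if payload.get("version") != self.version:
            return []  # a version mismatch invalidates the cache
        return [self.load_entry(item) for item in payload["entries"]]
```

Treating a version mismatch as an empty cache forces a full rebuild after the entry format changes, rather than risking a misread of old fields.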
Benefits
- Type Safety: Static type checkers (mypy) validate cache contracts before the code runs
- Consistency: All cache entries follow the same serialization pattern
- Versioning: Built-in version checking for cache invalidation
- Safety: Prevents accidental pickling of complex objects that might break across versions
- Performance: Protocol has zero runtime overhead (structural typing)
PageCore Serialization
With `PageCore`, cache serialization is simplified.
Runtime Validation
The `@runtime_checkable` decorator allows `isinstance()` checks.
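A short demonstration of such a check, repeating a minimal version of the protocol so the snippet runs on its own. `GoodEntry` and `MissingLoader` are hypothetical classes. Note that a runtime `isinstance()` check only confirms that the methods exist; it does not verify their signatures or return types.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Cacheable(Protocol):
    def to_cache_dict(self) -> dict[str, Any]: ...

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> "Cacheable": ...


class GoodEntry:
    """Hypothetical type that satisfies the protocol structurally."""

    def to_cache_dict(self) -> dict[str, Any]:
        return {}

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> "GoodEntry":
        return cls()


class MissingLoader:
    """Hypothetical type that lacks from_cache_dict()."""

    def to_cache_dict(self) -> dict[str, Any]:
        return {}


print(isinstance(GoodEntry(), Cacheable))      # True
print(isinstance(MissingLoader(), Cacheable))  # False
```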
However, static type checking via mypy is the primary validation method.
See: `bengal/cache/cacheable.py` for the full protocol definition and examples.