# Build Cache

URL: /docs/reference/architecture/core/cache/
Section: core
Tags: core, caching, incremental-builds, performance, dependency-tracking, cache, compression, zstandard

---

## Cache System

Bengal implements an intelligent caching system that enables sub-second incremental rebuilds.

## How It Works

The build cache (`.bengal/cache.json.zst`) tracks the state of your project to determine exactly what needs to be rebuilt. Cache files are compressed with Zstandard for a 92-93% size reduction.

```mermaid
flowchart TD
    Start[Start Build] --> Load[Load Cache]
    Load --> Detect[Detect Changes]
    Detect --> Config{Config Changed?}
    Config -->|Yes| Full[Full Rebuild]
    Config -->|No| Hash[Check File Hashes]
    Hash --> DepGraph[Query Dependency Graph]
    DepGraph --> Filter[Filter Work]
    Filter --> Render[Render Affected Pages]
    Render --> Update[Update Cache]
    Update --> Save[Save to Disk]
```

## Caching Strategies

### File Hashing: Change Detection

We use SHA-256 hashing to detect file changes in:

- Content files (`.md`)
- Templates (`.html`, `.jinja2`)
- Config files (`.toml`)
- Assets (`.css`, `.js`)

### Dependency Graph: Impact Analysis

We track relationships to know what to rebuild:

- Page → Template: if `post.html` changes, rebuild all blog posts.
- Tag → Pages: if the `python` tag changes, rebuild the `tags/python/` page.
- Page → Partial: if `header.html` changes, rebuild everything.

### Inverted Index: Taxonomy Lookup

We store an inverted index of tags to avoid parsing all pages.

- Stored: `tag_to_pages['python'] = ['post1.md', 'post2.md']`
- Benefit: O(1) lookup for taxonomy page generation.

## Zstandard Compression

Bengal uses Zstandard (zstd) compression for all cache files, leveraging Python 3.14's new `compression.zstd` module (PEP 784).

### Performance Benefits

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Cache size (773 pages) | 1.64 MB | 99 KB | 94% smaller |
| Compression ratio | 1x | 12-14x | 12-14x |
| Cache load time | ~5ms | ~0.5ms | 10x faster |
| Cache save time | ~3ms | ~1ms | 3x faster |

### How It Works

```mermaid
flowchart LR
    Data[Cache Data] --> JSON[JSON Serialize]
    JSON --> Zstd[Zstd Compress]
    Zstd --> File[.json.zst File]
    File2[.json.zst File] --> Decomp[Zstd Decompress]
    Decomp --> Parse[JSON Parse]
    Parse --> Data2[Cache Data]
```

### File Format

Cache files use the `.json.zst` extension:

```text
.bengal/
├── cache.json.zst           # Main build cache (compressed)
├── taxonomy_index.json.zst  # Tag/category index (compressed)
├── asset_deps.json.zst      # Asset dependencies (compressed)
└── page_metadata.json.zst   # Page metadata (compressed)
```

### Backward Compatibility

Bengal automatically handles migration:

- Read: tries `.json.zst` first, falls back to `.json`
- Write: always writes compressed `.json.zst`
- Migration: old uncompressed caches are read and re-saved as compressed

This means existing projects upgrade seamlessly; no manual migration is needed.
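To make the read/write path and the migration behavior concrete, here is a minimal sketch, assuming Python 3.14's `compression.zstd` module (PEP 784). The function names and constants are illustrative, not Bengal's actual API:

```python
import json
from pathlib import Path

from compression import zstd  # Python 3.14+ (PEP 784)

CACHE = Path(".bengal/cache.json.zst")   # compressed cache (current format)
LEGACY = Path(".bengal/cache.json")      # uncompressed cache (old format)

def save_cache(data: dict) -> None:
    """JSON-serialize the cache, zstd-compress it, and write a .json.zst file."""
    CACHE.parent.mkdir(parents=True, exist_ok=True)
    CACHE.write_bytes(zstd.compress(json.dumps(data).encode("utf-8")))

def load_cache() -> dict:
    """Prefer the compressed cache; fall back to a legacy uncompressed file."""
    if CACHE.exists():
        return json.loads(zstd.decompress(CACHE.read_bytes()))
    if LEGACY.exists():
        return json.loads(LEGACY.read_text(encoding="utf-8"))
    return {}  # no cache yet: do a full build
```

A project that still has only the legacy `.json` file is read via the fallback and ends up with the compressed `.json.zst` on the next save, which is the migration path described above.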
### CI/CD Benefits

Compressed caches significantly improve CI/CD workflows:

```yaml
# GitHub Actions - cache is 16x smaller to transfer
- uses: actions/cache@v4
  with:
    path: .bengal/
    key: bengal-${{ hashFiles('content/**') }}
```

- Faster cache upload/download (100KB vs 1.6MB)
- Lower storage costs
- Faster build times in CI pipelines

## The "No Object References" Rule

> **Architecture Principle:** Never persist object references across builds.

The cache only stores:

- File paths (strings)
- Hashes (strings)
- Simple metadata (dicts/lists)

This ensures cache stability. When a build starts, we load the cache and reconstruct the relationships with fresh live objects.

## Cacheable Protocol

Bengal uses a `Cacheable` protocol to enforce type-safe cache contracts across all cacheable types. This ensures consistent serialization, prevents cache bugs, and enables compile-time validation.

### Protocol Definition

```python
from __future__ import annotations

from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class Cacheable(Protocol):
    """Protocol for types that can be cached to disk."""

    def to_cache_dict(self) -> dict[str, Any]:
        """Return JSON-serializable data only."""
        ...

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> Cacheable:
        """Reconstruct object from data."""
        ...
```

### Contract Requirements

- **JSON primitives only:** `to_cache_dict()` must return only JSON-serializable types (`str`, `int`, `float`, `bool`, `None`, `list`, `dict`).
- **Type conversion:** complex types must be converted:
  - `datetime` → ISO-8601 string (via `datetime.isoformat()`)
  - `Path` → `str` (via `str(path)`)
  - `set` → sorted `list` (for stability)
- **No object references:** never serialize live objects (`Page`, `Section`, `Asset`). Use stable identifiers (usually string paths) instead.
- **Round-trip invariant:** `T.from_cache_dict(obj.to_cache_dict())` must reconstruct an equivalent object (`==` by fields).
- **Stable keys:** field names in `to_cache_dict()` are the contract. Adding or removing fields requires a version bump in the cache file.

### Types Implementing Cacheable

| Type | Location | Purpose |
|------|----------|---------|
| `PageCore` | `bengal/core/page/page_core.py` | Cacheable page metadata (title, date, tags, etc.) |
| `TagEntry` | `bengal/cache/taxonomy_index.py` | Taxonomy index entries |
| `IndexEntry` | `bengal/cache/query_index.py` | Query index entries |
| `AssetDependencyEntry` | `bengal/cache/asset_dependency_map.py` | Asset dependency tracking |

### Example Implementation

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

from bengal.cache.cacheable import Cacheable

@dataclass
class PageCore(Cacheable):
    source_path: str
    title: str
    date: datetime | None = None
    tags: list[str] = field(default_factory=list)

    def to_cache_dict(self) -> dict[str, Any]:
        """Serialize PageCore to cache-friendly dictionary."""
        return {
            "source_path": self.source_path,
            "title": self.title,
            "date": self.date.isoformat() if self.date else None,
            "tags": self.tags,
        }

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> PageCore:
        """Deserialize PageCore from cache dictionary."""
        return cls(
            source_path=data["source_path"],
            title=data["title"],
            date=datetime.fromisoformat(data["date"]) if data.get("date") else None,
            tags=data.get("tags", []),
        )
```

### Generic CacheStore Helper

Bengal provides a generic `CacheStore` helper for type-safe cache operations:

```python
from bengal.cache.cache_store import CacheStore

# Type-safe cache operations
store = CacheStore[PageCore](cache_path)
store.save([page1.core, page2.core])  # List of Cacheable objects
entries = store.load()                # Returns list[PageCore]
```

### Benefits

- **Type safety:** static type checkers (mypy) validate cache contracts at compile time
- **Consistency:** all cache entries follow the same serialization pattern
- **Versioning:** built-in version checking for cache invalidation
- **Safety:** prevents accidental pickling of complex objects that might break across versions
- **Performance:** the protocol has zero runtime overhead (structural typing)
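The real helper lives in `bengal/cache/cache_store.py` and is not reproduced here. As a rough sketch of how the pieces above could fit together (Cacheable round-trips, a version field for invalidation, zstd-compressed JSON on disk), a store along these lines is one plausible shape. The class name, constructor signature, and version handling are illustrative assumptions; the actual `CacheStore` takes only a path and infers the entry type from its generic parameter:

```python
import json
from pathlib import Path
from typing import Generic, TypeVar

from compression import zstd  # Python 3.14+ (PEP 784)

from bengal.cache.cacheable import Cacheable

T = TypeVar("T", bound=Cacheable)

class SimpleCacheStore(Generic[T]):
    """Illustrative store: versioned, zstd-compressed JSON of Cacheable entries."""

    VERSION = 1  # bump whenever to_cache_dict() keys change

    def __init__(self, path: Path, entry_type: type[T]) -> None:
        self.path = path
        self.entry_type = entry_type

    def save(self, entries: list[T]) -> None:
        # Each entry serializes itself via the Cacheable contract.
        payload = {
            "version": self.VERSION,
            "entries": [entry.to_cache_dict() for entry in entries],
        }
        self.path.write_bytes(zstd.compress(json.dumps(payload).encode("utf-8")))

    def load(self) -> list[T]:
        if not self.path.exists():
            return []
        payload = json.loads(zstd.decompress(self.path.read_bytes()))
        if payload.get("version") != self.VERSION:
            return []  # version mismatch: treat the cache as invalid
        return [self.entry_type.from_cache_dict(d) for d in payload["entries"]]
```

Usage would mirror the `CacheStore` example above, except the entry type is passed explicitly: `store = SimpleCacheStore(cache_path, PageCore)`.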
### PageCore Serialization

With `PageCore`, cache serialization is simplified:

```python
# Before: Manual field mapping (error-prone)
cache_data = {
    "source_path": str(page.source_path),
    "title": page.title,
    "date": page.date.isoformat() if page.date else None,
    # ... 10+ more fields
}

# After: Single line using PageCore
from dataclasses import asdict

cache_data = asdict(page.core)  # All cacheable fields serialized
```

### Runtime Validation

The `@runtime_checkable` decorator allows `isinstance()` checks:

```python
from bengal.cache.cacheable import Cacheable

if isinstance(obj, Cacheable):
    data = obj.to_cache_dict()  # Safe to serialize
```

However, static type checking via mypy is the primary validation method.

See `bengal/cache/cacheable.py` for the full protocol definition and examples.

---

Metadata:

- Author: lbliii
- Word Count: 916
- Reading Time: 5 minutes