Module

cache.cache_store

Generic cache storage for Cacheable types.

This module provides a type-safe, generic cache storage mechanism that works with any type implementing the Cacheable protocol. It handles:

  • JSON serialization/deserialization
  • Zstandard compression (92-93% size reduction)
  • Version management (tolerant loading)
  • Directory creation
  • Type-safe load/save operations
  • Backward compatibility (reads both compressed and uncompressed)

Design Philosophy:

CacheStore provides a reusable cache storage layer that works with any
Cacheable type. This eliminates the need for each cache (TaxonomyIndex,
AssetDependencyMap, etc.) to implement its own save/load logic.

Benefits:
  • Consistent version handling across all caches
  • Type-safe operations (mypy validates)
  • Tolerant loading (returns empty on mismatch, doesn't crash)
  • Automatic directory creation
  • Single source of truth for cache file format
  • 12-14x compression ratio with Zstandard (PEP 784)

Usage Example:

from pathlib import Path

from bengal.cache.cache_store import CacheStore
from bengal.cache.taxonomy_index import TagEntry

# Create store (compression enabled by default)
store = CacheStore(Path('.bengal/tags.json'))

# Save entries (type-safe, compressed)
tags = [
    TagEntry(tag_slug='python', tag_name='Python', page_paths=[], updated_at='...'),
    TagEntry(tag_slug='web', tag_name='Web', page_paths=[], updated_at='...'),
]
store.save(tags, version=1)

# Load entries (auto-detects format: .json.zst or .json)
loaded_tags = store.load(TagEntry, expected_version=1)
# Returns [] if file missing or version mismatch

See Also:

  • bengal/cache/cacheable.py - Cacheable protocol definition
  • bengal/cache/compression.py - Zstandard compression utilities
  • bengal/cache/taxonomy_index.py - Example usage (TagEntry)
  • bengal/cache/asset_dependency_map.py - Example usage (AssetDependencyEntry)
  • plan/active/rfc-zstd-cache-compression.md - Compression RFC

Classes

CacheStore

Generic cache storage for types implementing the Cacheable protocol.

Provides type-safe save/load operations with version management, Zstandard compression, and tolerant loading (returns empty list on version mismatch or missing file).

Attributes

cache_path Path

Path to cache file (e.g., .bengal/taxonomy_index.json)

compress bool

Whether to use Zstandard compression (default: True)

Cache File Format:

Compressed

Zstd-compressed JSON with the same structure as the uncompressed format below. Files are 92-93% smaller (a 12-14x compression ratio).

Uncompressed

{
    "version": 1,
    "entries": [
        {...},  // Serialized Cacheable objects
        {...}
    ]
}

Version Management:

  • Each cache file has a top-level "version" field
  • On version mismatch, load() returns an empty list and logs a warning
  • On missing file, load() returns an empty list (no warning)
  • On malformed data, load() returns an empty list and logs an error

This "tolerant loading" approach ensures that stale or incompatible caches don't crash the build. The cache is simply rebuilt from scratch.
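The version rules above can be sketched as a small validation step over the already-parsed file contents. This is a minimal illustration, not the module's actual code: the function name `entries_or_empty` and the logger name are hypothetical, and the input is assumed to be whatever the JSON parser returned (or None for a missing file).

```python
import logging
from typing import Any

logger = logging.getLogger("cache_store_sketch")  # hypothetical logger name


def entries_or_empty(data: Any, expected_version: int = 1) -> list[dict[str, Any]]:
    """Return the raw entry dicts, or [] when the cache cannot be trusted."""
    if data is None:
        # Missing file: normal on a first build, so nothing is logged.
        return []
    if not isinstance(data, dict) or not isinstance(data.get("entries"), list):
        # Malformed data: log an error and let the caller rebuild.
        logger.error("Malformed cache data; rebuilding from scratch")
        return []
    if data.get("version") != expected_version:
        # Version mismatch: log a warning and let the caller rebuild.
        logger.warning(
            "Cache version mismatch (found %r, expected %r); rebuilding",
            data.get("version"),
            expected_version,
        )
        return []
    return data["entries"]
```

Every failure path returns the same empty list, so the caller needs no special handling beyond rebuilding when nothing was loaded.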

Compression:

  • Enabled by default (Python 3.14+ with compression.zstd)
  • 92-93% size reduction on typical cache files
  • <1ms compress time, <0.3ms decompress time
  • Auto-detects format on load (reads both .json.zst and .json)
  • Backward compatible: reads old uncompressed caches

Type Safety:

  • save() accepts a list of any Cacheable type
  • load() requires an explicit type parameter for deserialization
  • mypy validates that the type implements the Cacheable protocol
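To make the compression behaviour concrete, here is a hedged sketch of encoding and decoding one cache payload with the standard-library `compression.zstd` module (Python 3.14+). The helper names `encode_payload` and `decode_payload` are hypothetical, and the fallback to plain JSON on older interpreters is an assumption made so the sketch runs anywhere; it is not taken from the module itself.

```python
import json
from typing import Any

try:
    from compression import zstd  # standard library in Python 3.14+ (PEP 784)
except ImportError:
    zstd = None  # assumption: older interpreters fall back to plain JSON here


def encode_payload(
    data: dict[str, Any], compress: bool = True, indent: int = 2
) -> tuple[bytes, str]:
    """Return (file bytes, file suffix) for one cache payload."""
    if compress and zstd is not None:
        # Compressed files use compact JSON; indentation would only add bytes.
        compact = json.dumps(data, separators=(",", ":")).encode("utf-8")
        return zstd.compress(compact), ".json.zst"
    # Uncompressed files stay human-readable.
    return json.dumps(data, indent=indent).encode("utf-8"), ".json"


def decode_payload(blob: bytes, suffix: str) -> dict[str, Any]:
    """Reverse encode_payload(), choosing the decoder from the file suffix."""
    if suffix == ".json.zst":
        if zstd is None:
            raise RuntimeError("compression.zstd is unavailable on this Python")
        blob = zstd.decompress(blob)
    return json.loads(blob.decode("utf-8"))
```

Passing `compress=False` always yields the `.json` form, which matches the documented use of the flag for debugging.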

Methods 4

save
def save(self, entries: list[Cacheable], version: int = 1, indent: int = 2) -> None

Save entries to cache file.

Serializes all entries to JSON and writes to cache file. Creates parent directory if missing. Uses Zstandard compression by default.

Parameters 3
entries list[Cacheable]

List of Cacheable objects to save

version int

Cache version number (default: 1). Increment when format changes (new fields, removed fields, etc.)

indent int

JSON indentation (default: 2). Only used when compression is disabled; compressed files always use compact JSON.
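The uncompressed save path described above might look roughly like the following. This is a sketch under stated assumptions: `save_uncompressed` is a hypothetical name, and the entries are passed as already-serialized dicts rather than Cacheable objects to keep the example self-contained.

```python
import json
from pathlib import Path
from typing import Any


def save_uncompressed(
    cache_path: Path,
    entries: list[dict[str, Any]],
    version: int = 1,
    indent: int = 2,
) -> None:
    """Write entries wrapped in the {"version": ..., "entries": [...]} envelope."""
    # Create the parent directory (e.g. .bengal/) if it does not exist yet.
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"version": version, "entries": entries}
    # indent only matters here; a compressed file would use compact JSON instead.
    cache_path.write_text(json.dumps(payload, indent=indent), encoding="utf-8")
```

Bumping `version` whenever the entry fields change is what lets a later load() reject the old file and trigger a rebuild.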

load
def load(self, entry_type: type[T], expected_version: int = 1) -> list[T]

Load entries from cache file (tolerant).

Deserializes entries and validates version. Automatically detects format (compressed .json.zst or uncompressed .json). On a version mismatch or missing file, returns an empty list instead of crashing.

This "tolerant loading" approach ensures that builds never fail due to stale or incompatible caches - they just rebuild from scratch.

Parameters 2
entry_type type[T]

Type to deserialize (must implement Cacheable protocol). Used to call from_cache_dict() classmethod.

expected_version int

Expected cache version (default: 1). If file version doesn't match, returns empty list.

Returns

list[T]

List of deserialized entries, or [] if:

  • File doesn't exist (no warning, normal for first build)
  • Version mismatch (warning logged)
  • Malformed data (error logged)
  • Deserialization fails (error logged)
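The role of entry_type can be illustrated with a small sketch. Only from_cache_dict() is named in this documentation; the `to_cache_dict` method name, the simplified `TagEntry` stand-in, and the `deserialize_entries` helper are all assumptions made for illustration.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Protocol, TypeVar

logger = logging.getLogger("cache_store_sketch")  # hypothetical logger name


class Cacheable(Protocol):
    """Assumed shape of the protocol (see bengal/cache/cacheable.py for the real one)."""

    def to_cache_dict(self) -> dict[str, Any]: ...  # assumed method name

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> Any: ...


T = TypeVar("T", bound=Cacheable)


@dataclass
class TagEntry:
    """Simplified stand-in for bengal's TagEntry, not the real class."""

    tag_slug: str
    tag_name: str
    page_paths: list[str] = field(default_factory=list)

    def to_cache_dict(self) -> dict[str, Any]:
        return {
            "tag_slug": self.tag_slug,
            "tag_name": self.tag_name,
            "page_paths": list(self.page_paths),
        }

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> "TagEntry":
        return cls(
            tag_slug=data["tag_slug"],
            tag_name=data["tag_name"],
            page_paths=list(data.get("page_paths", [])),
        )


def deserialize_entries(entry_type: type[T], raw_entries: list[dict[str, Any]]) -> list[T]:
    """Rebuild typed entries; any failure yields [] so the build can continue."""
    try:
        return [entry_type.from_cache_dict(raw) for raw in raw_entries]
    except Exception as exc:  # tolerant by design: log the error, never raise
        logger.error("Deserialization failed: %s", exc)
        return []
```

Because the type is passed explicitly, the caller gets back a list of that concrete type rather than a list of plain dicts.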

exists
def exists(self) -> bool

Check if cache file exists (compressed or uncompressed).

Returns

bool

True if cache file exists in either format, False otherwise

clear
def clear(self) -> None

Delete cache file if it exists (both compressed and uncompressed).

Used to force cache rebuild (e.g., after format changes).
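Since exists() and clear() both have to consider two possible files, their behaviour can be sketched as follows. The function names are hypothetical, and the rule that the compressed file is the cache path with ".zst" appended is an assumption inferred from the ".json" and ".json.zst" pairing described in this page.

```python
from pathlib import Path


def candidate_paths(cache_path: Path) -> tuple[Path, Path]:
    """Return the (compressed, uncompressed) locations for one logical cache."""
    # Assumption: tags.json is stored compressed as tags.json.zst.
    return cache_path.with_name(cache_path.name + ".zst"), cache_path


def cache_exists(cache_path: Path) -> bool:
    """True if the cache is present in either format."""
    return any(path.exists() for path in candidate_paths(cache_path))


def clear_cache(cache_path: Path) -> None:
    """Delete both formats; files that are already absent are ignored."""
    for path in candidate_paths(cache_path):
        path.unlink(missing_ok=True)
```

Clearing both files matters during migration: leaving a stale uncompressed file behind would let a later load fall back to outdated data.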

Internal Methods 2
__init__
def __init__(self, cache_path: Path, compress: bool = True)

Initialize cache store.

Parameters 2
cache_path Path

Path to cache file (e.g., .bengal/taxonomy_index.json). Parent directory will be created if missing. With compression enabled, the file actually written has the .json.zst extension.

compress bool

Whether to use Zstandard compression (default: True). Set to False for debugging or compatibility.

_load_data
def _load_data(self) -> dict[Any, Any] | None

Load raw data from cache file with auto-detection.

Tries the compressed format first (.json.zst), then falls back to uncompressed (.json). This enables seamless migration from old uncompressed caches.

Returns

dict[Any, Any] | None

Parsed data dict, or None if file not found or load failed
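The compressed-first lookup with fallback could be sketched like this. It is an illustration only: `load_raw` is a hypothetical name, the ".zst"-appended file name is an assumption, and the guarded import exists so the sketch also runs on interpreters older than Python 3.14 (where only the uncompressed branch is exercised).

```python
import json
from pathlib import Path
from typing import Any, Optional

try:
    from compression import zstd  # standard library in Python 3.14+ (PEP 784)
except ImportError:
    zstd = None  # assumption: without zstd, only the .json fallback is tried


def load_raw(cache_path: Path) -> Optional[dict[Any, Any]]:
    """Return the parsed cache dict, or None if no file was found or loading failed."""
    compressed = cache_path.with_name(cache_path.name + ".zst")  # assumed naming
    try:
        if zstd is not None and compressed.exists():
            # Preferred format: decompress, then parse the JSON bytes.
            return json.loads(zstd.decompress(compressed.read_bytes()))
        if cache_path.exists():
            # Fallback for caches written before compression was introduced.
            return json.loads(cache_path.read_text(encoding="utf-8"))
    except Exception:
        # Unreadable or corrupt file: report "nothing loaded" rather than raise.
        return None
    return None
```

Returning None for both "not found" and "failed" keeps the caller's tolerant-loading logic to a single check.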