Module

cache.page_discovery_cache

Page Discovery Cache for incremental builds.

Caches page metadata (title, date, tags, section, slug) to enable lazy loading of full page content. This allows incremental builds to skip discovery of unchanged pages and only load full content when needed.

Architecture:

  • Metadata: source_path → PageMetadata (minimal data needed for navigation/filtering)
  • Lazy Loading: Full content loaded on first access via PageProxy
  • Storage: .bengal/page_metadata.json (compact format)
  • Validation: Hash-based validation to detect stale cache entries

Performance Impact:

  • Full page discovery skipped for unchanged pages (~80ms saved per 100 pages)
  • Lazy loading ensures correctness (full content available when needed)
  • Incremental builds only load changed pages fresh
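The lazy-loading idea described above can be illustrated with a small proxy. This is only a sketch under stated assumptions: `MetadataSketch`, `LazyPageSketch`, and `fake_loader` are hypothetical stand-ins invented for illustration, not the real `PageMetadata` or `PageProxy` classes, whose fields and constructors are not shown in this reference.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class MetadataSketch:
    """Hypothetical stand-in for PageMetadata (only two fields shown)."""
    source_path: str
    title: str


class LazyPageSketch:
    """Hypothetical proxy: serves metadata at once, loads content on first access."""

    def __init__(self, metadata: MetadataSketch, loader: Callable[[Path], str]) -> None:
        self.metadata = metadata
        self._loader = loader
        self._content: str | None = None
        self.load_count = 0  # exposed only to demonstrate the single load

    @property
    def title(self) -> str:
        # Answered from cached metadata; the source file is not read.
        return self.metadata.title

    @property
    def content(self) -> str:
        # The first access triggers the full load; later accesses reuse the result.
        if self._content is None:
            self._content = self._loader(Path(self.metadata.source_path))
            self.load_count += 1
        return self._content


def fake_loader(path: Path) -> str:
    # Stand-in for parsing the real source file.
    return f"<rendered body of {path.as_posix()}>"


page = LazyPageSketch(MetadataSketch("content/index.md", "Home"), fake_loader)
```

Reading `page.title` leaves `page.load_count` at zero, while the first read of `page.content` calls the loader exactly once. Navigation and filtering can therefore run on metadata alone.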

Classes

PageDiscoveryCacheEntry dataclass

Cache entry with metadata and validity information.

Attributes

Name       Type          Description
metadata   PageMetadata
cached_at  str
is_valid   bool

Methods 2

to_dict
def to_dict(self) -> dict[str, Any]
Returns

dict[str, Any]

from_dict staticmethod
def from_dict(data: dict[str, Any]) -> PageDiscoveryCacheEntry
Parameters 1
data dict[str, Any]
Returns

PageDiscoveryCacheEntry
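A round trip through `to_dict` and `from_dict` might look like the sketch below. `MetadataSketch` and `EntrySketch` are hypothetical stand-ins that mirror only the three documented attributes. The real implementation may serialize differently, and the default used for a missing `is_valid` key is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class MetadataSketch:
    """Hypothetical stand-in for PageMetadata."""
    source_path: str
    title: str


@dataclass
class EntrySketch:
    """Mirrors the three documented attributes of PageDiscoveryCacheEntry."""
    metadata: MetadataSketch
    cached_at: str
    is_valid: bool = True

    def to_dict(self) -> dict[str, Any]:
        # asdict recurses into the nested metadata dataclass.
        return asdict(self)

    @staticmethod
    def from_dict(data: dict[str, Any]) -> "EntrySketch":
        return EntrySketch(
            metadata=MetadataSketch(**data["metadata"]),
            cached_at=data["cached_at"],
            # Assumption: treat a missing flag as valid.
            is_valid=data.get("is_valid", True),
        )


entry = EntrySketch(MetadataSketch("content/index.md", "Home"), "2025-10-16T12:00:00")
restored = EntrySketch.from_dict(entry.to_dict())
```

Because dataclasses compare field by field, `restored == entry` holds when the round trip is lossless.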

PageDiscoveryCache

Persistent cache for page metadata enabling lazy page loading.

Purpose:

  • Store page metadata (title, date, tags, section, slug)
  • Enable incremental discovery (only load changed pages)
  • Support lazy loading of full page content on demand
  • Validate cache entries to detect stale data

Cache Format (JSON):

    {
      "pages": {
        "content/index.md": {
          "metadata": {
            "source_path": "content/index.md",
            "title": "Home",
            ...
          },
          "cached_at": "2025-10-16T12:00:00",
          "is_valid": true
        }
      }
    }

Note: If the cache format changes, loading fails and the cache is rebuilt automatically.
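The fall-back-on-failure behavior noted above could be sketched as follows. `load_pages_or_empty` is a hypothetical helper written for illustration. The actual `_load_from_disk` logic is not shown in this reference, so the exception handling here is an assumption about one reasonable way to do it.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def load_pages_or_empty(cache_path: Path) -> dict[str, Any]:
    """Return the "pages" mapping, or an empty dict if the file is unusable."""
    try:
        data = json.loads(cache_path.read_text(encoding="utf-8"))
        pages = data["pages"]
        if not isinstance(pages, dict):
            raise TypeError("'pages' must be a JSON object")
        return pages
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, malformed JSON, or unexpected layout: start fresh.
        return {}


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    good.write_text(
        json.dumps({"pages": {"content/index.md": {"is_valid": True}}}),
        encoding="utf-8",
    )
    bad = Path(tmp) / "bad.json"
    bad.write_text("{not json", encoding="utf-8")

    good_pages = load_pages_or_empty(good)
    bad_pages = load_pages_or_empty(bad)
    missing_pages = load_pages_or_empty(Path(tmp) / "missing.json")
```

Returning an empty mapping means every page is treated as uncached on the next build, so the cache is repopulated from scratch.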

Methods 11

save_to_disk
def save_to_disk(self) -> None

Save cache to disk.

has_metadata
def has_metadata(self, source_path: Path) -> bool

Check if metadata is cached for a page.

Parameters 1
source_path Path

Path to source file

Returns

bool

True if valid metadata exists in cache

get_metadata
def get_metadata(self, source_path: Path) -> PageMetadata | None

Get cached metadata for a page.

Parameters 1
source_path Path

Path to source file

Returns

PageMetadata | None

PageMetadata if found and valid, None otherwise

add_metadata
def add_metadata(self, metadata: PageMetadata) -> None

Add or update metadata in cache.

Parameters 1
metadata PageMetadata

PageMetadata to cache

invalidate
def invalidate(self, source_path: Path) -> None

Mark a cache entry as invalid.

Parameters 1
source_path Path

Path to source file to invalidate

invalidate_all
def invalidate_all(self) -> None

Invalidate all cache entries.

clear
def clear(self) -> None

Clear all cache entries.

get_valid_entries
def get_valid_entries(self) -> dict[str, PageMetadata]

Get all valid cached metadata entries.

Returns

dict[str, PageMetadata]

Dictionary mapping source_path to PageMetadata for valid entries

get_invalid_entries
def get_invalid_entries(self) -> dict[str, PageMetadata]

Get all invalid cached metadata entries.

Returns

dict[str, PageMetadata]

Dictionary mapping source_path to PageMetadata for invalid entries

validate_entry
def validate_entry(self, source_path: Path, current_file_hash: str) -> bool

Validate a cache entry against current file hash.

Parameters 2
source_path Path

Path to source file

current_file_hash str

Current hash of source file

Returns

bool

True if cache entry is valid (hash matches), False otherwise
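Hash-based validation of this kind can be sketched as below. Several details are assumptions, not taken from the source: SHA-256 as the hash function, a plain `dict` of stored hashes standing in for the cache, and the helper names `hash_bytes` and `validate_sketch`. The reference does not say which hash is used or where the stored hash lives.

```python
import hashlib


def hash_bytes(data: bytes) -> str:
    # SHA-256 is an assumption; the source does not name the hash function.
    return hashlib.sha256(data).hexdigest()


def validate_sketch(
    cached_hashes: dict[str, str], source_path: str, current_file_hash: str
) -> bool:
    """Return True only if an entry exists and its stored hash matches."""
    cached = cached_hashes.get(source_path)
    return cached is not None and cached == current_file_hash


original = b"# Home\n"
cached_hashes = {"content/index.md": hash_bytes(original)}

unchanged = validate_sketch(cached_hashes, "content/index.md", hash_bytes(original))
edited = validate_sketch(
    cached_hashes, "content/index.md", hash_bytes(b"# Home (edited)\n")
)
unknown = validate_sketch(cached_hashes, "content/new.md", hash_bytes(b""))
```

Only the unchanged file validates. An edited file and a path with no cache entry both return False, which would send those pages through full discovery.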

stats
def stats(self) -> dict[str, int]

Get cache statistics.

Returns

dict[str, int]

Dictionary with cache stats (total, valid, invalid)
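As a rough illustration of the returned shape, the counts could be derived from per-entry validity flags as in this sketch. The key names `total`, `valid`, and `invalid` are taken from the description above, but their exact spelling in the real return value is an assumption, and `stats_sketch` is a hypothetical helper.

```python
def stats_sketch(validity: dict[str, bool]) -> dict[str, int]:
    """Count entries by validity flag (source_path -> is_valid)."""
    valid = sum(1 for ok in validity.values() if ok)
    return {
        "total": len(validity),
        "valid": valid,
        "invalid": len(validity) - valid,
    }


validity = {
    "content/index.md": True,
    "content/about.md": True,
    "content/old.md": False,
}
counts = stats_sketch(validity)
```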

Internal Methods 2
__init__
def __init__(self, cache_path: Path | None = None)

Initialize cache.

Parameters 1
cache_path Path | None

Path to cache file (defaults to .bengal/page_metadata.json)

_load_from_disk
def _load_from_disk(self) -> None

Load cache from disk if it exists.