Module

cache.build_cache.parsed_content_cache

Parsed content caching mixin for BuildCache.

Provides methods for caching parsed Markdown content (HTML, TOC, AST) so that re-parsing can be skipped when only templates change. This is Optimization #2 from the cache RFC.

Key Concepts:

  • Caches rendered HTML (post-markdown, pre-template)
  • Caches TOC and structured TOC items
  • Optionally caches true AST for parse-once, use-many patterns
  • Validates against metadata, template, and parser version

Related Modules:

  • bengal.cache.build_cache.core: Main BuildCache class
  • bengal.rendering.pipeline: Markdown parsing pipeline
  • plan/active/rfc-content-ast-architecture.md: AST caching RFC

Classes

ParsedContentCacheMixin

Mixin providing parsed content caching (Optimization #2).

Requires these attributes on the host class:

  • parsed_content: dict[str, dict[str, Any]]
  • dependencies: dict[str, set[str]]
  • is_changed: Callable[[Path], bool] (from FileTrackingMixin)
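
The host-class requirements listed above can be written down as a structural type. The following is a minimal sketch, not part of the documented module: `ParsedContentHost` and `ToyHost` are hypothetical names introduced for illustration, and the change tracking in `ToyHost` is a stand-in for whatever FileTrackingMixin actually does.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ParsedContentHost(Protocol):
    """Hypothetical protocol mirroring the attributes the mixin expects."""

    parsed_content: dict[str, dict[str, Any]]
    dependencies: dict[str, set[str]]

    def is_changed(self, file_path: Path) -> bool: ...


class ToyHost:
    """Illustrative stand-in for a host class, not the real BuildCache."""

    def __init__(self) -> None:
        self.parsed_content: dict[str, dict[str, Any]] = {}
        self.dependencies: dict[str, set[str]] = {}
        # Assumption: a plain set replaces real hash-based change tracking.
        self._changed: set[Path] = set()

    def is_changed(self, file_path: Path) -> bool:
        return file_path in self._changed


host = ToyHost()
# isinstance() on a runtime-checkable protocol only checks that the
# members exist; it does not validate their types.
satisfies = isinstance(host, ParsedContentHost)
```

A protocol like this lets a type checker flag a host class that forgets one of the three members, without the mixin having to import the host.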

Attributes

Name            Type                        Description
parsed_content  dict[str, dict[str, Any]]
dependencies    dict[str, set[str]]

Methods 5

is_changed
def is_changed(self, file_path: Path) -> bool

Check if file changed (provided by FileTrackingMixin).

Parameters 1
file_path Path
Returns

bool

store_parsed_content
def store_parsed_content(self, file_path: Path, html: str, toc: str, toc_items: list[dict[str, Any]], metadata: dict[str, Any], template: str, parser_version: str, ast: list[dict[str, Any]] | None = None) -> None

Store parsed content in cache (Optimization #2 + AST caching).

This allows skipping markdown parsing when only templates change, resulting in 20-30% faster builds in that scenario.

Phase 3 Enhancement (RFC-content-ast-architecture):

  • Also caches the true AST for parse-once, use-many patterns
  • AST enables faster TOC/link extraction and plain text generation

RFC: rfc-incremental-hot-reload-invariants Phase 3:

  • Also caches nav_metadata_hash for fine-grained section index change detection

Parameters 8
file_path Path

Path to source file

html str

Rendered HTML (post-markdown, pre-template)

toc str

Table of contents HTML

toc_items list[dict[str, Any]]

Structured TOC data

metadata dict[str, Any]

Page metadata (frontmatter)

template str

Template name used

parser_version str

Parser version string (e.g., "mistune-3.0-toc2")

ast list[dict[str, Any]] | None

True AST tokens from parser (optional, for Phase 3)
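
To make the parameters above concrete, here is a rough sketch of what storing an entry might look like. It is an illustration under stated assumptions, not the module's implementation: `hash_metadata` and `store_entry` are hypothetical helpers, the entry's key names and the SHA-256-over-JSON hashing scheme are guesses, and the nav_metadata_hash mentioned above is left out.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path
from typing import Any


def hash_metadata(metadata: dict[str, Any]) -> str:
    """Return a stable digest of frontmatter (hashing scheme is an assumption)."""
    # sort_keys makes the digest independent of key order;
    # default=str covers values such as dates that JSON cannot encode.
    payload = json.dumps(metadata, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def store_entry(
    parsed_content: dict[str, dict[str, Any]],
    file_path: Path,
    html: str,
    toc: str,
    toc_items: list[dict[str, Any]],
    metadata: dict[str, Any],
    template: str,
    parser_version: str,
    ast: list[dict[str, Any]] | None = None,
) -> None:
    """Record parse results keyed by source path (entry layout is hypothetical)."""
    parsed_content[str(file_path)] = {
        "html": html,
        "toc": toc,
        "toc_items": toc_items,
        # Only the hash is kept, so a later lookup can compare cheaply.
        "metadata_hash": hash_metadata(metadata),
        "template": template,
        "parser_version": parser_version,
        "ast": ast,
    }
```

Storing a hash of the metadata rather than the metadata itself keeps entries small while still letting a lookup detect frontmatter edits.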

get_parsed_content
def get_parsed_content(self, file_path: Path, metadata: dict[str, Any], template: str, parser_version: str) -> dict[str, Any] | None

Get cached parsed content if valid (Optimization #2).

Validates that:

  1. Content file hasn't changed (via file_hashes)
  2. Metadata hasn't changed (via metadata_hash)
  3. Selected template hasn't changed (via template name)
  4. Parser version matches (avoids incompatibilities between parser versions)
  5. Template file hasn't changed (via dependencies)

Parameters 4
file_path Path

Path to source file

metadata dict[str, Any]

Current page metadata

template str

Current template name

parser_version str

Current parser version

Returns

dict[str, Any] | None

Cached data dict if valid, None if invalid or not found
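
The five validation checks listed for this method can be sketched as a chain of early returns. This is a simplified illustration, not the module's code: `lookup_entry` and `hash_metadata` are hypothetical helpers, the entry keys are assumed, and the fifth check treats every recorded dependency as a template file, which the real implementation may well narrow down.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path
from typing import Any, Callable


def hash_metadata(metadata: dict[str, Any]) -> str:
    """Return a stable digest of frontmatter (hashing scheme is an assumption)."""
    payload = json.dumps(metadata, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def lookup_entry(
    parsed_content: dict[str, dict[str, Any]],
    dependencies: dict[str, set[str]],
    is_changed: Callable[[Path], bool],
    file_path: Path,
    metadata: dict[str, Any],
    template: str,
    parser_version: str,
) -> dict[str, Any] | None:
    """Return the cached entry only if every validation check passes."""
    key = str(file_path)
    entry = parsed_content.get(key)
    if entry is None:
        return None  # nothing cached for this file
    if is_changed(file_path):
        return None  # 1. content file changed
    if entry.get("metadata_hash") != hash_metadata(metadata):
        return None  # 2. frontmatter changed
    if entry.get("template") != template:
        return None  # 3. a different template is now selected
    if entry.get("parser_version") != parser_version:
        return None  # 4. parser version mismatch
    # 5. Simplification: any changed dependency counts as a template change.
    for dep in dependencies.get(key, set()):
        if is_changed(Path(dep)):
            return None
    return entry
```

A caller would fall back to a full Markdown parse whenever this returns None and store the fresh result afterwards.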

invalidate_parsed_content
def invalidate_parsed_content(self, file_path: Path) -> None

Remove cached parsed content for a file.

Parameters 1
file_path Path

Path to file

get_parsed_content_stats
def get_parsed_content_stats(self) -> dict[str, Any]

Get parsed content cache statistics.

Returns

dict[str, Any]

Dictionary with cache stats