Classes
IndexEntry
dataclass
A single entry in a query index.
Represents one index key (e.g., 'blog' section, 'Jane Smith' author) and all pages that match that key.
Implements the Cacheable protocol for type-safe serialization.
Attributes
| Name | Type | Description |
|---|---|---|
| key | str | Index key (e.g., 'blog', 'Jane Smith', '2024') |
| page_paths | list[str] | List of page source paths (strings, not Page objects) |
| metadata | dict[str, Any] | Extra data for display (e.g., section title, author email) |
| updated_at | str | ISO timestamp of last update |
| content_hash | str | Hash of page_paths for change detection |
Methods 4
to_cache_dict
def to_cache_dict(self) -> dict[str, Any]
Serialize to cache-friendly dictionary (Cacheable protocol).
Returns
dict[str, Any]
from_cache_dict
classmethod
def from_cache_dict(cls, data: dict[str, Any]) -> IndexEntry
Deserialize from cache dictionary (Cacheable protocol).
Parameters 1
| Name | Type | Description |
|---|---|---|
| data | dict[str, Any] | |
Returns
IndexEntry
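The round trip through to_cache_dict and from_cache_dict can be sketched as follows. This is a minimal illustration, not the library's implementation: `EntrySketch` is a hypothetical stand-in for IndexEntry, and the default values and the use of JSON as the cache format are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class EntrySketch:
    """Hypothetical stand-in for IndexEntry; not the library's class."""

    key: str
    page_paths: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    updated_at: str = ""
    content_hash: str = ""

    def to_cache_dict(self) -> dict[str, Any]:
        # Every field is a string, list, or dict, so the result is JSON-friendly.
        return asdict(self)

    @classmethod
    def from_cache_dict(cls, data: dict[str, Any]) -> "EntrySketch":
        # Copy the containers so the entry does not share state with `data`.
        return cls(
            key=data["key"],
            page_paths=list(data.get("page_paths", [])),
            metadata=dict(data.get("metadata", {})),
            updated_at=data.get("updated_at", ""),
            content_hash=data.get("content_hash", ""),
        )


entry = EntrySketch(key="blog", page_paths=["content/blog/a.md"], metadata={"title": "Blog"})
restored = EntrySketch.from_cache_dict(json.loads(json.dumps(entry.to_cache_dict())))
```

Because page paths are stored as strings rather than Page objects, the dictionary survives a JSON encode and decode without custom serializers.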
to_dict
def to_dict(self) -> dict[str, Any]
Alias for to_cache_dict (backward compatibility).
Returns
dict[str, Any]
from_dict
staticmethod
def from_dict(data: dict[str, Any]) -> IndexEntry
Alias for from_cache_dict (backward compatibility).
Parameters 1
| Name | Type | Description |
|---|---|---|
| data | dict[str, Any] | |
Returns
IndexEntry
Internal Methods 2
__post_init__
def __post_init__(self) -> None
Compute content hash if not provided.
_compute_hash
def _compute_hash(self) -> str
Compute hash of page_paths for change detection.
Returns
str
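One way the two internal methods could fit together is sketched below. The hash algorithm is not stated in this reference, so SHA-256 over the sorted paths is an assumption, and `HashedEntry` and `compute_paths_hash` are hypothetical names used only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def compute_paths_hash(page_paths: list[str]) -> str:
    """Hash a list of paths; sorting makes the result order-independent."""
    joined = "\n".join(sorted(page_paths))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


@dataclass
class HashedEntry:
    """Hypothetical stand-in showing only the hashing behaviour."""

    key: str
    page_paths: list[str] = field(default_factory=list)
    content_hash: str = ""

    def __post_init__(self) -> None:
        # Compute the hash only when the caller did not supply one.
        if not self.content_hash:
            self.content_hash = compute_paths_hash(self.page_paths)


a = HashedEntry("blog", ["b.md", "a.md"])
b = HashedEntry("blog", ["a.md", "b.md"])
c = HashedEntry("blog", ["a.md"], content_hash="precomputed")
```

In this sketch `a` and `b` receive the same hash because the paths are sorted first, while `c` keeps the value it was given, which is the case when an entry is restored from cache.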
QueryIndex
abstract
Base class for queryable indexes.
Subclasses define:
- What to index (e.g., by_section, by_author, by_tag)
- How to extract keys from pages
- Optionally: custom serialization logic
The base class handles:
- Index storage and persistence
- Incremental updates
- Change detection
- O(1) lookups
Methods 10
extract_keys
def extract_keys(self, page: Page) -> list[tuple[str, dict[str, Any]]]
Extract index keys from a page.
Returns list of (key, metadata) tuples. Can return multiple keys for multi-valued indexes (e.g., multi-author papers, multiple tags).
Parameters 1
| Name | Type | Description |
|---|---|---|
| page | Page | Page to extract keys from |
Returns
list[tuple[str, dict[str, Any]]]
List of (key, metadata) tuples
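A multi-valued extractor might look like the sketch below. `PageStub`, its `metadata` attribute, the `authors` front-matter field, and the `display_name` metadata key are all assumptions made for illustration; the real Page class is not described in this reference.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class PageStub:
    """Hypothetical stand-in for Page, exposing only front-matter metadata."""

    source_path: str
    metadata: dict[str, Any] = field(default_factory=dict)


def extract_author_keys(page: PageStub) -> list[tuple[str, dict[str, Any]]]:
    """Return one (key, metadata) tuple per author listed on the page."""
    authors = page.metadata.get("authors", [])
    if isinstance(authors, str):
        # Accept a single author written as a plain string.
        authors = [authors]
    return [(name, {"display_name": name}) for name in authors]


paper = PageStub("content/papers/x.md", {"authors": ["Jane Smith", "Ali Khan"]})
keys = extract_author_keys(paper)
```

A page with two authors yields two tuples, so it is listed under both index keys; a page without the field yields an empty list and is not indexed.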
update_page
def update_page(self, page: Page, build_cache: BuildCache) -> set[str]
Update index for a single page.
Handles:
- Removing page from old keys
- Adding page to new keys
- Tracking affected keys for incremental regeneration
Parameters 2
| Name | Type | Description |
|---|---|---|
| page | Page | Page to update |
| build_cache | BuildCache | Build cache for dependency tracking |
Returns
set[str]
Set of affected index keys (need regeneration)
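The three steps above can be sketched with plain dictionaries. `TinyIndex` is a hypothetical in-memory stand-in, it takes an already extracted set of keys instead of a Page, and it omits the BuildCache interaction. Whether the real method reports only keys whose membership changed, as this sketch does, is an assumption.

```python
class TinyIndex:
    """Hypothetical in-memory index; not the library's QueryIndex."""

    def __init__(self) -> None:
        self.entries: dict[str, list[str]] = {}    # key -> page paths
        self.page_keys: dict[str, set[str]] = {}   # page path -> keys

    def update_page(self, page_path: str, new_keys: set[str]) -> set[str]:
        old_keys = self.page_keys.get(page_path, set())
        # Remove the page from keys it no longer belongs to.
        for key in old_keys - new_keys:
            self.entries[key].remove(page_path)
            if not self.entries[key]:
                del self.entries[key]
        # Add the page to keys it has gained.
        for key in new_keys - old_keys:
            self.entries.setdefault(key, []).append(page_path)
        self.page_keys[page_path] = set(new_keys)
        # Keys whose membership changed are the ones to regenerate.
        return old_keys ^ new_keys


index = TinyIndex()
first = index.update_page("a.md", {"python", "web"})
second = index.update_page("a.md", {"python", "cli"})
```

Here `second` contains 'web' and 'cli' but not 'python', since the page stayed under that key and its listing does not need to be rebuilt.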
remove_page
def remove_page(self, page_path: str) -> set[str]
Remove page from all index entries.
Parameters 1
| Name | Type | Description |
|---|---|---|
| page_path | str | Path to page source file |
Returns
set[str]
Set of affected keys
get
def get(self, key: str) -> list[str]
Get page paths for index key (O(1) lookup).
Parameters 1
| Name | Type | Description |
|---|---|---|
| key | str | Index key |
Returns
list[str]
List of page paths (copy, safe to modify)
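The copy semantics can be illustrated with a module-level dictionary standing in for the index storage. The `entries` name is hypothetical, and returning an empty list for an unknown key is an assumption; this reference does not say what get returns in that case.

```python
entries: dict[str, list[str]] = {"blog": ["content/blog/a.md", "content/blog/b.md"]}


def get(key: str) -> list[str]:
    """Dictionary lookup is O(1) on average; the result is a fresh list."""
    return list(entries.get(key, []))


paths = get("blog")
paths.append("scratch.md")  # mutates only the returned copy
```

After the append, `entries["blog"]` still holds two paths, so callers can sort or filter the result without corrupting the index.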
keys
def keys(self) -> list[str]
Get all index keys.
Returns
list[str]
has_changed
def has_changed(self, key: str, page_paths: list[str]) -> bool
Check if index entry changed (for skip optimization).
Compares page_paths as sets (order doesn't matter for most use cases).
Parameters 2
| Name | Type | Description |
|---|---|---|
| key | str | Index key |
| page_paths | list[str] | New list of page paths |
Returns
bool
True if entry changed and needs regeneration
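The set comparison could be sketched as a free function over a plain dictionary. The `stored` parameter is a hypothetical stand-in for the index's internal state, and treating an unknown key as changed is an assumption.

```python
def has_changed(stored: dict[str, list[str]], key: str, page_paths: list[str]) -> bool:
    """Return True when the key is new or its set of page paths differs."""
    if key not in stored:
        return True  # assumption: an unseen key always needs generation
    return set(stored[key]) != set(page_paths)


stored = {"blog": ["a.md", "b.md"]}
reordered = has_changed(stored, "blog", ["b.md", "a.md"])
grown = has_changed(stored, "blog", ["a.md", "b.md", "c.md"])
```

With this approach a reordered list is reported as unchanged, which lets a build skip regenerating a listing whose membership is the same. Indexes where page order is significant would need a list comparison instead.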
get_metadata
def get_metadata(self, key: str) -> dict[str, Any]
Get metadata for index key.
Parameters 1
| Name | Type | Description |
|---|---|---|
| key | str | Index key |
Returns
dict[str, Any]
Metadata dict (empty if key not found)
save_to_disk
def save_to_disk(self) -> None
Persist index to disk.
clear
def clear(self) -> None
Clear all index data.
stats
def stats(self) -> dict[str, Any]
Get index statistics.
Returns
dict[str, Any]
Dictionary with index stats
Internal Methods 5
__init__
def __init__(self, name: str, cache_path: Path)
Initialize query index.
Parameters 2
| Name | Type | Description |
|---|---|---|
| name | str | Index name (e.g., 'section', 'author') |
| cache_path | Path | Path to cache file (e.g., .bengal/indexes/section_index.json) |
_load_from_disk
def _load_from_disk(self) -> None
Load index from disk.
_add_page_to_key
def _add_page_to_key(self, key: str, page_path: str, metadata: dict[str, Any]) -> None
Add page to index key.
Parameters 3
| Name | Type | Description |
|---|---|---|
| key | str | Index key |
| page_path | str | Path to page source file |
| metadata | dict[str, Any] | Metadata to store with this entry |
_remove_page_from_key
def _remove_page_from_key(self, key: str, page_path: str) -> None
Remove page from index key.
Parameters 2
| Name | Type | Description |
|---|---|---|
| key | str | Index key |
| page_path | str | Path to page source file |
__repr__
def __repr__(self) -> str
String representation.
Returns
str