Module

utils.observability

Observability utilities for systematic stats collection across Bengal's build pipeline.

Provides standardized stats collection and formatting for debugging performance issues, cache effectiveness, and processing bottlenecks.

This module implements the observability improvements described in the RFC rfc-observability-improvements.md.

Key Concepts:

  • ComponentStats: Standardized stats container with counts, cache metrics, and sub-timings
  • HasStats: Protocol for components that expose observability stats
  • Consistent formatting for CLI output and structured logging

Usage:

>>> from bengal.utils.observability import ComponentStats, HasStats
>>> stats = ComponentStats(items_total=100, items_processed=80)
>>> stats.items_skipped["filtered"] = 20
>>> print(stats.format_summary("MyComponent"))
MyComponent: processed=80/100 | skipped=[filtered=20]

Related:

  • bengal/health/report.py: ValidatorStats (extends ComponentStats pattern)
  • bengal/orchestration/build/finalization.py: CLI output integration
  • plan/active/rfc-observability-improvements.md: Design rationale

Classes

HasStats

Protocol for components that expose observability stats.

Components implementing this protocol can have their stats displayed automatically when phases exceed performance thresholds.

Inherits from Protocol

Attributes

Name Type Description
last_stats ComponentStats | None
ComponentStats dataclass

Standardized stats container for any build component.

Provides a uniform interface for tracking:

  • Processing counts (total, processed, skipped by reason)
  • Cache effectiveness (hits, misses, hit rate)
  • Sub-operation timings (analyze, render, validate, etc.)
  • Custom metrics (component-specific values)

Attributes

Name Type Description
items_total int

Total items to process

items_processed int

Items actually processed

items_skipped dict[str, int]

Dict of skip reasons and counts (e.g., {"autodoc": 450, "draft": 3})

cache_hits int

Number of cache hits (if applicable)

cache_misses int

Number of cache misses (if applicable)

sub_timings dict[str, float]

Dict mapping sub-operation names to durations in milliseconds

metrics dict[str, int | float | str]

Custom metrics (component-specific, e.g., {"pages_per_sec": 375})

Methods

cache_hit_rate property
def cache_hit_rate(self) -> float

Cache hit rate as a percentage (0-100).

Returns

float

Percentage of cache hits, or 0.0 if no cache operations.
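
The documented behavior can be sketched as a standalone function. This assumes the rate is hits divided by total lookups (hits plus misses), which the description implies but does not state outright; the function below is an illustration, not the property's actual source.

```python
def cache_hit_rate(cache_hits: int, cache_misses: int) -> float:
    """Return hits as a percentage of all lookups, or 0.0 when there were none."""
    total = cache_hits + cache_misses
    if total == 0:
        return 0.0  # documented fallback: no cache operations
    return 100.0 * cache_hits / total


rate_mixed = cache_hit_rate(80, 20)
rate_empty = cache_hit_rate(0, 0)
print(rate_mixed, rate_empty)
```

The zero check avoids a `ZeroDivisionError` for components that never touch a cache.
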

skip_rate property
def skip_rate(self) -> float

Skip rate as a percentage (0-100).

Returns

float

Percentage of items skipped, or 0.0 if no items.

total_skipped property
def total_skipped(self) -> int

Total number of skipped items across all reasons.

Returns

int
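
A rough sketch of how `total_skipped` and `skip_rate` relate, written as free functions for illustration. Summing the per-reason counts follows from the description above; using `items_total` as the denominator for the rate is an assumption.

```python
def total_skipped(items_skipped: dict[str, int]) -> int:
    """Sum the skip counts across every reason."""
    return sum(items_skipped.values())


def skip_rate(items_skipped: dict[str, int], items_total: int) -> float:
    """Skipped items as a percentage of items_total (assumed denominator)."""
    if items_total == 0:
        return 0.0  # documented fallback: no items
    return 100.0 * total_skipped(items_skipped) / items_total


skipped = {"autodoc": 450, "draft": 3}
skipped_count = total_skipped(skipped)
rate = skip_rate(skipped, 1000)
print(skipped_count, rate)
```
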

format_summary
def format_summary(self, name: str = '') -> str

Format stats for CLI output.

Produces a compact, informative summary suitable for terminal display. Only includes sections with actual data.

Parameters
name str

Component name prefix (e.g., "Directives", "Links")

Returns

str

Formatted string like "processed=80/100 | skipped=[autodoc=450] | cache=80/80 (100%)"
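
To make the "only includes sections with actual data" rule concrete, here is an approximate re-implementation that reproduces the two example strings shown in this page. `StatsSketch` and `format_summary_sketch` are hypothetical names; the `", "` separator between multiple skip reasons and the omission of sub-timings and custom metrics are assumptions, since their output format is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class StatsSketch:
    """Stand-in holding only the fields this sketch formats."""

    items_total: int = 0
    items_processed: int = 0
    items_skipped: dict[str, int] = field(default_factory=dict)
    cache_hits: int = 0
    cache_misses: int = 0


def format_summary_sketch(stats: StatsSketch, name: str = "") -> str:
    parts: list[str] = []
    if stats.items_total:
        parts.append(f"processed={stats.items_processed}/{stats.items_total}")
    if stats.items_skipped:
        # Separator between reasons is an assumption.
        reasons = ", ".join(f"{reason}={count}" for reason, count in stats.items_skipped.items())
        parts.append(f"skipped=[{reasons}]")
    lookups = stats.cache_hits + stats.cache_misses
    if lookups:
        rate = 100.0 * stats.cache_hits / lookups
        parts.append(f"cache={stats.cache_hits}/{lookups} ({rate:.0f}%)")
    body = " | ".join(parts)  # sections with no data never reach this join
    return f"{name}: {body}" if name else body


basic = format_summary_sketch(
    StatsSketch(items_total=100, items_processed=80, items_skipped={"filtered": 20}),
    "MyComponent",
)
cached = format_summary_sketch(
    StatsSketch(items_total=100, items_processed=80, items_skipped={"autodoc": 450}, cache_hits=80)
)
print(basic)
print(cached)
```
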

to_log_context
def to_log_context(self) -> dict[str, int | float | str]

Convert to flat dict for structured logging.

Flattens nested data structures for log aggregation systems.

Returns

dict[str, int | float | str]

Flat dictionary suitable for structured logging kwargs.

Functions

format_phase_stats
def format_phase_stats(phase_name: str, duration_ms: float, component: HasStats | None, slow_threshold_ms: float = 1000) -> str | None

Format stats for a slow phase, if applicable.

Returns a formatted stats string only if the phase exceeded the threshold and the component has stats available.

Parameters

Name Type Default Description
phase_name str

Name of the phase (e.g., "Directives", "Links")

duration_ms float

How long the phase took, in milliseconds

component HasStats | None

Component implementing the HasStats protocol (or None)

slow_threshold_ms float 1000

Threshold in milliseconds above which a phase is considered "slow"

Returns

str | None

Formatted stats string, or None if the phase was fast or no stats are available.