Health Check System

Bengal includes a health check system that validates source content and author-facing policy. Generated artifact correctness is split out into `bengal audit`, while rendering owns the URL and anchor registries used by link validation.

Validation Surfaces

Surface	Command/API	Owner	Purpose
Source policy checks	`bengal check`	`bengal/health/`	Config, directives, navigation, taxonomy, connectivity, and author-facing link checks
Compatibility alias	`bengal health`	`bengal/cli/`	Legacy entrypoint for `bengal check` while automation migrates
Artifact audit	`bengal audit`	`bengal/audit/`	Post-build scan of generated HTML references and output files
Reference truth	`bengal/rendering/reference_registry.py`	`bengal/rendering/`	Rendered URLs, source paths, anchors, and auxiliary output URLs

Health Check (`bengal/health/health_check.py`)

Purpose: Orchestrates validators and produces unified health reports
Features:
- Modular validator architecture
- Fast execution (< 100ms per validator)
- Configurable per-validator enable/disable
- Console and JSON report formats
- Integration with build stats

Usage:

  from bengal.health import HealthCheck

  # `site` is a loaded site object; `stats` is the value returned by site.build()
  health = HealthCheck(site)
  report = health.run(build_stats=stats)
  print(report.format_console())

Base Validator (`bengal/health/base.py`)

Purpose: Abstract base class for all validators
Interface: `validate(site, build_context=None) -> list[CheckResult]`
Features:
- Independent execution (no validator dependencies)
- Error handling and crash recovery
- Performance tracking per validator
- Configuration-based enablement
- Access to cached build artifacts via `build_context`
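
The interface above can be illustrated with a minimal, self-contained sketch. It does not import Bengal: `CheckResult` and `BaseValidator` are stand-ins whose fields are assumptions, `TitleValidator` and `run_safely` are hypothetical names, and the site is modelled as a plain dict rather than Bengal's real site object.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class CheckResult:
    """Stand-in for Bengal's CheckResult; these fields are assumptions."""
    status: str
    message: str
    recommendation: str | None = None


class BaseValidator(ABC):
    """Stand-in for the abstract base class described above."""
    name = "base"

    @abstractmethod
    def validate(self, site: Any, build_context: Any = None) -> list[CheckResult]:
        ...


class TitleValidator(BaseValidator):
    """Hypothetical validator: warn about pages that have no title."""
    name = "titles"

    def validate(self, site: Any, build_context: Any = None) -> list[CheckResult]:
        results = []
        for page in site["pages"]:
            if not page.get("title"):
                results.append(
                    CheckResult(
                        "warning",
                        f"{page['path']} has no title",
                        "Add a title to the front matter.",
                    )
                )
        if not results:
            results.append(CheckResult("success", "All pages have titles"))
        return results


def run_safely(validator: BaseValidator, site: Any, build_context: Any = None) -> list[CheckResult]:
    """Crash recovery: an exception becomes a single error result."""
    try:
        return validator.validate(site, build_context)
    except Exception as exc:
        return [CheckResult("error", f"{validator.name} crashed: {exc}")]
```

Because each validator only receives `site` and an optional `build_context`, a wrapper like `run_safely` can isolate failures without validators knowing about each other.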

Health Report (`bengal/health/report/`)

Purpose: Unified reporting structure for health check results
Components:
- CheckStatus: SUCCESS, INFO, SUGGESTION, WARNING, ERROR (ordered by severity)
- CheckResult: Individual check result with recommendation
- ValidatorReport: Results from a single validator
- HealthReport: Aggregated report from all validators
- ReportEnvelope: Versioned CLI/machine envelope for Milo and Kida output
Formats:
- Console output (colored, progressive disclosure)
- JSON output (machine-readable)
- Versioned result envelope (bengal.check.v1)
- Summary statistics (pass/warning/error counts)
- Quality scoring (0-100 with ratings)
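
As a rough illustration of the summary statistics, the sketch below orders the five statuses by severity and aggregates a list of them. The numeric values, the `summarize` helper, and especially the scoring formula are assumptions made for this example; the source does not specify how Bengal computes its 0-100 quality score.

```python
from collections import Counter
from enum import IntEnum


class CheckStatus(IntEnum):
    """Statuses from the list above, numbered so higher means more severe."""
    SUCCESS = 0
    INFO = 1
    SUGGESTION = 2
    WARNING = 3
    ERROR = 4


def summarize(statuses):
    """Return per-status counts, the worst status, and an illustrative score."""
    counts = Counter(statuses)
    worst = max(statuses, default=CheckStatus.SUCCESS)
    total = len(statuses)
    # Illustrative formula only: a warning costs half a check, an error a full one.
    penalty = 0.5 * counts[CheckStatus.WARNING] + 1.0 * counts[CheckStatus.ERROR]
    score = 100 if total == 0 else round(100 * (1 - penalty / total))
    return {"counts": dict(counts), "worst": worst, "score": score}
```

Using an `IntEnum` keeps "ordered by severity" explicit, so the worst status of a report is simply the maximum.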

Artifact Audit (`bengal/audit/`)

Artifact audit is intentionally cheaper than a full health check and independent of it. It scans the generated output directory after a build, extracts `href` and `src` references from HTML, skips external protocols, and verifies that internal references resolve to files, clean URL directories, or `.html` files in the output tree.

bengal build
bengal audit
bengal audit --json

Audit output uses the `bengal.audit.v1` envelope and renders through the same Kida validation report template as source checks.
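
The scan described in this section can be approximated with the standard library alone. The sketch below is not Bengal's implementation: `RefCollector`, `resolves`, and `audit` are hypothetical names, and treating a directory as valid only when it contains `index.html` is an assumption about what "clean URL directory" means.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class RefCollector(HTMLParser):
    """Collect href and src attribute values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.refs.append(value)


def resolves(output_dir: Path, page: Path, ref: str) -> bool:
    """True if a reference is external or maps to something in the output tree."""
    parts = urlsplit(ref)
    if parts.scheme or parts.netloc:
        return True  # external protocol (https:, mailto:, ...): skipped
    path = unquote(parts.path)
    if not path:
        return True  # pure fragment or query such as "#top": same page
    if path.startswith("/"):
        target = output_dir / path.lstrip("/")  # site-absolute reference
    else:
        target = page.parent / path  # relative to the referencing page
    if target.is_file():
        return True
    if target.is_dir():
        # Assumption: a clean URL directory must contain index.html.
        return (target / "index.html").is_file()
    return target.with_name(target.name + ".html").is_file()


def audit(output_dir: Path):
    """Return (page, ref) pairs for internal references that do not resolve."""
    broken = []
    for page in sorted(output_dir.rglob("*.html")):
        collector = RefCollector()
        collector.feed(page.read_text(encoding="utf-8"))
        for ref in collector.refs:
            if not resolves(output_dir, page, ref):
                broken.append((page.relative_to(output_dir).as_posix(), ref))
    return broken
```

The sketch reads only the output directory, which is why this kind of check can run after any build without access to the source content or the rendering registries.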

Remediation (`bengal/health/remediation/`)

The remediation subpackage provides automated fixes for common validation errors.

Purpose: Generate and apply fixes from health check results
Components:
- AutoFixer: Framework for generating and applying fixes
- FixAction: Single fix with metadata and application logic
- FixSafety: Safety classification (SAFE, CONFIRM, UNSAFE)

Usage:

  from bengal.health.remediation import AutoFixer, FixSafety

  # `report` is the report returned by HealthCheck.run()
  fixer = AutoFixer(report, site_root=site.root_path)
  fixes = fixer.suggest_fixes()
  # Apply only the fixes classified as SAFE
  safe_fixes = [f for f in fixes if f.safety == FixSafety.SAFE]
  results = fixer.apply_fixes(safe_fixes)

Validators (`bengal/health/validators/`)

Validators are registered in phases based on execution cost and dependencies.

Phase 1 - Core Validation:

| Validator | Validates |
|-----------|-----------|
| ConfigValidatorWrapper | Configuration validity, essential fields, common issues |
| URLCollisionValidator | Duplicate URL detection (catches conflicts early) |
| OwnershipPolicyValidator | URL ownership and content governance |

Phase 2 - Content Validation:

| Validator | Validates |
|-----------|-----------|
| RenderingValidator | HTML quality, unrendered template syntax, SEO/social metadata, JSON-LD syntax |
| DirectiveValidator | Directive syntax, completeness, and performance |
| NavigationValidator | Page navigation (next/prev, breadcrumbs, ancestors) |
| MenuValidator | Menu structure integrity, circular reference detection |
| TaxonomyValidator | Tags, categories, archives, pagination integrity |
| TrackValidator | Learning track structure and progression |
| LinkValidatorWrapper | Broken links detection (internal and external) |
| AnchorValidator | Explicit anchor targets and cross-reference integrity |

Phase 3 - Advanced Validation:

| Validator | Validates |
|-----------|-----------|
| CacheValidator | Incremental build cache integrity and consistency |
| PerformanceValidator | Build performance metrics and bottleneck detection |

Phase 4 - Production-Ready Validation:

| Validator | Validates |
|-----------|-----------|
| RSSValidator | RSS feed quality, XML validity, URL formatting |
| SitemapValidator | Sitemap.xml validity for SEO, no duplicate URLs |
| FontValidator | Font downloads, CSS generation, file sizes |
| AssetValidator | Asset optimization, minification hints, size analysis |

Phase 5 - Knowledge Graph Validation:

| Validator | Validates |
|-----------|-----------|
| ConnectivityValidator | Page connectivity using semantic link model and weighted scoring |

Specialized Validators (not auto-registered):

| Validator | Validates |
|-----------|-----------|
| AutodocValidator | API documentation HTML structure validation |
| OutputValidator | Page sizes, asset presence (redundant with build errors) |
| CrossReferenceValidator | Internal cross-reference resolution |
| AccessibilityValidator | WCAG compliance and accessibility checks |
| AssetURLValidator | Asset URL resolution and validation |

Utility Classes (not BaseValidator subclasses):

| Class | Purpose |
|-------|---------|
| TemplateValidator | Jinja2 template syntax validation (requires TemplateEngine) |

Connectivity Validator

The Connectivity Validator uses a semantic link model with weighted scoring to provide nuanced page connectivity analysis beyond binary orphan detection.

Link Types and Weights:

| Link Type | Weight | Description |
|-----------|--------|-------------|
| Explicit | 1.0 | Human-authored markdown links |
| Menu | 10.0 | Navigation menu items (high editorial intent) |
| Taxonomy | 1.0 | Shared tags/categories |
| Related | 0.75 | Algorithm-computed related posts |
| Topical | 0.5 | Section hierarchy (parent → child) |
| Sequential | 0.25 | Next/prev navigation |

Connectivity levels:

| Level | Score Range | Status |
|-------|-------------|--------|
| Well-connected | ≥ 2.0 | No action needed |
| Adequately linked | 1.0 - 2.0 | Could improve |
| Lightly linked | 0.25 - 1.0 | Should improve (only structural links) |
| Isolated | < 0.25 | Needs attention |
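
To make the weights and levels concrete, here is a small sketch of how a score could be computed and classified. It assumes that a page's score is the sum of weight times count over its inbound links and that each range includes its lower bound; the source states the weights and thresholds but not the exact aggregation, and the function names are hypothetical.

```python
DEFAULT_WEIGHTS = {
    "explicit": 1.0,
    "menu": 10.0,
    "taxonomy": 1.0,
    "related": 0.75,
    "topical": 0.5,
    "sequential": 0.25,
}


def connectivity_score(incoming, weights=DEFAULT_WEIGHTS):
    """Sum weight * count over link types; `incoming` maps link type -> count."""
    return sum(weights[link_type] * count for link_type, count in incoming.items())


def connectivity_level(score, well=2.0, adequate=1.0, light=0.25):
    """Map a score to one of the four levels (lower bounds inclusive, assumed)."""
    if score >= well:
        return "well-connected"
    if score >= adequate:
        return "adequately linked"
    if score >= light:
        return "lightly linked"
    return "isolated"


# A page reached only through its section and next/prev navigation
structural_only = connectivity_score({"topical": 1, "sequential": 1})
print(structural_only, connectivity_level(structural_only))
```

Under these assumptions a page with only structural links lands in the lightly linked band, which matches the "only structural links" note in the table, while a single menu entry is enough to make a page well-connected.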

Configuration:

[health_check]
# Connectivity thresholds
isolated_threshold = 5      # Max isolated pages before error
lightly_linked_threshold = 20  # Max lightly-linked before warning

# Customize weights (optional)
[health_check.link_weights]
explicit = 1.0
menu = 10.0
taxonomy = 1.0
related = 0.75
topical = 0.5
sequential = 0.25

CLI Commands:

# Full connectivity report
bengal graph report

# List isolated pages
bengal graph orphans --level isolated

# List lightly-linked pages
bengal graph orphans --level lightly

# CI mode with exit code
bengal graph report --ci --threshold-isolated 5

Configuration

Health checks use a tiered validation system for optimal performance:

Tier	Name	Time	Trigger	Validators
1	`build`	<100ms	Always	Config, URL Collisions, Rendering, Directives, Navigation, Menu, Taxonomy
2	`full`	~500ms	`--full` flag	+ Connectivity, Cache, Performance, Anchors
3	`ci`	~30s	`--ci` flag or CI env	+ External link checking (LinkValidatorWrapper)
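
The tiers are cumulative: each one adds validators to the tier before it. The sketch below shows one way to express that. The names `pick_tier` and `validators_for` are hypothetical, and reading the "CI env" trigger as a non-empty `CI` environment variable is an assumption; the source does not name the variable.

```python
import os

TIER_ORDER = ("build", "full", "ci")

# Validator labels copied from the tier table above.
TIER_VALIDATORS = {
    "build": ["Config", "URL Collisions", "Rendering", "Directives",
              "Navigation", "Menu", "Taxonomy"],
    "full": ["Connectivity", "Cache", "Performance", "Anchors"],
    "ci": ["External link checking"],
}


def pick_tier(full=False, ci=False, environ=None):
    """Choose a tier from flags and environment (CI variable is an assumption)."""
    env = os.environ if environ is None else environ
    if ci or env.get("CI"):
        return "ci"
    return "full" if full else "build"


def validators_for(tier):
    """Return the cumulative validator list for a tier."""
    if tier not in TIER_ORDER:
        raise ValueError(f"unknown tier: {tier!r}")
    selected = []
    for name in TIER_ORDER:
        selected.extend(TIER_VALIDATORS[name])
        if name == tier:
            break
    return selected
```

Keeping the slow external link check in the last tier is what lets the default `build` tier stay within its small time budget.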

Configuration via `bengal.toml`:

[health_check]
enabled = true
verbose = false
strict_mode = false

# Connectivity thresholds
isolated_threshold = 5        # Max isolated pages before error
lightly_linked_threshold = 20 # Max lightly-linked before warning

# Connectivity score thresholds
[health_check.connectivity_thresholds]
well_connected = 2.0    # Score >= 2.0
adequately_linked = 1.0 # Score 1.0-2.0
lightly_linked = 0.25   # Score 0.25-1.0
# Score < 0.25 = isolated

# Link type weights for scoring
[health_check.link_weights]
explicit = 1.0    # Human-authored markdown links
menu = 10.0       # Navigation menu items
taxonomy = 1.0    # Shared tags/categories
related = 0.75    # Algorithm-computed related posts
topical = 0.5     # Section hierarchy (parent → child)
sequential = 0.25 # Next/prev navigation

Per-profile validator filtering:

Validators run based on the active build profile:

Profile	Validators Enabled
`writer`	Config, Menu (fast feedback)
`theme-dev`	+ Rendering, Directives
`dev`	All validators (full observability)

Validators can be explicitly enabled/disabled in config regardless of profile.
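
The interaction between profiles and explicit overrides could look like the following sketch. The function name, the shape of `overrides` (validator label mapped to a boolean), and the short labels are assumptions for illustration; the source does not show the actual config keys for per-validator enablement.

```python
# Profile contents taken from the table above; "dev" means every validator.
PROFILE_VALIDATORS = {
    "writer": {"Config", "Menu"},
    "theme-dev": {"Config", "Menu", "Rendering", "Directives"},
}


def enabled_validators(profile, all_validators, overrides=None):
    """Start from the profile's set, then apply explicit on/off overrides."""
    if profile == "dev":
        enabled = set(all_validators)
    else:
        enabled = set(PROFILE_VALIDATORS[profile])
    for name, is_on in (overrides or {}).items():
        if is_on:
            enabled.add(name)
        else:
            enabled.discard(name)
    # Preserve registration order and drop names that are not registered.
    return [name for name in all_validators if name in enabled]
```

Applying overrides after the profile lookup is what makes them win "regardless of profile": a writer can opt in to an extra check, and a `dev` build can silence a noisy one.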

Integration

Health checks run automatically after builds in strict mode and can be triggered manually:

# Automatic validation in strict mode
site.config["strict_mode"] = True
stats = site.build()

# Manual validation
from bengal.health import HealthCheck
health = HealthCheck(site)
report = health.run(build_stats=stats)
