Module

orchestration.postprocess

Post-processing orchestration for Bengal SSG.

Handles post-build tasks like sitemap generation, RSS feeds, link validation, and special page generation. Runs after all pages are rendered and coordinates parallel post-processing tasks.

Key Concepts:

  • Sitemap generation: XML sitemap for search engines
  • RSS feeds: RSS/Atom feed generation for blog content
  • Link validation: Broken link detection and reporting
  • Special pages: 404, robots.txt, and other generated pages
  • Output formats: JSON, TXT, LLM-friendly output generation

Related Modules:

  • bengal.postprocess.sitemap: Sitemap generation
  • bengal.postprocess.rss: RSS feed generation
  • bengal.postprocess.output_formats: Output format generators
  • bengal.health.validators: Link validation

See Also:

  • bengal/orchestration/postprocess.py: PostprocessOrchestrator for orchestration logic

Classes

PostprocessOrchestrator

Orchestrates post-processing tasks after page rendering.

Handles sitemap generation, RSS feeds, link validation, special pages, and output format generation. Supports parallel execution for performance and incremental build optimization.

Creation:

Direct instantiation: PostprocessOrchestrator(site)
  • Created by BuildOrchestrator during build
  • Requires Site instance with rendered pages

Attributes

Name    Type    Description
site    Site    Site instance with rendered pages and configuration

Relationships
  • Uses: SitemapGenerator for sitemap generation
  • Uses: RSSGenerator for RSS feed generation
  • Uses: OutputFormatsGenerator for JSON/TXT/LLM output
  • Uses: SpecialPagesGenerator for 404 and other special pages
  • Used by: BuildOrchestrator for post-processing phase

Thread Safety:

Thread-safe for parallel task execution. Uses thread-safe locks for output operations.
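The source does not say which output operations the locks guard, so the following is only a minimal sketch of the general pattern, not Bengal's implementation. It assumes a shared report buffer that several tasks write to; the names `output_lock`, `report`, and `run_reporting_tasks` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared state: one lock guarding one output buffer.
output_lock = threading.Lock()
report_lines: list[str] = []


def report(task_name: str, message: str) -> None:
    """Append a two-line entry; the lock keeps both lines adjacent."""
    with output_lock:
        report_lines.append(f"{task_name}:")
        report_lines.append(f"  {message}")


def run_reporting_tasks(task_names: list[str], repeats: int = 50) -> None:
    """Run one worker thread per task name, each reporting `repeats` times."""

    def work(name: str) -> None:
        for i in range(repeats):
            report(name, f"step {i}")

    with ThreadPoolExecutor(max_workers=len(task_names)) as pool:
        # list() forces iteration so worker exceptions surface here.
        list(pool.map(work, task_names))
```

Without the lock, the header line of one task and the detail line of another could interleave; holding the lock across both appends keeps each entry intact.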

Methods 1

run
def run(self, parallel: bool = True, progress_manager: Any | None = None, build_context: BuildContext | Any | None = None, incremental: bool = False) -> None

Perform post-processing tasks (sitemap, RSS, output formats, link validation, etc.).

Parameters 4
parallel bool

Whether to run tasks in parallel

progress_manager Any | None

Live progress manager (optional)

build_context BuildContext | Any | None

incremental bool

Whether this is an incremental build (can skip some tasks)
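The `parallel` flag suggests a dispatch between the two internal runners documented further down. The following is a rough sketch of that idea under stated assumptions, not the orchestrator's actual code: tasks are the `(task_name, task_function)` tuples described for `_run_sequential` and `_run_parallel`, while the function names, the error-collection policy, and the single-task shortcut are hypothetical.

```python
from __future__ import annotations

from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor, as_completed

Task = tuple[str, Callable[[], None]]


def run_sequential(tasks: list[Task]) -> list[tuple[str, str]]:
    """Run tasks in order and return (task_name, error_message) pairs."""
    errors: list[tuple[str, str]] = []
    for name, func in tasks:
        try:
            func()
        except Exception as exc:
            # Assumption: one failing task does not abort the others.
            errors.append((name, str(exc)))
    return errors


def run_parallel(tasks: list[Task], max_workers: int | None = None) -> list[tuple[str, str]]:
    """Run tasks on a thread pool and return (task_name, error_message) pairs."""
    errors: list[tuple[str, str]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(func): name for name, func in tasks}
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as exc:
                errors.append((futures[future], str(exc)))
    return errors


def run_tasks(tasks: list[Task], parallel: bool = True) -> list[tuple[str, str]]:
    """Dispatch to the parallel or sequential runner."""
    # Assumption: a thread pool is not worth starting for zero or one task.
    if parallel and len(tasks) > 1:
        return run_parallel(tasks)
    return run_sequential(tasks)
```

Collecting errors per task, rather than raising on the first failure, lets a broken feed be reported without blocking the sitemap or special pages; whether Bengal behaves this way is not stated in the source.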

Internal Methods 9
__init__
def __init__(self, site: Site)

Initialize postprocess orchestrator.

Parameters 1
site Site

Site instance with rendered pages and configuration

_run_sequential
def _run_sequential(self, tasks: list[tuple[str, Callable[[], None]]], progress_manager: Any | None = None, reporter: Any | None = None) -> None

Run post-processing tasks sequentially.

Parameters 3
tasks list[tuple[str, Callable[[], None]]]

List of (task_name, task_function) tuples

progress_manager Any | None

Live progress manager (optional)

reporter Any | None

_run_parallel
def _run_parallel(self, tasks: list[tuple[str, Callable[[], None]]], progress_manager: Any | None = None, reporter: Any | None = None) -> None

Run post-processing tasks in parallel.

Parameters 3
tasks list[tuple[str, Callable[[], None]]]

List of (task_name, task_function) tuples

progress_manager Any | None

Live progress manager (optional)

reporter Any | None

_generate_special_pages
def _generate_special_pages(self, build_context: BuildContext | Any | None = None) -> None

Generate special pages like 404 (extracted for parallel execution).

Parameters 1
build_context BuildContext | Any | None

Optional BuildContext with cached knowledge graph

_generate_sitemap
def _generate_sitemap(self) -> None

Generate sitemap.xml (extracted for parallel execution).

_generate_rss
def _generate_rss(self) -> None

Generate RSS feed (extracted for parallel execution).

_generate_redirects
def _generate_redirects(self) -> None

Generate redirect pages for page aliases.

Creates lightweight HTML redirect pages at each alias URL that redirect to the canonical page location.
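The exact markup Bengal emits is not shown here. As an illustration only, a lightweight redirect page is commonly built from a meta refresh plus a canonical link. The helper names and the alias-to-path mapping below are assumptions, not part of the documented API.

```python
from html import escape
from pathlib import Path


def render_redirect_html(target_url: str) -> str:
    """Return a small HTML page that redirects to target_url."""
    safe = escape(target_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n'
        '<meta charset="utf-8">\n'
        f'<meta http-equiv="refresh" content="0; url={safe}">\n'
        f'<link rel="canonical" href="{safe}">\n'
        "<title>Redirecting</title>\n</head>\n"
        f'<body><p>This page has moved to <a href="{safe}">{safe}</a>.</p></body>\n'
        "</html>\n"
    )


def write_redirect(output_dir: Path, alias: str, target_url: str) -> Path:
    """Write the redirect page for one alias and return the file path."""
    # Assumption: an alias such as "/old/post/" maps to
    # <output_dir>/old/post/index.html.
    path = output_dir / alias.strip("/") / "index.html"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_redirect_html(target_url), encoding="utf-8")
    return path
```

The canonical link tells search engines which URL is authoritative, and the visible anchor gives readers a fallback if the browser ignores the refresh.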

_build_graph_data
def _build_graph_data(self, build_context: BuildContext | Any | None = None) -> dict[str, Any] | None

Build knowledge graph and return graph data for inclusion in page JSON.

Uses build_context.knowledge_graph if available to avoid rebuilding the graph multiple times per build.

Parameters 1
build_context BuildContext | Any | None

Optional BuildContext with cached knowledge graph

Returns

dict[str, Any] | None

Graph data dictionary or None if graph building fails or is disabled
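The reuse-or-rebuild behaviour described above could look roughly like the sketch below. Only the `knowledge_graph` attribute on the build context comes from the documentation; the function name, the `build_graph` callback, the `enabled` flag, and the dictionary shape with `nodes` and `edges` keys are all hypothetical.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any


def graph_data_for_build(
    build_context: Any | None,
    build_graph: Callable[[], Any],
    enabled: bool = True,
) -> dict[str, Any] | None:
    """Return graph data, reusing a cached graph when the context has one."""
    if not enabled:
        return None
    try:
        # Prefer the graph cached on the build context, if any.
        graph = getattr(build_context, "knowledge_graph", None)
        if graph is None:
            graph = build_graph()  # fall back to building it once here
        # Assumption: the graph exposes "nodes" and "edges" collections.
        return {"nodes": list(graph["nodes"]), "edges": list(graph["edges"])}
    except Exception:
        # Mirrors the documented contract: None when graph building fails.
        return None
```

Passing the builder as a callback keeps the expensive construction lazy, so it runs only when no cached graph is available.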

_generate_output_formats
def _generate_output_formats(self, graph_data: dict[str, Any] | None = None, build_context: BuildContext | Any | None = None) -> None

Generate custom output formats like JSON, plain text (extracted for parallel execution).

Parameters 2
graph_data dict[str, Any] | None

Optional pre-computed graph data to include in page JSON

build_context BuildContext | Any | None

Optional BuildContext with accumulated JSON data from rendering phase