Post-Processing

Post-processing runs after all pages are rendered and performs site-wide operations like sitemap generation, RSS feeds, and link validation.

Sitemap Generator (`bengal/postprocess/sitemap.py`)

Purpose

Generates XML sitemap for SEO

Features

Generates XML sitemap for SEO
Includes all pages with metadata (respects `visibility.sitemap`)
Version-aware priority (latest: 0.8, older: 0.3, default: 0.5)
i18n support with `hreflang` alternate links
Validates URL structure
Follows sitemap.xml protocol

Configuration

# Enable/disable sitemap generation (default: true)
generate_sitemap = true

Sitemap behavior is automatic:

Change frequency: Always `weekly`
Priority: Computed from version status (latest versions get higher priority)
Exclusions: Pages with `hidden: true` or `visibility.sitemap: false` are excluded
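
The priority and exclusion rules can be sketched as two small functions. This is a minimal illustration, not Bengal's actual code: the names `sitemap_priority` and `include_in_sitemap`, the status strings `"latest"` and `"older"`, and the dictionary shape of the page metadata are all assumptions made for the example. Only the numeric priorities and the two exclusion keys come from the text above.

```python
from typing import Any, Mapping, Optional

# Priority values as listed in the feature list.
PRIORITY_LATEST = 0.8
PRIORITY_OLDER = 0.3
PRIORITY_DEFAULT = 0.5


def sitemap_priority(version_status: Optional[str]) -> float:
    """Map a (hypothetical) version status string to a sitemap priority."""
    if version_status == "latest":
        return PRIORITY_LATEST
    if version_status == "older":
        return PRIORITY_OLDER
    return PRIORITY_DEFAULT  # unversioned pages


def include_in_sitemap(meta: Mapping[str, Any]) -> bool:
    """Exclude pages with hidden: true or visibility.sitemap: false."""
    if meta.get("hidden") is True:
        return False
    visibility = meta.get("visibility") or {}
    return visibility.get("sitemap", True) is not False
```

Pages that set neither key are included, which matches the "automatic" behavior described above.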

Output Format

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-10-19</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/post/</loc>
    <lastmod>2025-10-18</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
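
A document of this shape can be produced with the standard library alone. The sketch below is illustrative and makes no claim about how `sitemap.py` is written; the function name `build_sitemap` and the `(loc, lastmod, priority)` tuple layout are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a <urlset> document from (loc, lastmod, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = "weekly"  # fixed, per the docs
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    # Prepend the declaration by hand so its exact form is predictable.
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


xml_text = build_sitemap([("https://example.com/blog/post/", "2025-10-18", 0.8)])
```

`ElementTree` escapes special characters in URLs (such as `&`) automatically, which is the main reason to prefer it over string formatting.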

RSS Generator (`bengal/postprocess/rss.py`)

Purpose

Generates RSS feed for blog posts

Features

Generates RSS 2.0 feed
Includes 20 most recent posts with dates
Uses page description or excerpt (first 200 chars)
Sorted by date (newest first)
Respects `visibility.rss` and draft status
i18n support (per-locale feeds when enabled)

Configuration

# Enable/disable RSS generation (default: true)
generate_rss = true

RSS behavior is automatic:

Item limit: 20 most recent pages with dates
Content: Page description from frontmatter, or excerpt from content
Exclusions: Drafts and pages with `visibility.rss: false` are excluded
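
The selection rules above amount to a filter, a sort, and a slice. The following sketch assumes pages are plain dictionaries with `date`, `draft`, `visibility`, `description`, and `content` keys; those field names and the helper names are hypothetical and chosen only to mirror the frontmatter keys mentioned in this section.

```python
from datetime import date
from typing import Any, Mapping, Sequence

RSS_ITEM_LIMIT = 20   # "20 most recent pages with dates"
EXCERPT_CHARS = 200   # "first 200 chars"


def select_rss_items(pages: Sequence[Mapping[str, Any]]) -> list:
    """Pick the newest dated, non-draft pages that have not opted out of RSS."""
    eligible = [
        p for p in pages
        if p.get("date") is not None
        and not p.get("draft", False)
        and (p.get("visibility") or {}).get("rss", True) is not False
    ]
    eligible.sort(key=lambda p: p["date"], reverse=True)  # newest first
    return eligible[:RSS_ITEM_LIMIT]


def item_description(page: Mapping[str, Any]) -> str:
    """Prefer the frontmatter description; otherwise use a content excerpt."""
    description = page.get("description")
    if description:
        return description
    return page.get("content", "")[:EXCERPT_CHARS]
```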

Output Format

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>My Site</title>
    <link>https://example.com/</link>
    <description>A Bengal SSG site</description>
    <language>en</language>
    <lastBuildDate>Sun, 19 Oct 2025 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://example.com/rss.xml" rel="self" type="application/rss+xml"/>

    <item>
      <title>My Blog Post</title>
      <link>https://example.com/blog/my-post/</link>
      <guid>https://example.com/blog/my-post/</guid>
      <pubDate>Sat, 18 Oct 2025 00:00:00 +0000</pubDate>
      <description>Post description or excerpt</description>
    </item>
  </channel>
</rss>
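
RSS 2.0 dates use the RFC 822 format, including a weekday name that must agree with the date. Python's `email.utils.format_datetime` derives the weekday from the date itself, which avoids hand-written mismatches. The helper name `rss_pub_date` and the choice to treat naive datetimes as UTC are assumptions of this sketch, not documented Bengal behavior.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rss_pub_date(dt: datetime) -> str:
    """Format a datetime for <pubDate> or <lastBuildDate>."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return format_datetime(dt)


print(rss_pub_date(datetime(2025, 10, 18)))
# Sat, 18 Oct 2025 00:00:00 +0000
```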

Link Validator (`bengal/health/validators/links.py`)

Purpose

Validates internal and external links

Features

Validates internal links resolve to existing pages
Handles relative paths and fragments
Supports trailing slash variations
Caches validation results for performance
Integrates with health check system
External link checking handled separately via the `bengal health linkcheck` command

Configuration

# Enable/disable internal link validation during build (default: true)
validate_links = true

Validation Process

Internal links (validated during build):

Extract links from rendered pages
Resolve relative paths against page URL
Check if target page exists in site
Report broken links with source context
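
The resolve-and-check steps can be sketched with `urllib.parse`. This is a simplified model under stated assumptions: the site is represented as a set of URL paths, and `normalize` and `is_broken` are hypothetical helper names. The real validator may track more state, such as source file context and cached results.

```python
from urllib.parse import urldefrag, urljoin


def normalize(path: str) -> str:
    """Treat /docs/page and /docs/page/ as the same target."""
    return path.rstrip("/") or "/"


def is_broken(page_url: str, href: str, known_urls: set) -> bool:
    """Resolve href against the page URL and look it up among known pages."""
    # urljoin handles relative paths; urldefrag drops any #fragment.
    target, _fragment = urldefrag(urljoin(page_url, href))
    return normalize(target) not in {normalize(u) for u in known_urls}


known = {"/", "/docs/", "/blog/post/"}
print(is_broken("/blog/post/", "../../docs/#intro", known))    # False
print(is_broken("/blog/post/", "/docs/missing-page/", known))  # True
```

Fragments are stripped before lookup, so this sketch only checks that the target page exists, not that the anchor exists within it.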

External links (separate command):

bengal health linkcheck --external

Automatically skipped:

External URLs (http://, https://)
Special protocols (mailto:, tel:, data:)
Template syntax ({{, ${)
Source file references (`.py` files from autodoc)
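
The skip list above can be expressed as a single predicate. The function name `should_skip` and the exact matching rules (case-insensitive prefix match, substring match for template markers, `.py` suffix on the path part) are assumptions for illustration; the categories themselves come from the list.

```python
SKIPPED_PREFIXES = ("http://", "https://", "mailto:", "tel:", "data:")
TEMPLATE_MARKERS = ("{{", "${")


def should_skip(href: str) -> bool:
    """Return True for links the build-time validator does not check."""
    href = href.strip()
    if href.lower().startswith(SKIPPED_PREFIXES):
        return True  # external URLs and special protocols
    if any(marker in href for marker in TEMPLATE_MARKERS):
        return True  # unrendered template syntax
    # Source file references, ignoring any query string or fragment.
    path = href.split("#", 1)[0].split("?", 1)[0]
    return path.endswith(".py")
```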

Output Example

Build-time validation reports internal broken links:

Found 2 broken internal links:
  content/blog/post.md: /docs/missing-page/
  content/index.md: /guide/old-section/

External link checking (via `bengal health linkcheck`):

Checking 156 external links...
✓ 153 links valid
✗ 3 broken:
  - https://example.com/404 (in /links.html)
  - https://old-api.example.com (in /docs/api.html)

Special Page Generation

404 Page

Generated automatically if `templates/404.html` exists

Search Index

JSON index for client-side search (if enabled)

Archive Pages

Chronological page listings by year/month

Parallel Post-Processing

Post-processing tasks run in parallel when multiple tasks are enabled:

# Actual implementation in bengal/orchestration/postprocess.py
from concurrent.futures import ThreadPoolExecutor, as_completed

# tasks is a list of (name, task_fn) pairs
with ThreadPoolExecutor(max_workers=len(tasks)) as executor:
    futures = {executor.submit(task_fn): name for name, task_fn in tasks}
    for future in as_completed(futures):
        future.result()  # Raises on error

Tasks that run in parallel:

Sitemap generation
RSS feed generation
Output formats (JSON, TXT, LLM)
Special pages (404, search)
Redirect pages
Social cards (if enabled)

Incremental builds: Expensive tasks (sitemap, RSS, social cards) are skipped for faster dev server response. Output formats always regenerate to keep the search index current.
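
A minimal sketch of how task selection and parallel execution could fit together is shown below. The task names, `select_tasks`, and `run_tasks` are hypothetical. Unlike a bare `future.result()`, which re-raises the first failure, this variant collects errors per task name so one failing task does not hide the others. It also guards against an empty task list, since `ThreadPoolExecutor` rejects `max_workers=0`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Tasks described above as skipped on incremental builds (names are assumed).
SKIPPED_WHEN_INCREMENTAL = {"sitemap", "rss", "social_cards"}


def select_tasks(tasks, incremental):
    """Drop expensive tasks on incremental builds; keep everything else."""
    if not incremental:
        return list(tasks)
    return [(name, fn) for name, fn in tasks if name not in SKIPPED_WHEN_INCREMENTAL]


def run_tasks(tasks):
    """Run (name, callable) pairs in parallel; return {name: exception}."""
    errors = {}
    if not tasks:
        return errors  # nothing to run; avoids max_workers=0
    with ThreadPoolExecutor(max_workers=len(tasks)) as executor:
        futures = {executor.submit(fn): name for name, fn in tasks}
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as exc:
                errors[futures[future]] = exc
    return errors
```

Threads suit this workload because each task mostly writes files; the tasks share no mutable state in this sketch, so no locking is needed.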
