Scale your documentation with schemas, validation, remote sources, and CI/CD pipelines. Perfect for teams that need content governance.

Tip

Duration: ~75 min | Prerequisite: Experience managing documentation

1

Content Collections

Validate frontmatter with typed schemas

Content Collections

Define typed schemas for your content to ensure consistency and catch errors early.

Do I Need This?

No. Collections are optional. Your site works fine without them.

Use collections when:

  • You want typos caught at build time, not in production
  • Multiple people edit content and need guardrails
  • You want consistent frontmatter across content types

Quick Setup

bengal collections init

This createscollections.pyat your project root. Edit it to uncomment what you need:

1
2
3
4
5
6
from bengal.collections import define_collection, BlogPost, DocPage

collections = {
    "blog": define_collection(schema=BlogPost, directory="blog"),
    "docs": define_collection(schema=DocPage, directory="docs"),
}

Done. Build as normal—validation happens automatically.

Built-in Schemas

Bengal provides schemas for common content types:

Schema Required Fields Optional Fields
BlogPost title, date author, tags, draft, description, image
DocPage title weight, category, tags, toc, deprecated
APIReference title, endpoint method, version, auth_required, rate_limit
Tutorial title difficulty, duration, prerequisites, series
Changelog title, date version, breaking, summary

Import any of these:

from bengal.collections import BlogPost, DocPage, APIReference

Custom Schemas

Define your own using Python dataclasses:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class ProjectPage:
    title: str
    status: str  # "active", "completed", "archived"
    started: datetime
    tech_stack: list[str] = field(default_factory=list)
    github_url: Optional[str] = None

collections = {
    "projects": define_collection(
        schema=ProjectPage,
        directory="projects",
    ),
}

Validation Modes

By default, validation warns but doesn't fail builds:

⚠ content/blog/my-post.md
  └─ date: Required field 'date' is missing

Strict Mode

To fail builds on validation errors, add tobengal.toml:

1
2
[build]
strict_collections = true

Lenient Mode (Extra Fields)

To allow fields not in your schema:

1
2
3
4
5
define_collection(
    schema=BlogPost,
    directory="blog",
    strict=False,  # Allow extra frontmatter fields
)

CLI Commands

1
2
3
4
5
6
7
8
# List defined collections and their schemas
bengal collections list

# Validate content without building
bengal collections validate

# Validate specific collection
bengal collections validate --collection blog

Migration Tips

Existing site with inconsistent frontmatter?

  1. Start withstrict=Falseto allow extra fields
  2. Runbengal collections validateto find issues
  3. Fix content or adjust schema
  4. Switch tostrict=Truewhen ready

Transform legacy field names:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
def migrate_legacy(data: dict) -> dict:
    if "post_title" in data:
        data["title"] = data.pop("post_title")
    return data

collections = {
    "blog": define_collection(
        schema=BlogPost,
        directory="blog",
        transform=migrate_legacy,
    ),
}

Remote Content

Collections work with remote content too. Use a loader instead of a directory:

1
2
3
4
5
6
7
8
9
from bengal.collections import define_collection, DocPage
from bengal.content_layer import github_loader

collections = {
    "api-docs": define_collection(
        schema=DocPage,
        loader=github_loader(repo="myorg/api-docs", path="docs/"),
    ),
}

See Content Sources for GitHub, Notion, REST API loaders.

Seealso

2

Content Sources

Fetch content from external sources

Remote Content Sources

Fetch content from GitHub, Notion, REST APIs, and more.

Do I Need This?

No. By default, Bengal reads content from local files. That works for most sites.

Use remote sources when:

  • Your docs live in multiple GitHub repos
  • Content lives in a CMS (Notion, Contentful, etc.)
  • You want to pull API docs from a separate service
  • You need to aggregate content from different teams

Quick Start

Install the loader you need:

1
2
3
4
pip install bengal[github]   # GitHub repositories
pip install bengal[notion]   # Notion databases
pip install bengal[rest]     # REST APIs
pip install bengal[all-sources]  # Everything

Update yourcollections.py:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
from bengal.collections import define_collection, DocPage
from bengal.content_layer import github_loader

collections = {
    # Local content (default)
    "docs": define_collection(
        schema=DocPage,
        directory="content/docs",
    ),

    # Remote content from GitHub
    "api-docs": define_collection(
        schema=DocPage,
        loader=github_loader(
            repo="myorg/api-docs",
            path="docs/",
        ),
    ),
}

Build as normal. Remote content is fetched, cached, and validated like local content.

Available Loaders

GitHub

Fetch markdown from any GitHub repository:

1
2
3
4
5
6
7
8
from bengal.content_layer import github_loader

loader = github_loader(
    repo="owner/repo",       # Required: "owner/repo" format
    branch="main",           # Default: "main"
    path="docs/",            # Default: "" (root)
    token=None,              # Default: uses GITHUB_TOKEN env var
)

For private repos, setGITHUB_TOKENenvironment variable or passtokendirectly.

Notion

Fetch pages from a Notion database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
from bengal.content_layer import notion_loader

loader = notion_loader(
    database_id="abc123...",  # Required: database ID from URL
    token=None,               # Default: uses NOTION_TOKEN env var
    property_mapping={        # Map Notion properties to frontmatter
        "title": "Name",
        "date": "Published",
        "tags": "Tags",
    },
)

Setup:

  1. Create integration at notion.so/my-integrations
  2. Share your database with the integration
  3. SetNOTION_TOKENenvironment variable

REST API

Fetch from any JSON API:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
from bengal.content_layer import rest_loader

loader = rest_loader(
    url="https://api.example.com/posts",
    headers={"Authorization": "Bearer ${API_TOKEN}"},  # Env vars expanded
    content_field="body",           # JSON path to content
    id_field="id",                  # JSON path to ID
    frontmatter_fields={            # Map API fields to frontmatter
        "title": "title",
        "date": "published_at",
        "tags": "categories",
    },
)

Local (Explicit)

For consistency, you can also use an explicit local loader:

1
2
3
4
5
6
7
from bengal.content_layer import local_loader

loader = local_loader(
    directory="content/docs",
    glob="**/*.md",
    exclude=["_drafts/*"],
)

Caching

Remote content is cached locally to avoid repeated API calls:

1
2
3
4
5
6
7
8
# Check cache status
bengal sources status

# Force refresh from remote
bengal sources fetch --force

# Clear all cached content
bengal sources clear

Cache behavior:

  • Default TTL: 1 hour
  • Cache directory:.bengal/content_cache/
  • Automatic invalidation when config changes
  • Falls back to cache if remote unavailable

CLI Commands

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
# List configured content sources
bengal sources list

# Show cache status (age, size, validity)
bengal sources status

# Fetch/refresh from remote sources
bengal sources fetch
bengal sources fetch --source api-docs  # Specific source
bengal sources fetch --force            # Ignore cache

# Clear cached content
bengal sources clear
bengal sources clear --source api-docs

Environment Variables

Variable Used By Description
GITHUB_TOKEN GitHub loader Personal access token for private repos
NOTION_TOKEN Notion loader Integration token
Custom REST loader Any${VAR}in headers is expanded

Multi-Repo Documentation

A common pattern for large organizations:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
from bengal.collections import define_collection, DocPage
from bengal.content_layer import github_loader, local_loader

collections = {
    # Main docs (local)
    "docs": define_collection(
        schema=DocPage,
        directory="content/docs",
    ),

    # API reference (from API team's repo)
    "api": define_collection(
        schema=DocPage,
        loader=github_loader(repo="myorg/api-service", path="docs/"),
    ),

    # SDK docs (from SDK repo)
    "sdk": define_collection(
        schema=DocPage,
        loader=github_loader(repo="myorg/sdk", path="docs/"),
    ),
}

Custom Loaders

ImplementContentSourcefor any content origin:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
from bengal.content_layer import ContentSource, ContentEntry

class MyCustomSource(ContentSource):
    source_type = "my-api"

    async def fetch_all(self):
        for item in await self._get_items():
            yield ContentEntry(
                id=item["id"],
                slug=item["slug"],
                content=item["body"],
                frontmatter={"title": item["title"]},
                source_type=self.source_type,
                source_name=self.name,
            )

    async def fetch_one(self, id: str):
        item = await self._get_item(id)
        if not item:
            return None
        return ContentEntry(...)

Zero-Cost Design

If you don't use remote sources:

  • No extra dependencies installed
  • No network calls
  • No import overhead
  • No configuration needed

Remote loaders are lazy-loaded only when you import them.

Other Content Sources

  • Autodoc — Generate API docs from Python, CLI commands, and OpenAPI specs

Seealso

Section 3: Page Not Found

Could not find page:docs/extending/validation/_index

4

Configuration

Configuring Bengal with bengal.toml

Configuration

Control Bengal's behavior throughbengal.tomland environment-specific settings.

Configuration Methods

flowchart TB subgraph "Base Configuration (Mutually Exclusive)" A[bengal.toml] B[config/ directory] end C[Environment Overrides] D[CLI Flags] E[Final Config] A -.->|OR| E B -.->|OR| E C --> E D --> E

Bengal loads configuration from either theconfig/directory (preferred) ORbengal.toml(legacy/simple). Ifconfig/exists,bengal.tomlis ignored.

Overrides apply in order: Base Config → Environment Overrides → CLI Flags.

Quick Start

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
# bengal.toml
[site]
title = "My Site"
base_url = "https://example.com"
language = "en"

[build]
output_dir = "public"
clean = true

[theme]
name = "default"

Configuration Patterns

Best for small sites:

1
2
3
4
5
6
7
8
9
# bengal.toml - everything in one place
[site]
title = "My Blog"

[build]
output_dir = "public"

[theme]
name = "default"

Best for larger sites:

config/
├── _default/
│   ├── site.yaml
│   ├── build.yaml
│   └── theme.yaml
└── environments/
    ├── production.yaml
    └── staging.yaml

Environment Overrides

Run with different settings per environment:

bengal build --environment production
1
2
3
4
5
6
7
# config/environments/production.yaml
site:
  base_url: "https://example.com"

build:
  minify: true
  fingerprint: true

Tip

Best practice: Keep development settings inbengal.toml, add production overrides inconfig/environments/production.yaml.

Build Options Reference

Key[build]configuration options:

Option Type Default Description
output_dir string "public" Directory for generated files
clean bool false Remove output directory before build
minify bool false Minify HTML/CSS/JS output
fingerprint bool false Add content hash to asset URLs
validate_templates bool false Proactive template syntax validation
validate_build bool true Post-build validation checks
validate_links bool true Check for broken internal links
strict_mode bool false Fail build on any error or warning

Template Validation

Enablevalidate_templatesto catch template syntax errors early during builds:

1
2
[build]
validate_templates = true

When enabled, Bengal validates all templates (HTML/XML) in your template directories before rendering. This provides early feedback on syntax errors, even for templates that might not be used by every page.

Enable template validation during development for immediate feedback:

1
2
[build]
validate_templates = true

Combine with strict mode in CI pipelines to fail builds on template errors:

1
2
3
[build]
validate_templates = true
strict_mode = true

When to enable:

  • During active theme development
  • In CI/CD pipelines
  • When debugging template issues

What it catches:

  • Jinja2 syntax errors (unclosed tags, invalid filters)
  • Unknown filter names
  • Template assertion errors

Note

Template validation adds a small overhead to build time. For large sites, consider enabling it only in development and CI environments.

5

Deployment

Deploy your Bengal site to production

Deploy Your Site

Bengal generates static HTML, CSS, and JavaScript files. This means you can host your site anywhere that serves static files (e.g., GitHub Pages, Netlify, Vercel, AWS S3, Nginx).

The Production Build

When you are ready to ship, run the build command:

bengal build --environment production

This command:

  • Loads configuration fromconfig/environments/production.yaml(if it exists)
  • Minifies assets (if enabled)
  • Generates thepublic/directory with your complete site

Common Build Flags

Flag Description Use Case
--environment production Loads production config overrides. Always use for shipping.
--strict Fails the build on warnings (e.g., broken links). Highly Recommended for CI/CD.
--clean-output Cleans thepublic/directory before building. Recommended to avoid stale files.
--verbose Shows detailed logs. Useful for debugging CI failures.

Example full command for CI:

bengal build --environment production --strict --clean-output

GitHub Pages

Deploy using GitHub Actions. Create.github/workflows/deploy.yml:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
name: Deploy to GitHub Pages

on:
  push:
    branches: [main]

permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.14'

      - name: Install Bengal
        run: pip install bengal

      - name: Build Site
        run: bengal build --environment production --strict

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: './public'

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4

Netlify

Create anetlify.tomlin your repository root:

1
2
3
4
5
6
[build]
  publish = "public"
  command = "bengal build --environment production"

[build.environment]
  PYTHON_VERSION = "3.14"

Vercel

Configure your project:

  1. Build Command:bengal build --environment production
  2. Output Directory:public
  3. Ensure yourrequirements.txtincludesbengal.

Environment Variables

Bengal allows you to inject environment variables into your configuration using{{ env.VAR_NAME }}syntax in your YAML/TOML config files.

config/environments/production.yaml:

1
2
3
params:
  api_key: "{{ env.API_KEY }}"
  analytics_id: "{{ env.ANALYTICS_ID }}"

Then setAPI_KEYandANALYTICS_IDin your hosting provider's dashboard.

Pre-Deployment Checklist

Before you merge to main or deploy:

  1. Runbengal config doctor: Checks for common configuration issues.
  2. Runbengal build --strictlocally: Ensures no broken links or missing templates.
  3. Checkconfig/environments/production.yaml: Ensure yourbaseurlis set to your production domain.
1
2
3
# config/environments/production.yaml
site:
  baseurl: "https://example.com"

Seealso

6

Automate with GitHub Actions

Set up automated builds, testing, and deployments using GitHub Actions

Automate with GitHub Actions

Set up continuous integration and deployment (CI/CD) for your Bengal site. Automate builds, run tests, and deploy to production with GitHub Actions.

When to Use This Guide

  • You want automated builds on every commit
  • You need to run tests before deployment
  • You want to deploy to production automatically
  • You're setting up preview deployments for pull requests
  • You need to validate content and links before publishing

Prerequisites

  • Bengal installed
  • A Git repository on GitHub
  • A hosting provider account (GitHub Pages, Netlify, Vercel, etc.)
  • Basic knowledge of YAML

Steps

  1. 1

    Basic Build Workflow

    Create.github/workflows/build.yml:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    name: Build Site
    
    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]
    
    jobs:
      build:
        runs-on: ubuntu-latest
    
        steps:
          - name: Checkout code
            uses: actions/checkout@v4
    
          - name: Set up Python
            uses: actions/setup-python@v5
            with:
              python-version: '3.14'
    
          - name: Install Bengal
            run: pip install bengal
    
          - name: Build site
            run: bengal build --environment production --strict
    
          - name: Upload artifacts
            uses: actions/upload-artifact@v4
            with:
              name: site
              path: public/
              retention-days: 1
    
  2. 2

    Deploy to GitHub Pages

    Create.github/workflows/deploy.yml:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    name: Deploy to Production
    
    on:
      push:
        branches: [main]
    
    permissions:
      contents: read
      pages: write
      id-token: write
    
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout code
            uses: actions/checkout@v4
    
          - name: Set up Python
            uses: actions/setup-python@v5
            with:
              python-version: '3.14'
    
          - name: Install Bengal
            run: pip install bengal
    
          - name: Build site
            run: bengal build --environment production --strict --clean-output
    
          - name: Upload artifact
            uses: actions/upload-pages-artifact@v4
            with:
              path: './public'
    
      deploy:
        environment:
          name: github-pages
          url: ${{ steps.deployment.outputs.page_url }}
        runs-on: ubuntu-latest
        needs: build
        steps:
          - name: Deploy to GitHub Pages
            id: deployment
            uses: actions/deploy-pages@v4
    
  3. 3

    Preview Deployments

    Create.github/workflows/preview.yml:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    name: Preview Deployment
    
    on:
      pull_request:
        branches: [main]
    
    jobs:
      preview:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout code
            uses: actions/checkout@v4
    
          - name: Set up Python
            uses: actions/setup-python@v5
            with:
              python-version: '3.14'
    
          - name: Install Bengal
            run: pip install bengal
    
          - name: Build site
            run: bengal build --environment preview --build-drafts
    
          - name: Comment PR with preview
            uses: actions/github-script@v7
            with:
              script: |
                github.rest.issues.createComment({
                  issue_number: context.issue.number,
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  body: '✅ Preview build successful! Artifacts available in workflow run.'
                })
    
  4. 4

    Add Validation and Testing

    Add health checks to your CI pipeline:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    # .github/workflows/test.yml
    name: Test and Validate
    
    on: [push, pull_request]
    
    jobs:
      validate:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
    
          - name: Set up Python
            uses: actions/setup-python@v5
            with:
              python-version: '3.14'
    
          - name: Install Bengal
            run: pip install bengal
    
          - name: Validate configuration
            run: bengal config doctor
    
          - name: Check for broken links
            run: bengal health linkcheck
    
          - name: Build with strict mode
            run: bengal build --strict --verbose
    
  5. 5

    Caching for Faster Builds

    Add caching to speed up workflows:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    - name: Cache pip packages
      uses: actions/cache@v4
      with:
        path: ~/.cache/pip
        key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
        restore-keys: |
          ${{ runner.os }}-pip-
    
    - name: Cache Bengal build cache
      uses: actions/cache@v4
      with:
        path: .bengal-cache
        key: ${{ runner.os }}-bengal-${{ github.sha }}
        restore-keys: |
          ${{ runner.os }}-bengal-
    
  6. 6

    Environment-Specific Builds

    Create Environment Configs

    config/environments/production.yaml:

    1
    2
    3
    4
    5
    site:
      baseurl: "https://example.com"
    
    params:
      analytics_id: "{{ env.GA_ID }}"
    

    config/environments/preview.yaml:

    1
    2
    3
    4
    5
    site:
      baseurl: "https://preview.example.com"
    
    params:
      analytics_id: ""  # Disable analytics in preview
    

    Use Environment Variables

    1
    2
    3
    env:
      GA_ID: ${{ secrets.GA_ID }}
      API_KEY: ${{ secrets.API_KEY }}
    

Alternative Platforms

GitLab CI

Create.gitlab-ci.yml:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
image: python:3.14

stages:
  - build
  - deploy

build:
  stage: build
  script:
    - pip install bengal
    - bengal build --environment production --strict
  artifacts:
    paths:
      - public/

pages:
  stage: deploy
  script:
    - pip install bengal
    - bengal build --environment production --strict
  artifacts:
    paths:
      - public
  only:
    - main

Netlify

Createnetlify.toml:

1
2
3
4
5
6
[build]
  publish = "public"
  command = "pip install bengal && bengal build --environment production --strict"

[build.environment]
  PYTHON_VERSION = "3.14"

Vercel

Createvercel.json:

1
2
3
4
5
{
  "buildCommand": "pip install bengal && bengal build --environment production",
  "outputDirectory": "public",
  "installCommand": "pip install bengal"
}

Troubleshooting

Next Steps

← Previous: Deployment
✓ Track Complete