Large Site Optimization

Bengal is designed for sites with thousands of pages. This guide covers strategies for sites beyond 5,000 pages.

Quick Start

For sites with 5K+ pages:

BASH

# Memory-optimized build
bengal build --memory-optimized --fast

# Full incremental + parallel + fast
bengal build --incremental --fast

Strategy Overview

Site Size	Recommended Strategy	Build Time
<500 pages	Default (no changes needed)	1-3s
500-5K pages	Default (parallel + incremental enabled)	3-15s
5K-20K pages	`--memory-optimized`	15-60s
20K+ pages	Full optimization stack	1-5min

1. Memory-Optimized Builds (Streaming Mode)

Warning

Experimental. `--memory-optimized` is an experimental streaming mode. Its memory benefit is not yet verified by Bengal's benchmarks, and on the workloads measured so far it has not reduced peak memory below a standard build. Treat it as a tuning knob to try and measure on your own site, not a guaranteed win. Profile your build (see Build Profiling) before and after to confirm any improvement.

For sites with 5K+ pages, you can try streaming mode:

BASH

bengal build --memory-optimized

How It Works

Builds a knowledge graph to understand page connectivity
Renders hubs first (highly connected pages) and keeps them in memory
Streams leaves in batches and releases memory immediately

The intent is to keep memory closer to constant as page count grows by releasing leaf pages after they render. Whether this lowers peak memory on a given site depends on its content and link structure — measure to confirm.
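
The three steps above can be sketched in Python. This is a minimal illustration, not Bengal's implementation: the function names, the link-count threshold for deciding what counts as a hub, and the batch size are all assumptions made for the example.

```python
from collections import defaultdict


def split_hubs_and_leaves(links, hub_threshold=3):
    """Classify pages by how many links touch them (hypothetical criterion)."""
    degree = defaultdict(int)
    for source, target in links:
        degree[source] += 1
        degree[target] += 1
    hubs = sorted(p for p, d in degree.items() if d >= hub_threshold)
    leaves = sorted(p for p, d in degree.items() if d < hub_threshold)
    return hubs, leaves


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def streaming_build(links, render, write, batch_size=2, hub_threshold=3):
    hubs, leaves = split_hubs_and_leaves(links, hub_threshold)

    # Hubs are rendered first and stay in memory for the whole build.
    hub_cache = {page: render(page) for page in hubs}
    for page, html in hub_cache.items():
        write(page, html)

    # Leaves are rendered one batch at a time and written out immediately.
    for batch in batched(leaves, batch_size):
        rendered = {page: render(page) for page in batch}
        for page, html in rendered.items():
            write(page, html)
        del rendered  # drop references so the batch can be garbage-collected

    return hub_cache


links = [("index", "a"), ("index", "b"), ("index", "c"), ("a", "b")]
written = []
cache = streaming_build(
    links,
    render=lambda page: f"<h1>{page}</h1>",
    write=lambda page, html: written.append(page),
)
```

In this sketch only `index` is treated as a hub; `a`, `b` and `c` are streamed in batches of two. Whether this pattern lowers peak memory in practice depends on how large the retained hub set is relative to the leaves.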

When to Try It

Very large sites (5K+ pages) where a standard build runs out of memory
CI runners or containers with strict memory limits

In each case, compare peak memory with and without the flag before relying on it.
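
One way to make that comparison is to run each build as a child process and read the peak resident set size the operating system reports for it. The sketch below assumes a Unix-like system (`os.wait4` is not available on Windows) and Python 3.9+; `peak_rss` and `compare_peak_rss` are hypothetical helper names, not part of Bengal.

```python
import os
import subprocess
import sys


def peak_rss(cmd):
    """Run `cmd` and return the child's peak RSS as reported by the OS.

    Units are platform-dependent: kilobytes on Linux, bytes on macOS.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # wait4 returns resource usage for this specific child process.
    _, status, usage = os.wait4(proc.pid, 0)
    proc.returncode = os.waitstatus_to_exitcode(status)
    return usage.ru_maxrss


def compare_peak_rss(baseline_cmd, candidate_cmd):
    """Measure both commands and return their peaks plus the ratio."""
    baseline = peak_rss(baseline_cmd)
    candidate = peak_rss(candidate_cmd)
    return {
        "baseline": baseline,
        "candidate": candidate,
        "ratio": candidate / baseline,
    }
```

For example, `compare_peak_rss(["bengal", "build"], ["bengal", "build", "--memory-optimized"])` returns a ratio below 1.0 only if the streaming build peaked lower. Clear the cache between runs so both builds do the same amount of work.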

Warning

`--memory-optimized` and `--perf-profile` cannot be used together (the profiler doesn't work with batched rendering).

2. Query Indexes (O(1) Lookups)

Replace O(n) page filtering with O(1) index lookups in templates.

The Problem

KIDA

{# O(n) - scans ALL pages every time this template renders #}
{% let blog_posts = site.pages | where('section', 'blog') %}

On a 10K-page site, this filter runs 10,000 comparisons each time the template renders.

The Solution

KIDA

{# O(1) - instant hash lookup #}
{% let blog_posts = site.indexes.section.get('blog') | resolve_pages %}
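
The difference between the two approaches can be sketched in plain Python. This is an illustration of the general technique, assuming an index that maps a key to a list of page paths and a `resolve_pages` step that turns paths back into page objects; the data layout and function bodies here are hypothetical, not Bengal's internals.

```python
from collections import defaultdict


def build_section_index(pages):
    """Group page paths by section once, up front (O(n) a single time)."""
    index = defaultdict(list)
    for path, meta in pages.items():
        index[meta["section"]].append(path)
    return dict(index)


def resolve_pages(paths, pages):
    """Turn the paths stored in an index entry back into page records."""
    return [pages[path] for path in paths]


pages = {
    "blog/a.md": {"title": "A", "section": "blog"},
    "docs/b.md": {"title": "B", "section": "docs"},
    "blog/c.md": {"title": "C", "section": "blog"},
}

# O(n): every lookup walks all pages.
scanned = [meta for meta in pages.values() if meta["section"] == "blog"]

# O(1) lookup: one hash access, then work proportional to the matches only.
index = build_section_index(pages)
blog_posts = resolve_pages(index.get("blog", []), pages)
```

Both approaches return the same pages; the index pays the scan cost once instead of on every template render.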

Built-in Indexes

Index	Key Type	Example
`section`	Section name	`site.indexes.section.get('blog')`
`author`	Author name	`site.indexes.author.get('Jane')`
`category`	Category	`site.indexes.category.get('tutorial')`
`date_range`	Year or Year-Month	`site.indexes.date_range.get('2024')`
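
The `date_range` row accepts either a year or a year-month key. One plausible way to support both is to file each page under two keys. The sketch below is an assumption about how such an index could be built, not a description of Bengal's code; `build_date_range_index` is a hypothetical name.

```python
from collections import defaultdict
from datetime import date


def build_date_range_index(dated_pages):
    """Index each page under both its year and its year-month."""
    index = defaultdict(list)
    for path, published in dated_pages:
        index[f"{published.year}"].append(path)
        index[f"{published.year}-{published.month:02d}"].append(path)
    return dict(index)


dated_pages = [
    ("blog/new-year.md", date(2024, 1, 5)),
    ("blog/spring.md", date(2024, 3, 1)),
    ("blog/older.md", date(2023, 1, 9)),
]
date_index = build_date_range_index(dated_pages)
```

Here `date_index["2024"]` holds both 2024 posts, while `date_index["2024-01"]` holds only the January one.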

Usage Examples

Section-based listing:

KIDA

{% let blog_posts = site.indexes.section.get('blog') | resolve_pages %}
{% for post in blog_posts | sort_by('date', reverse=true) %}
  <h2>{{ post.title }}</h2>
{% end %}

Author archive:

KIDA

{% let author_posts = site.indexes.author.get('Jane Smith') | resolve_pages %}
<p>{{ author_posts | length }} posts by Jane</p>

Monthly archives:

KIDA

{% let jan_posts = site.indexes.date_range.get('2024-01') | resolve_pages %}
{% for post in jan_posts %}
  {{ post.title }}
{% end %}

Performance Impact

Pages	O(n) Filter	Query Index
1K	2ms	<0.1ms
10K	20ms	<0.1ms
100K	200ms	<0.1ms

3. Parallel Processing

Parallel processing is auto-detected based on page count and workload. Adjust worker count if needed:

TOML

# bengal.toml
[build]
max_workers = 8           # Optional: adjust based on CPU cores (auto-detected if omitted)
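
To illustrate what a worker cap like `max_workers` usually controls, here is a minimal Python sketch of a thread-pool render loop. It is not Bengal's implementation; `render_all` and its fallback to the CPU count are assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def render_all(pages, render, max_workers=None):
    """Render pages in a thread pool, preserving input order."""
    # Fall back to the CPU count when no explicit cap is given.
    workers = max_workers or os.cpu_count() or 1
    if workers == 1:
        # Sequential path: simpler stack traces, useful when debugging.
        return [render(page) for page in pages]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render, pages))


html = render_all(["a", "b", "c"], render=str.upper, max_workers=2)
```

Raising the cap beyond the number of cores rarely helps CPU-bound rendering, so measure before changing it.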

To force sequential processing (useful for debugging):

BASH

bengal build --no-parallel

Free-Threaded Python

Bengal automatically detects Python 3.14t+ (free-threaded):

BASH

# 1.5-2x faster rendering
# Install free-threaded Python:
pyenv install 3.14t
python3.14t -m pip install bengal

See Free-Threading for architecture details and code references.

When running on free-threaded Python:

ThreadPoolExecutor gets true parallelism (no GIL contention)
~1.78x faster rendering on multi-core machines
No code changes needed
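
To check whether the interpreter you are running is free-threaded, the standard library exposes two signals: the `Py_GIL_DISABLED` build flag and, on Python 3.13+, `sys._is_gil_enabled()`. The sketch below combines them; it shows one way to perform the check and is not necessarily how Bengal detects it.

```python
import sys
import sysconfig


def is_free_threaded():
    """Return True when running a free-threaded build with the GIL off."""
    # Set to 1 when CPython was compiled with free-threading support.
    built_without_gil = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Available on Python 3.13+; the GIL can be re-enabled at runtime.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if callable(gil_check) else True
    return built_without_gil and not gil_enabled


free_threaded = is_free_threaded()
```

On a standard build this returns `False`, and thread pools remain limited by the GIL for CPU-bound work.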

4. Incremental Builds

Incremental builds are automatic and need no configuration. The first build is full; subsequent builds rebuild only changed content. To force a full rebuild:

BASH

# Force full rebuild (skip cache)
bengal build --no-incremental

What Gets Cached

Content parsing — Markdown AST cached per file
Template rendering — Output cached by content hash
Asset hashing — Fingerprints cached
Query indexes — Updated incrementally
Autodoc AST parsing — Python modules cached to skip AST parsing (30-40% speedup for autodoc-heavy sites)
Asset dependencies — Tracked during render-time (no HTML parsing needed)
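
Caching by content hash, as mentioned in the list above, can be sketched as follows. This is a generic illustration that assumes a SHA-256 digest and an in-memory dictionary; Bengal's actual hash function and cache layout are not specified here, and `incremental_render` is a hypothetical name.

```python
import hashlib


def content_hash(text):
    """Fingerprint source text so unchanged files can be detected."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def incremental_render(sources, cache, render):
    """Re-render only sources whose hash differs from the cached one.

    `cache` maps path -> (hash, html) and is updated in place.
    Returns the list of paths that were actually rebuilt.
    """
    rebuilt = []
    for path, text in sources.items():
        digest = content_hash(text)
        cached = cache.get(path)
        if cached is not None and cached[0] == digest:
            continue  # unchanged: reuse cached output
        cache[path] = (digest, render(text))
        rebuilt.append(path)
    return rebuilt


cache = {}
first = incremental_render({"a.md": "x", "b.md": "y"}, cache, str.upper)
second = incremental_render({"a.md": "x", "b.md": "z"}, cache, str.upper)
```

The first call rebuilds both files; the second rebuilds only `b.md`, because its content changed.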

Cache Location

PLAINTEXT

.bengal/
├── cache.json.zst          # Main build cache (compressed)
├── page_metadata.json.zst  # Page discovery cache
├── taxonomy_index.json.zst # Taxonomy index
├── indexes/                # Query indexes (section, author, etc.)
├── templates/              # Template bytecode cache
└── logs/                   # Build logs

Clear Cache

BASH

# Clear all caches (forces cold rebuild)
bengal clean --cache

# Clear output and cache
bengal clean --all

5. Fast Mode

Combine all optimizations for maximum speed:

BASH

bengal build --fast

`--fast` enables:

Quiet output (minimal console I/O)
Suppresses verbose logging
Parallelism auto-detected as normal
Skips HTML formatting (raw HTML output, ~10-15% faster)

Note

Fast mode skips HTML pretty-printing and minification. Output is still valid HTML but not formatted. Use for development and CI builds where formatting doesn't matter.

6. Build Profiling

Identify bottlenecks:

BASH

# Generate performance profile
bengal build --perf-profile

# View results
python -m pstats .bengal/profiles/profile.stats
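
The interactive `pstats` browser can be replaced by a short script when you only want the most expensive functions. The sketch below uses the standard `pstats` API and assumes the profile file is in the usual cProfile format, which the command above implies; `print_top_functions` is a hypothetical helper.

```python
import pstats


def print_top_functions(stats_path, limit=15):
    """Print the `limit` functions with the highest cumulative time."""
    stats = pstats.Stats(stats_path)
    # strip_dirs shortens file paths; "cumulative" includes time in callees.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return stats
```

Calling `print_top_functions(".bengal/profiles/profile.stats")` after a profiled build prints a ranked table. Sorting by `"tottime"` instead shows time spent in each function excluding its callees.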

Template Profiling

Find slow templates:

BASH

bengal build --profile-templates

Output:

PLAINTEXT

Template Rendering Times:
  layouts/blog.html: 1.2s (340 pages, 3.5ms avg)
  layouts/docs.html: 0.8s (890 pages, 0.9ms avg)
  partials/nav.html: 0.3s (included 1230 times)

7. Content Organization

Split Large Sections

If one section has 5K+ pages, consider splitting:

PLAINTEXT

content/
├── blog/
│   ├── 2024/     # 500 pages
│   ├── 2023/     # 800 pages
│   └── archive/  # 3000+ pages (separate pagination)

Use Pagination

Don't render 1000 items on one page:

YAML

# Paginate blog listing
pagination:
  per_page: 20
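
The effect of `per_page` is easy to see with a small Python sketch of the chunking arithmetic. This is a generic illustration rather than Bengal's paginator; `paginate` is a hypothetical name.

```python
import math


def paginate(items, per_page=20):
    """Split items into consecutive pages of at most `per_page` entries."""
    # Always produce at least one page, even for an empty listing.
    total_pages = max(1, math.ceil(len(items) / per_page))
    return [
        items[i * per_page:(i + 1) * per_page]
        for i in range(total_pages)
    ]


listing_pages = paginate(list(range(1000)), per_page=20)
```

With 1,000 items and `per_page: 20`, each listing page renders 20 entries and the listing spans 50 pages.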

Lazy-Load Heavy Content

Move rarely-accessed content to separate pages:

KIDA

{# Don't: render full changelog inline #}
{{ include('changelog.html') }}

{# Do: link to separate page #}
<a href="/changelog/">View full changelog</a>

8. CI/CD Optimization

GitHub Actions Example

YAML

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

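      - name: Cache Bengal
        uses: actions/cache@v4
        with:
          path: .bengal
          key: bengal-${{ hashFiles('content/**/*.md') }}
          # Fall back to the most recent cache when content has changed,
          # so incremental builds still start warm
          restore-keys: |
            bengal-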

      - name: Build
        run: bengal build --fast --environment production

Docker Memory Limits

DOCKERFILE

# Use memory-optimized for container builds
CMD ["bengal", "build", "--memory-optimized", "--fast"]

9. Monitoring Build Health

Track build performance over time:

BASH

# Detailed build stats
bengal build --verbose

Output:

PLAINTEXT

Build Summary:
  Total Pages: 15,432
  Rendered: 342 (incremental)
  Skipped: 15,090 (cached)
  Duration: 12.3s
  Memory Peak: 245MB
  Pages/sec: 1,254
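
To track these numbers over time, the summary can be parsed into a few derived metrics. The sketch below assumes the output looks exactly like the sample above; the real format may differ between versions, and `parse_build_summary` is a hypothetical helper.

```python
import re


def parse_build_summary(text):
    """Extract counts from a build summary and derive simple health metrics."""

    def number(label):
        match = re.search(rf"{re.escape(label)}:\s*([\d,.]+)", text)
        return float(match.group(1).replace(",", ""))

    total = number("Total Pages")
    rendered = number("Rendered")
    skipped = number("Skipped")
    duration = number("Duration")
    return {
        "total": int(total),
        "rendered": int(rendered),
        "skipped": int(skipped),
        "cache_hit_rate": skipped / total,
        "pages_per_sec": total / duration,
    }


sample = """Build Summary:
  Total Pages: 15,432
  Rendered: 342 (incremental)
  Skipped: 15,090 (cached)
  Duration: 12.3s
  Memory Peak: 245MB
  Pages/sec: 1,254
"""
metrics = parse_build_summary(sample)
```

For the sample, the cache hit rate is 15,090 / 15,432, or about 97.8%. A sudden drop in that ratio between builds is a sign that something is invalidating the cache.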

Quick Reference

BASH

# Memory-efficient large site build
bengal build --memory-optimized --fast

# Profile to find bottlenecks
bengal build --perf-profile --profile-templates

# Force full rebuild
bengal build --no-incremental

# Clear all caches
bengal clean --cache

# Clear output and cache
bengal clean --all

Troubleshooting

Build runs out of memory

Increase available memory or swap space
Use `bengal build --dev --verbose` to watch memory usage
Try the experimental streaming mode (--memory-optimized) and measure peak memory with and without it — it is not guaranteed to help on every site

Build is slow despite caching

Check what's invalidating the cache: `bengal build --verbose`
Profile templates: `--profile-templates`
Check for O(n) filters in templates (use query indexes)

Incremental not working

Ensure `.bengal/` is not gitignored for local dev
Run `bengal clean --cache` to reset
Check for template changes that invalidate all pages
