# Content Collections

URL: /bengal/docs/0.4.3/content/collections/
Section: collections
Tags: collections, schemas, validation

--------------------------------------------------------------------------------

Define typed schemas for your content to ensure consistency and catch errors early.

## Do I Need This?

No. Collections are optional. Your site works fine without them.

Use collections when:

- You want typos caught at build time, not in production
- Multiple people edit content and need guardrails
- You want consistent frontmatter across content types

## Quick Setup

Create a `collections.py` file at your project root. Edit it to uncomment what you need:

```python
from bengal.collections import define_collection, BlogPost, DocPage

collections = {
    "blog": define_collection(schema=BlogPost, directory="blog"),
    "docs": define_collection(schema=DocPage, directory="docs"),
}
```

Done. Build as normal; validation happens automatically.

## Built-in Schemas

Bengal provides schemas for common content types:

| Schema | Alias | Required Fields | Optional Fields |
|--------|-------|-----------------|-----------------|
| `BlogPost` | `Post` | title, date | author, tags, draft, description, image, excerpt |
| `DocPage` | `Doc` | title | weight, category, tags, toc, deprecated, description, since |
| `APIReference` | `API` | title, endpoint | method, version, auth_required, rate_limit, deprecated, description |
| `Tutorial` | (none) | title | difficulty, duration, prerequisites, series, tags, order |
| `Changelog` | (none) | title, date | version, breaking, summary, draft |

Import any of these:

```python
from bengal.collections import BlogPost, DocPage, APIReference, Tutorial, Changelog

# Or use short aliases:
from bengal.collections import Post, Doc, API
```

## Custom Schemas

Define your own using Python dataclasses:

```python
from dataclasses import dataclass, field
from datetime import datetime

from bengal.collections import define_collection


@dataclass
class ProjectPage:
    title: str
    status: str  # "active", "completed", "archived"
    started: datetime
    tech_stack: list[str] = field(default_factory=list)
    github_url: str | None = None


collections = {
    "projects": define_collection(
        schema=ProjectPage,
        directory="projects",
    ),
}
```

## Validation Modes

By default, validation warns but doesn't fail builds:

```text
⚠ content/blog/my-post.md
  └─ date: Required field 'date' is missing
```

### Strict Mode

To fail builds on validation errors, add to `bengal.toml`:

```toml
[build]
strict_collections = true
```

### Lenient Mode (Extra Fields)

To allow frontmatter fields not defined in your schema:

```python
define_collection(
    schema=BlogPost,
    directory="blog",
    strict=False,      # Don't reject unknown fields
    allow_extra=True,  # Store extra fields in _extra dict
)
```

With `strict=False`, unknown fields are silently ignored. Add `allow_extra=True` to preserve them in a `_extra` attribute on the validated instance.

## CLI Commands

```bash
# List defined collections and their schemas
bengal content collections

# Validate content against schemas without building
bengal content schemas

# Validate specific collection
bengal content schemas --collection blog
```

## Advanced Options

### Custom File Pattern

By default, collections match all markdown files (`**/*.md`). To match specific files:

```python
define_collection(
    schema=BlogPost,
    directory="blog",
    glob="*.md",  # Only top-level, not subdirectories
)
```

## Migration Tips

Existing site with inconsistent frontmatter?

1. Start with `strict=False` to allow extra fields
2. Run `bengal content schemas` to find issues
3. Fix content or adjust schema
4. Enable `strict=True` when ready

Transform legacy field names:

```python
from bengal.collections import define_collection, BlogPost


def migrate_legacy(data: dict) -> dict:
    # Rename the old "post_title" key to the "title" field the schema expects
    if "post_title" in data:
        data["title"] = data.pop("post_title")
    return data


collections = {
    "blog": define_collection(
        schema=BlogPost,
        directory="blog",
        transform=migrate_legacy,
    ),
}
```

## Remote Content

Collections work with remote content too. Use a loader instead of a directory:

```python
from bengal.collections import define_collection, DocPage
from bengal.content.sources import github_loader

collections = {
    "api-docs": define_collection(
        schema=DocPage,
        loader=github_loader(repo="myorg/api-docs", path="docs/"),
    ),
}
```

See Content Sources for GitHub, Notion, REST API loaders.
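As a recap of the validation behavior described on this page (required fields, unknown fields, and `strict=False`), here is a minimal standalone sketch of how a dataclass schema could be checked against a frontmatter dict. It uses only the standard library and is not Bengal's implementation: `NotePage`, `check_frontmatter`, and the wording of the unknown-field message are hypothetical names chosen for illustration. Only the "Required field ... is missing" message mirrors the warning shown above.

```python
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class NotePage:
    """Hypothetical schema used only for this illustration."""
    title: str
    date: str
    tags: list = field(default_factory=list)
    draft: bool = False


def check_frontmatter(schema, data, strict=True):
    """Return a list of problems; an empty list means the data fits the schema.

    Illustrative only: this approximates the documented behavior and does
    not call into Bengal.
    """
    problems = []
    schema_fields = fields(schema)
    known = {f.name for f in schema_fields}

    # A dataclass field is required when it has neither a default
    # nor a default_factory.
    for f in schema_fields:
        required = f.default is MISSING and f.default_factory is MISSING
        if required and f.name not in data:
            problems.append(f"{f.name}: Required field '{f.name}' is missing")

    # In strict mode, keys that the schema does not define are reported.
    # With strict=False they are ignored, as in the lenient mode above.
    if strict:
        for name in sorted(set(data) - known):
            problems.append(f"{name}: Unknown field '{name}'")

    return problems


# Example: frontmatter that omits the required "date" field.
for problem in check_frontmatter(NotePage, {"title": "Hello"}):
    print(problem)
```

Returning a list of problems rather than raising on the first one matches the warn-by-default behavior: a caller can print every issue and then decide whether to fail the build.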
--------------------------------------------------------------------------------

Metadata:
- Author: lbliii