# Object Model

URL: /bengal/docs/reference/architecture/core/object-model/
Section: core
Description: Site, Page, Section, and Asset data models.

---

Bengal’s object model defines how `Site`, `Page`, `Section`, `Asset`, and `Menu` relate, and where cacheable metadata lives (`PageCore`).

## Read this first

- **If you want the object graph**: start with “Core Objects”, then “Object Model Relationships”.
- **If you want the cache contract**: start with “PageCore” (and refer to `bengal/core/page/page_core.py`).

## Core Objects

::::{tab-set}
:::{tab-item} Site
**Central data container** (`bengal/core/site/`)

Holds all site content and delegates build coordination to orchestrators.

**Key Attributes:**
- `pages`: List of all Page objects
- `sections`: List of all Section objects
- `assets`: List of all Asset objects
- `config`: Configuration dictionary
- `menu`: Built navigation menus

**Key Methods:**
- `build()`: Delegates to `BuildOrchestrator`
- `discover_content()`: Delegates to `ContentOrchestrator`
:::

:::{tab-item} Page
**Content Unit** (`bengal/core/page/`)

Represents a single content page with source, metadata, rendered HTML, and navigation.

**Architecture:**
- **Composition Pattern**: `Page` contains a `PageCore` instance for cacheable metadata
- **Plain dataclass with helper modules**:
  - `page_core.py`: Cacheable metadata (title, date, tags, etc.)
  - `metadata_helpers.py`: Frontmatter-derived metadata helpers
  - `content.py`: Content compatibility helpers
  - `relationships.py`: Section membership
  - `computed.py`: Compatibility wrappers for computed values
- **Rendering-owned behavior**: rendered content, excerpts, TOC extraction, shortcode checks,
  template URLs, and page bundle resource access/classification delegate to
  helpers under `bengal/rendering/`

**PageCore Integration:**
- Cacheable fields (title, date, tags, slug) stored in `page.core`
- Property delegates provide direct access: `page.title` → `page.core.title`
- Enables type-safe caching; pages are reconstructed from cache during incremental builds
:::

:::{tab-item} Section
**Structural Unit** (`bengal/core/section/`)

Represents folder-based grouping of pages with hierarchical organization.

**Architecture:**
- **Plain dataclass with helper modules**:
  - `hierarchy.py`: Tree traversal, parent/child, and identity helper functions behind `Section` shims
  - `navigation.py`: Version-aware structural navigation helper functions behind `Section` shims
  - `queries.py`: Page retrieval, sorting, and index helper functions behind `Section` shims
- **Rendering-owned URLs**: `href`, `_path`, `absolute_href`, subsection index
  URL sets, and version-path transforms delegate to `bengal/rendering/section_urls.py`
- **Rendering-owned ergonomics**: theme/navigation-facing helpers such as
  `icon`, `has_nav_children`, `recent_pages()`, `featured_posts()`, content
  stats, and section template application delegate to
  `bengal/rendering/section_ergonomics.py`

**Features:**
- **Hierarchy**: Parent/child relationships (`subsections`)
- **Navigation**: Access to `regular_pages` and `sections`
- **Cascade**: Inheritance of frontmatter metadata to descendants
- **Path-based Registry**: O(1) lookup via `Site.registry` (ContentRegistry) using normalized paths
- **Stable References**: Sections referenced by path strings (not object identity) for reliable incremental builds
:::

:::{tab-item} Asset
**Static Resource** (`bengal/core/asset/`)

Handles static files (images, CSS, JS) with optimization.

**Capabilities:**
- Minification (CSS/JS)
- Image optimization
- Cache busting (fingerprinting)
- Output copying
:::

:::{tab-item} Menu
**Navigation Structure** (`bengal/core/menu.py`)

Provides hierarchical navigation menus built from config + frontmatter.

**Components:**
- `MenuItem`: Nested item with active state
- `MenuBuilder`: Constructs hierarchy and marks active items
:::
::::

## PageCore

`PageCore` is the single cacheable metadata structure shared by:

- **`Page`**: via `page.core` and property delegates
- **`PageMetadata`**: a type alias of `PageCore` used by caches
- **`SourcePage`**: frozen record from discovery that composes `PageCore` for cache-compatible metadata

Refer to:
- `bengal/core/page/page_core.py`
- `bengal/cache/page_discovery_cache.py`
- `bengal/core/records.py`

## Page Record Migration Adapters

As part of the migration to an immutable pipeline, the adapter helpers in
`bengal/core/records.py` are the canonical handoff from mutable `Page`
compatibility state to record state:

- `build_page_core()` maps frontmatter and path inputs into `PageCore`.
- `build_source_page()` maps discovery inputs into `SourcePage`; discovery
  consumes that record while populating the remaining `Page` compatibility
  object, but does not retain it as a second mutable sidecar on `Page`.
- `parsed_page_from_page_state()` maps parse-phase `PageLike` state into
  `ParsedPage` while accepting rendering-supplied TOC structures.
- `rendered_page_from_page_state()` maps render-phase state into `RenderedPage`.

The module also exposes `*_MIGRATION_MAP` constants for tests and migration
notes. These maps are documentation, not a new public protocol: callers should
prefer the adapters over adding attributes to `PageLike` or depending on the
concrete mutable `Page` class.

::::{dropdown} Contributor notes: adding fields and deciding what belongs in PageCore
When you add a cacheable field:

1. Add it to `PageCore` (`bengal/core/page/page_core.py`)
2. Update the migration adapter/map in `bengal/core/records.py`
3. Add a property delegate to `Page` (`bengal/core/page/__init__.py`) when
   templates or compatibility callers need it

Include fields that are stable, JSON-serializable, and useful without full content parsing. Keep build artifacts and parsed-content-derived fields out of `PageCore`.
::::
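
One way to check the "stable and JSON-serializable" rule for a candidate field is a cache round trip. The sketch below assumes a simplified record (`SketchCore`) and two hypothetical helpers (`to_cache`, `from_cache`); it does not reflect how Bengal's cache actually serializes `PageCore`.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class SketchCore:
    """Hypothetical cacheable metadata record (not the real PageCore)."""

    source_path: str
    title: str
    date: datetime | None = None
    tags: list[str] | None = None


def to_cache(core: SketchCore) -> str:
    """Serialize to JSON, converting datetime to an ISO 8601 string."""
    data = asdict(core)
    if data["date"] is not None:
        data["date"] = data["date"].isoformat()
    return json.dumps(data, sort_keys=True)


def from_cache(payload: str) -> SketchCore:
    """Rebuild the record from its cached JSON form."""
    data = json.loads(payload)
    if data["date"] is not None:
        data["date"] = datetime.fromisoformat(data["date"])
    return SketchCore(**data)


original = SketchCore("content/post.md", "Post", datetime(2024, 1, 15), ["a", "b"])
restored = from_cache(to_cache(original))
```

A field that cannot survive this kind of round trip unchanged (for example, a parsed AST or a rendered HTML fragment) is a sign that it belongs outside the cacheable core.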

## Stable Section References

Bengal uses **path-based section references** instead of object identity for reliable incremental builds.

### Path-Based Registry

Sections are stored in a dictionary keyed by normalized paths and looked up through `Site.registry`:

```python
class Site:
    @property
    def registry(self) -> ContentRegistry:
        """Central registry for O(1) page/section lookups."""
        ...

    def get_section_by_path(self, path: Path | str) -> Section | None:
        """Delegate to ContentRegistry for O(1) lookup."""
        return self.registry.get_section(path)
```

### Benefits

- **Stable Across Rebuilds**: Path strings persist in cache, not object references
- **O(1) Lookup**: Dictionary lookup is constant time
- **Reliable Incremental Builds**: Sections can be renamed/moved without breaking references

### Implementation

- Each page stores its section as a path string in `PageCore.section` (not as a `Section` object)
- Registry built during `Site.register_sections()`

Refer to `bengal/core/site/section_registry.py` for the registry implementation.
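
The idea behind a path-keyed registry can be sketched in a few lines. Everything here is hypothetical (`SketchSection`, `SketchRegistry`, and the `normalize` scheme are illustrative assumptions, not the `ContentRegistry` API); the point is only that a page holds a path string and resolves the object on demand.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import PurePosixPath


def normalize(path: str | PurePosixPath) -> str:
    """Normalize a section path to a stable dictionary key (assumed scheme)."""
    text = str(path).replace("\\", "/").strip("/")
    return str(PurePosixPath(text))


@dataclass
class SketchSection:
    name: str
    path: str


@dataclass
class SketchRegistry:
    _sections: dict[str, SketchSection] = field(default_factory=dict)

    def register(self, section: SketchSection) -> None:
        self._sections[normalize(section.path)] = section

    def get_section(self, path: str | PurePosixPath) -> SketchSection | None:
        # A plain dictionary lookup: constant time on average.
        return self._sections.get(normalize(path))


registry = SketchRegistry()
registry.register(SketchSection(name="guide", path="docs/guide/"))

# A page stores only the path string; the Section object is resolved on demand,
# so the reference stays usable when Section objects are recreated on rebuild.
page_section_ref = "docs//guide"
found = registry.get_section(page_section_ref)
```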

## Object Model Relationships

```mermaid
classDiagram
    Site "1" --> "*" Page : manages
    Site "1" --> "*" Section : contains
    Site "1" --> "*" Asset : tracks
    Site "1" --> "*" MenuBuilder : uses
    MenuBuilder "1" --> "*" MenuItem : builds
    Section "1" --> "*" Page : groups
    Section "1" o-- "0..1" Page : index_page
    Section "1" --> "*" Section : subsections
    Section --> Section : parent
    Page --> Page : next/prev
    Page --> Page : next_in_section/prev_in_section
    Page --> Section : parent
    MenuItem --> MenuItem : children (nested)

    class Site {
        +root_path: Path
        +config: Dict
        +pages: List~Page~
        +sections: List~Section~
        +build()
    }

    class Page {
        +core: PageCore
        +content: str
        +rendered_html: str
        +render()
    }

    class PageCore {
        +source_path: str
        +title: str
        +date: datetime
        +tags: list
        +section: str
    }

    Page "1" *-- "1" PageCore : contains

    class Section {
        +name: str
        +path: Path
        +pages: List~Page~
        +subsections: List~Section~
    }
```

## URL Ownership

Bengal uses a **URL ownership system** with claim-time enforcement to prevent URL collisions and to apply an explicit ownership policy to every content producer.

### URLRegistry

The `URLRegistry` on `Site` is the central authority for URL claims. It enforces ownership at claim time, before any file is written, so a conflicting claim is rejected before it can overwrite another producer's output.

**Key Features**:
- **Claim-time enforcement**: URLs are claimed before any file is written
- **Priority-based resolution**: Higher priority claims win conflicts
- **Ownership context**: All claims include owner, source, and priority metadata
- **Incremental safety**: Claims are cached and loaded for incremental builds

**Usage**:
```python
from pathlib import Path

# `site` is an existing Site instance.
# Claim a URL (done automatically by orchestrators)
site.url_registry.claim(
    url="/about/",
    owner="content",
    source="content/about.md",
    priority=100,  # User content (highest priority)
)

# Claim via output path (for direct file writers)
url = site.url_registry.claim_output_path(
    output_path=Path("public/about/index.html"),
    site=site,
    owner="content",
    source="content/about.md",
    priority=100,
)
```

### Priority Levels

URL claims use priority levels to resolve conflicts:

| Priority | Owner | Rationale |
|----------|-------|-----------|
| 100 | User content | User intent always wins |
| 90 | Autodoc sections | Explicitly configured by user |
| 80 | Autodoc pages | Derived from sections |
| 50 | Section indexes | Structural authority |
| 40 | Taxonomy | Auto-generated |
| 10 | Special pages | Fallback utility pages |
| 5 | Redirects | Should never shadow actual content |

**Conflict Resolution**:
- Higher priority wins (user content can override generated content)
- Same priority + same source = idempotent (allowed)
- Same priority + different source = collision error
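
The three rules above can be sketched as a small in-memory registry. This is an illustration under stated assumptions, not the `URLRegistry` implementation: `Claim`, `SketchURLRegistry`, and `SketchCollisionError` are hypothetical names, and replacing an existing lower-priority claim with a higher-priority one is an assumed reading of "higher priority wins".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    url: str
    owner: str
    source: str
    priority: int


class SketchCollisionError(Exception):
    """Hypothetical stand-in for URLCollisionError."""


class SketchURLRegistry:
    def __init__(self) -> None:
        self._claims: dict[str, Claim] = {}

    def claim(self, url: str, owner: str, source: str, priority: int) -> Claim:
        new = Claim(url, owner, source, priority)
        existing = self._claims.get(url)
        if existing is None or new.priority > existing.priority:
            # Unclaimed, or the new claim outranks the old one (assumed: replace).
            self._claims[url] = new
            return new
        if new.priority == existing.priority and new.source == existing.source:
            return existing  # same priority + same source: idempotent
        # Lower priority, or same priority from a different source.
        raise SketchCollisionError(
            f"URL collision detected: {url} "
            f"(existing {existing.owner}/{existing.priority}, "
            f"new {owner}/{priority})"
        )


registry = SketchURLRegistry()
registry.claim("/about/", "content", "content/about.md", 100)
registry.claim("/about/", "content", "content/about.md", 100)  # idempotent

try:
    registry.claim("/about/", "taxonomy", "tags/about", 40)
    rejected = False
except SketchCollisionError:
    rejected = True
```

Because the check runs inside `claim()`, the conflict surfaces before any output file exists, which is the property that claim-time enforcement is meant to provide.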

### Reserved Namespaces

Certain URL namespaces are reserved for specific generators:

- `/tags/` - Reserved for taxonomy (priority 40)
- `/search/`, `/404/`, `/graph/` - Reserved for special pages (priority 10)
- Autodoc prefixes (e.g., `/cli/`, `/api/python/`) - Reserved for autodoc output (priority 90/80)

The `OwnershipPolicyValidator` warns when user content lands in reserved namespaces.
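
A reserved-namespace check of this kind reduces to a prefix match. The sketch below is hypothetical and does not reproduce `OwnershipPolicyValidator`: the `RESERVED` table, `reserved_owner`, and `check_user_urls` are illustrative names, and the autodoc prefixes are just the two examples listed above.

```python
from __future__ import annotations

# Prefixes taken from the list above; the autodoc entries are the examples given.
RESERVED: dict[str, str] = {
    "/tags/": "taxonomy",
    "/search/": "special pages",
    "/404/": "special pages",
    "/graph/": "special pages",
    "/cli/": "autodoc",
    "/api/python/": "autodoc",
}


def reserved_owner(url: str) -> str | None:
    """Return the generator that reserves this URL's namespace, if any."""
    normalized = "/" + url.strip("/") + "/"
    for prefix, owner in RESERVED.items():
        if normalized.startswith(prefix):
            return owner
    return None


def check_user_urls(urls: list[str]) -> list[str]:
    """Collect one warning per user-content URL inside a reserved namespace."""
    warnings = []
    for url in urls:
        owner = reserved_owner(url)
        if owner is not None:
            warnings.append(f"{url} is in a namespace reserved for {owner}")
    return warnings


messages = check_user_urls(["/about/", "/tags/python/", "/search/"])
```

Matching on a trailing-slash prefix keeps `/tagsfoo/` from being mistaken for something under `/tags/`.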

### Integration Points

`URLRegistry` is integrated into every content producer:

- **ContentDiscovery**: Claims URLs for user content (priority 100)
- **SectionOrchestrator**: Claims section index URLs (priority 50)
- **TaxonomyOrchestrator**: Claims taxonomy URLs (priority 40)
- **AutodocOrchestrator**: Claims autodoc URLs (priority 90/80)
- **RedirectGenerator**: Claims redirect URLs (priority 5)
- **SpecialPagesGenerator**: Claims special page URLs (priority 10)

### Incremental Build Safety

URL claims are persisted in `BuildCache` and loaded during incremental builds. This prevents new content from shadowing existing URLs that weren't rebuilt in the current build.

**Cache Integration**:
- Claims are saved to `BuildCache.url_claims` after build completes
- Cached claims are loaded during discovery phase for incremental builds
- Registry is pre-populated with claims from pages not being rebuilt
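
The pre-population step can be sketched as follows, assuming claims are stored as plain JSON keyed by URL. The storage shape and the helpers `save_claims` and `preload_claims` are hypothetical; the actual layout of `BuildCache.url_claims` is not shown on this page.

```python
from __future__ import annotations

import json


def save_claims(claims: dict[str, dict]) -> str:
    """Persist claims after a build (here: a JSON string)."""
    return json.dumps(claims, sort_keys=True)


def preload_claims(payload: str, rebuilding_sources: set[str]) -> dict[str, dict]:
    """Keep cached claims whose source is NOT being rebuilt in this build."""
    cached = json.loads(payload)
    return {
        url: claim
        for url, claim in cached.items()
        if claim["source"] not in rebuilding_sources
    }


previous = {
    "/about/": {"owner": "content", "source": "content/about.md", "priority": 100},
    "/blog/": {"owner": "content", "source": "content/blog/_index.md", "priority": 100},
}
payload = save_claims(previous)

# Only about.md changed, so its claim is made again during this build;
# the /blog/ claim is restored from cache and still guards its URL.
active = preload_claims(payload, rebuilding_sources={"content/about.md"})
```

With the `/blog/` claim restored, a new page that tried to claim `/blog/` at the same priority from a different source would collide even though the original page was never re-rendered.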

### Error Handling

When a collision is detected, `URLCollisionError` is raised with diagnostic information:

```text
URL collision detected: /about/
  Existing claim: content (priority 100)
    Source: content/about.md
  New claim: taxonomy (priority 40)
    Source: tags/about
  Priority: Existing claim has higher priority (100 > 40) - new claim rejected
  Tip: Check for duplicate slugs, conflicting autodoc output, or namespace violations
```

### See Also

- `bengal/core/url_ownership.py` - URLRegistry implementation
- `bengal/core/url_collisions.py` - Structured URL collision records used by health checks and CLI output
- `bengal/config/url_policy.py` - Reserved namespace definitions
