Module

rendering.pipeline.toc

Table of Contents (TOC) extraction for the rendering pipeline.

Provides TOC parsing and structure extraction from rendered HTML.

Related Modules:

  • bengal.rendering.pipeline.core: Uses TOC extraction
  • bengal.core.page: Page model with toc_items property

Functions

extract_toc_structure
def extract_toc_structure(toc_html: str) -> list[dict[str, Any]]

Parse TOC HTML into structured data for custom rendering.

Handles both nested <ul> structures (python-markdown style) and flat lists (mistune style). For flat lists from mistune, parses indentation to infer heading levels.
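The nested-list case described above can be illustrated with a minimal sketch. This is not the module's actual implementation: the names `_TocSketchParser` and `sketch_extract_toc` are hypothetical, the sketch assumes each entry is an `<a href="#id">` link inside nested `<ul>` elements, and it does not cover the flat, indentation-based mistune case.

```python
from html.parser import HTMLParser
from typing import Any, Optional


class _TocSketchParser(HTMLParser):
    """Hypothetical helper: collects <a> links and their <ul> nesting depth."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0  # current <ul> nesting depth
        self.items: list[dict[str, Any]] = []
        self._current: Optional[dict[str, Any]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.depth += 1
        elif tag == "a":
            # Assumption: the heading id is the href fragment ("#intro" -> "intro").
            href = dict(attrs).get("href") or ""
            self._current = {"id": href.lstrip("#"), "title": "", "level": self.depth}

    def handle_data(self, data):
        # Accumulate link text only while inside an <a> element.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1
        elif tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.items.append(self._current)
            self._current = None


def sketch_extract_toc(toc_html: str) -> list[dict[str, Any]]:
    """Return a flat list of {id, title, level} dicts from nested <ul> HTML."""
    parser = _TocSketchParser()
    parser.feed(toc_html)
    parser.close()
    return parser.items
```

With input such as `<ul><li><a href="#intro">Intro</a><ul><li><a href="#setup">Setup</a></li></ul></li></ul>`, this sketch yields "Intro" at level 1 and "Setup" at level 2.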

This is a standalone function so that it can be called from the Page.toc_items property for lazy evaluation.
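One way a property can defer parsing until first access is sketched below. The `SketchPage` class, the `fake_extract` stand-in, and the use of `functools.cached_property` are all assumptions made for illustration; the real Page model may implement lazy evaluation differently.

```python
from functools import cached_property
from typing import Any

# Counts how often the stand-in parser runs, to make the laziness visible.
CALLS = {"n": 0}


def fake_extract(toc_html: str) -> list[dict[str, Any]]:
    """Hypothetical stand-in for extract_toc_structure."""
    CALLS["n"] += 1
    return [{"id": "intro", "title": "Intro", "level": 1}] if toc_html else []


class SketchPage:
    """Hypothetical page object holding rendered TOC HTML."""

    def __init__(self, toc: str) -> None:
        self.toc = toc

    @cached_property
    def toc_items(self) -> list[dict[str, Any]]:
        # Parsed on first access only; later accesses reuse the cached list.
        return fake_extract(self.toc)
```

Constructing a `SketchPage` does no parsing; the stand-in runs once on the first read of `toc_items` and not again on later reads.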

Parameters

Name      Type  Default   Description
toc_html  str   required  HTML table of contents

Returns

list[dict[str, Any]]

List of TOC items, each a dict with id, title, and level keys (level 1 = H2, 2 = H3, 3 = H4, and so on).
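As an illustration of consuming this return value for custom rendering, the sketch below works on hand-written sample data in the documented shape. The helper names `heading_tag` and `render_outline` are hypothetical and not part of the module.

```python
from typing import Any


def heading_tag(level: int) -> str:
    """Map a TOC level back to its heading tag (1 -> h2, 2 -> h3, ...)."""
    return f"h{level + 1}"


def render_outline(items: list[dict[str, Any]]) -> str:
    """Render TOC items as an indented plain-text outline."""
    lines = []
    for item in items:
        indent = "  " * (item["level"] - 1)  # two spaces per nesting level
        lines.append(f'{indent}- {item["title"]} (#{item["id"]})')
    return "\n".join(lines)


# Sample data in the documented shape; not produced by the real function.
sample_items = [
    {"id": "install", "title": "Install", "level": 1},
    {"id": "pip", "title": "Using pip", "level": 2},
    {"id": "usage", "title": "Usage", "level": 1},
]
```

Calling `render_outline(sample_items)` gives a three-line outline in which "Using pip" is indented one step under "Install".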