Module

`rendering.pipeline.unified_transform`

Unified HTML Transform - Optimized content transformation for rendering.

This module provides an optimized HTML transformation approach that combines multiple passes into an efficient sequence with quick rejection checks.

Performance:

Benchmarked at ~27% faster than separate transform calls. See: scripts/benchmark_transforms.py

Architecture:

Step 1: Jinja escaping via str.replace() (C-optimized, very fast)
Step 2: .md link normalization (single regex pass with quick rejection)
Step 3: Internal link baseurl prefixing (single regex pass with quick rejection)

The key optimizations are:

Quick rejection checks before regex operations
Single transformer instance reused across pages
Compiled regex patterns
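
The first and third optimizations can be illustrated with a minimal sketch. The pattern, the function name `normalize_md_links_sketch`, and the replacement rule are assumptions made for illustration; they are not taken from the module's source.

```python
import re

# Hypothetical pattern, compiled once at import time so that every page
# reuses the same compiled object instead of recompiling per call.
_MD_HREF = re.compile(r'href="([^"#]+?)\.md"')


def normalize_md_links_sketch(html: str) -> str:
    """Rewrite href="x.md" to href="x/" (illustrative only)."""
    # Quick rejection: a substring test is much cheaper than a regex scan,
    # so pages with no ".md" text skip the regex entirely and the input
    # string is returned unchanged.
    if ".md" not in html:
        return html
    return _MD_HREF.sub(r'href="\1/"', html)
```

When the quick rejection fires, the function returns the very same string object it received, so untouched pages cost one substring search and no allocation.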

Related Modules:

bengal.rendering.pipeline.transforms: Original separate transforms
bengal.rendering.pipeline.core: Uses this transformer
bengal.rendering.link_transformer: Link transformation patterns

RFC Reference:

plan/drafted/rfc-rendering-package-optimizations.md

Classes

HybridHTMLTransformer

Optimized HTML transformer combining multiple transformation passes.

This transformer applies Jinja escaping and link transformations in an optimized sequence with quick rejection checks to skip unnecessary work.

Creation:

    transformer = HybridHTMLTransformer(baseurl="/bengal")
    result = transformer.transform(html)

Thread Safety: Thread-safe. Transformer instances are stateless after initialization and can be safely shared across threads.

Performance: Approximately 27% faster than calling separate transform functions. Improvement is most significant for pages with transformable content.

Methods

transform

def transform(self, html: str) -> str

Transform HTML content with optimized multi-pass approach.

Applies transformations in sequence:

Jinja block escaping ({%, %})
Markdown link normalization (.md -> /)
Internal link baseurl prefixing (/ -> /baseurl/)

Each step includes quick rejection to skip unnecessary regex work.

Parameters

Name	Type	Description
`html`	`str`	HTML content to transform

Returns

str Transformed HTML content
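
As a rough illustration of the first step in the sequence, Jinja block escaping with `str.replace()` might look like the sketch below. The function name and the choice of numeric HTML entities as the replacement text are assumptions; this page does not state what the escaped output looks like.

```python
def escape_jinja_blocks_sketch(html: str) -> str:
    """Neutralize Jinja block delimiters in already-rendered HTML."""
    # Quick rejection: most pages contain no block delimiters at all.
    if "{%" not in html and "%}" not in html:
        return html
    # Assumed escaping scheme: replace each brace with its numeric entity
    # so a later template pass no longer sees a block delimiter.
    return html.replace("{%", "&#123;%").replace("%}", "%&#125;")
```

Two chained `str.replace()` calls avoid the regex engine entirely, which fits the "C-optimized, very fast" description of this step.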

Internal Methods

__init__

Initialize the transformer.

def __init__(self, baseurl: str = '') -> None

Parameters

Name	Type	Description
`baseurl`	`str`	Base URL prefix for internal links (e.g., "/bengal"). If empty, internal link transformation is skipped. Default: `''`

_md_replacer

def _md_replacer(self, match: re.Match[str]) -> str

Transform .md link to clean URL, preserving anchors.

Handles special cases:

./page.md -> ./page/
./page.md#section -> ./page/#section
./_index.md -> ./
../other.md -> ../other/
path/page.md -> path/page/

Parameters

Name	Type	Description
`match`	`re.Match[str]`	Regex match for a single .md link

Returns

str
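
The special cases listed for `_md_replacer` can be reproduced with a small standalone sketch. The regex, the group layout, and the names `MD_LINK`, `md_replacer_sketch`, and `normalize_md_links_sketch` are hypothetical. Excluding `:` from the path, so that absolute URLs are left alone, is also an assumption rather than documented behavior.

```python
import re

# Assumed pattern: group 1 is the path ending in ".md", group 2 is the
# optional "#anchor". Paths containing ":" (e.g. "https://...") are skipped.
MD_LINK = re.compile(r'href="([^"#:]+?\.md)(#[^"]*)?"')


def md_replacer_sketch(match: re.Match[str]) -> str:
    path = match.group(1)
    anchor = match.group(2) or ""
    stem = path[: -len(".md")]  # drop the ".md" suffix
    if stem == "_index" or stem.endswith("/_index"):
        # A section index links to its containing directory.
        clean = stem[: -len("_index")] or "./"
    else:
        clean = stem + "/"
    return f'href="{clean}{anchor}"'


def normalize_md_links_sketch(html: str) -> str:
    if ".md" not in html:  # quick rejection before the regex pass
        return html
    return MD_LINK.sub(md_replacer_sketch, html)
```

Passing a function to `Pattern.sub()` keeps the per-link logic (index handling, anchor preservation) in plain Python while the scan itself stays a single regex pass.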

_internal_replacer

def _internal_replacer(self, match: re.Match[str]) -> str

Transform internal link with baseurl prefix.

Prepends baseurl to internal links starting with /. Skips links that already have the baseurl prefix.

Parameters

Name	Type	Description
`match`	`re.Match[str]`	Regex match for a single internal link

Returns

str
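
A minimal sketch of the prefixing rule described for `_internal_replacer`, written as standalone functions. The pattern and the names `INTERNAL_HREF`, `make_internal_replacer`, and `prefix_internal_links_sketch` are assumptions. Skipping protocol-relative URLs (`//host/...`) is likewise an assumption, not something this page documents.

```python
import re
from typing import Callable

# Assumed pattern: href values that start with a single "/".
# The negative lookahead leaves protocol-relative URLs ("//cdn...") alone.
INTERNAL_HREF = re.compile(r'href="(/(?!/)[^"]*)"')


def make_internal_replacer(baseurl: str) -> Callable[[re.Match[str]], str]:
    base = baseurl.rstrip("/")

    def replacer(match: re.Match[str]) -> str:
        path = match.group(1)
        # Skip links that already carry the baseurl prefix.
        if path == base or path.startswith(base + "/"):
            return match.group(0)
        return f'href="{base}{path}"'

    return replacer


def prefix_internal_links_sketch(html: str, baseurl: str) -> str:
    # Quick rejection: nothing to do without a baseurl or without
    # any root-relative href in the page.
    if not baseurl or 'href="/' not in html:
        return html
    return INTERNAL_HREF.sub(make_internal_replacer(baseurl), html)
```

Comparing against `base + "/"` rather than `base` alone avoids treating a path such as `/bengalese/` as already prefixed when the baseurl is `/bengal`.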

Functions

create_transformer

def create_transformer(config: dict[str, Any]) -> HybridHTMLTransformer

Create a transformer instance from site config.

Factory function that extracts baseurl from config and creates an appropriately configured transformer.

Parameters

Name	Type	Description
`config`	`dict[str, Any]`	Site configuration dictionary

Returns

HybridHTMLTransformer
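
The factory pattern described for `create_transformer` could be sketched as follows. `SketchTransformer` is a stand-in class, not the real `HybridHTMLTransformer`. The config key `"baseurl"` and the handling of a missing or `None` value are assumptions, since this page does not show how the config is read.

```python
from typing import Any


class SketchTransformer:
    """Stand-in for the real transformer; stores only the baseurl."""

    def __init__(self, baseurl: str = "") -> None:
        self.baseurl = baseurl


def create_transformer_sketch(config: dict[str, Any]) -> SketchTransformer:
    # Assumed key name: "baseurl". A missing or None value falls back to
    # the empty string, which disables internal link prefixing.
    baseurl = config.get("baseurl") or ""
    return SketchTransformer(baseurl=str(baseurl))
```

Building the transformer once from the site config and reusing it for every page matches the "single transformer instance reused across pages" optimization noted in the module overview.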