Functions
_inline_text
Recursively extract plain text from inline nodes.
def _inline_text(node: Inline) -> str
Parameters
| Name | Type | Description |
|---|---|---|
| node | Inline | |
Returns
str
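As a rough illustration of the recursive walk described above, here is a minimal sketch. The `Text` and `Emphasis` classes are hypothetical stand-ins, not the library's actual `Inline` node types, and `inline_text_sketch` is an illustrative name.

```python
from dataclasses import dataclass, field


@dataclass
class Text:
    """Hypothetical leaf node holding literal text."""
    content: str


@dataclass
class Emphasis:
    """Hypothetical container node wrapping child inline nodes."""
    children: list = field(default_factory=list)


def inline_text_sketch(node: "Text | Emphasis") -> str:
    """Return the concatenated text of a node and all its descendants."""
    if isinstance(node, Text):
        return node.content
    # Container node: recurse into each child and join the results in order.
    return "".join(inline_text_sketch(child) for child in node.children)
```

Formatting wrappers contribute nothing themselves; only leaf text survives, which is what makes the output plain text.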
_inline_text_html
def _inline_text_html(node: Inline) -> str
Recursively extract HTML from inline nodes (preserves strong, emphasis, links).
Parameters
| Name | Type | Description |
|---|---|---|
| node | Inline | |
Returns
str
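The HTML variant can be pictured the same way, except that wrapper nodes emit tags around their children. The sketch below assumes hypothetical `Text`, `Strong`, and `Link` classes (not the library's real node types) and assumes that literal text is HTML-escaped, which the documentation above does not confirm.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Text:
    """Hypothetical leaf node holding literal text."""
    content: str


@dataclass
class Strong:
    """Hypothetical bold wrapper."""
    children: list = field(default_factory=list)


@dataclass
class Link:
    """Hypothetical link wrapper with a target URL."""
    url: str
    children: list = field(default_factory=list)


def inline_html_sketch(node) -> str:
    """Render a node tree to HTML, keeping strong and link markup."""
    if isinstance(node, Text):
        # Assumption: literal text is escaped so it cannot inject markup.
        return escape(node.content)
    inner = "".join(inline_html_sketch(child) for child in node.children)
    if isinstance(node, Strong):
        return f"<strong>{inner}</strong>"
    if isinstance(node, Link):
        return f'<a href="{escape(node.url)}">{inner}</a>'
    # Unknown wrapper: keep the children's markup, drop the wrapper itself.
    return inner
```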
_block_text
Extract plain text from a block node.
def _block_text(node: Block, source: str) -> str
Parameters
| Name | Type | Description |
|---|---|---|
| node | Block | |
| source | str | |
Returns
str
_block_text_html
def _block_text_html(node: Block, source: str) -> str
Extract HTML from a block node (preserves inline formatting, uses block elements).
Parameters
| Name | Type | Description |
|---|---|---|
| node | Block | |
| source | str | |
Returns
str
extract_excerpt
def extract_excerpt(ast: Document | Sequence[Block], source: str = '', *, max_chars: int = 750, skip_leading_h1: bool = True, include_headings: bool = True, excerpt_as_html: bool = False) -> str
Extract excerpt from AST. Stops at block boundaries.
Walks blocks in order, extracting text. Skips leading h1 by default. Stops when accumulated text reaches max_chars, always at a block boundary.
Parameters
| Name | Type | Description |
|---|---|---|
| ast | Document \| Sequence[Block] | Document or sequence of Block nodes |
| source | str | Original source (for FencedCode zero-copy extraction). Default: '' |
| max_chars | int | Maximum characters. Default: 750 |
| skip_leading_h1 | bool | Skip first Heading(level=1). Default: True |
| include_headings | bool | Include heading text in excerpt. Default: True |
| excerpt_as_html | bool | If True, output block elements for structure, preserving inline formatting (strong, emphasis, links). Default: False |
Returns
str
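The block-boundary rule can be sketched on a simplified input. The version below works on `(kind, text)` pairs rather than the real AST, and `excerpt_sketch` is an illustrative name. It reads "stops when accumulated text reaches max_chars" as "add whole blocks until the running total meets the limit", which is one plausible interpretation. The blank-line joiner is also an assumption.

```python
from typing import Sequence

HEADING_KINDS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def excerpt_sketch(
    blocks: Sequence[tuple[str, str]],
    *,
    max_chars: int = 750,
    skip_leading_h1: bool = True,
    include_headings: bool = True,
) -> str:
    """Collect whole blocks in order until max_chars is reached."""
    parts: list[str] = []
    total = 0
    for index, (kind, text) in enumerate(blocks):
        # Drop a title heading only when it is the very first block.
        if index == 0 and skip_leading_h1 and kind == "h1":
            continue
        if kind in HEADING_KINDS and not include_headings:
            continue
        parts.append(text)
        total += len(text)
        # Blocks are never split: the limit is checked after each whole block.
        if total >= max_chars:
            break
    return "\n\n".join(parts)
```

Under this reading the result can overshoot `max_chars` by up to one block; a stricter variant would check the limit before appending.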
_truncate_at_word
Truncate at word boundary within length.
def _truncate_at_word(text: str, length: int, suffix: str = '...') -> str
Parameters
| Name | Type | Description |
|---|---|---|
| text | str | |
| length | int | |
| suffix | str | Default: '...' |
Returns
str
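A minimal sketch of word-boundary truncation follows. It assumes the suffix counts toward `length`, which the signature alone does not confirm, and `truncate_at_word_sketch` is an illustrative name rather than the library's implementation.

```python
def truncate_at_word_sketch(text: str, length: int, suffix: str = "...") -> str:
    """Shorten text to at most `length` characters without splitting a word."""
    if len(text) <= length:
        return text
    # Assumption: reserve room for the suffix so the result fits in `length`.
    budget = max(length - len(suffix), 0)
    cut = text[:budget]
    # If the cut landed mid-word, back up to the last space before it.
    if budget < len(text) and not text[budget].isspace():
        head, sep, _ = cut.rpartition(" ")
        if sep:
            cut = head
    return cut.rstrip() + suffix
```

When the text contains no space inside the budget (a single very long word), the sketch falls back to a hard cut.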
_truncate_at_sentence
Truncate at sentence boundary. Falls back to word boundary if needed.
def _truncate_at_sentence(text: str, length: int = 160, min_ratio: float = 0.6) -> str
Parameters
| Name | Type | Description |
|---|---|---|
| text | str | |
| length | int | Default: 160 |
| min_ratio | float | Default: 0.6 |
Returns
str
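The sentence-then-word fallback might look like the sketch below. It assumes `min_ratio` means "accept a sentence cut only if it keeps at least this fraction of `length`"; that interpretation, the set of sentence-ending marks, and the `...` suffix on the fallback path are all guesses, not documented behavior.

```python
def truncate_at_sentence_sketch(
    text: str, length: int = 160, min_ratio: float = 0.6
) -> str:
    """Cut at the last sentence end within `length`, else at a word boundary."""
    if len(text) <= length:
        return text
    # Look one character past the limit so a sentence ending exactly at
    # `length` (followed by a space) is still detected.
    probe = text[: length + 1]
    end = max(probe.rfind(mark) for mark in (". ", "! ", "? ")) + 1
    # Accept the sentence cut only if enough of the budget is used.
    if end > 0 and end >= length * min_ratio:
        return text[:end]
    # Fallback: cut at the last space, leaving room for the ellipsis.
    window = text[: max(length - 3, 0)]
    head, sep, _ = window.rpartition(" ")
    return (head if sep else window).rstrip() + "..."
```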
extract_meta_description
def extract_meta_description(ast: Document | Sequence[Block], source: str = '', *, max_chars: int = 160) -> str
Extract SEO-friendly meta description from AST.
Same logic as extract_excerpt, but prefers a sentence boundary within max_chars (160 by default). Calls extract_excerpt with a larger buffer, then truncates the result at a sentence boundary.
Parameters
| Name | Type | Description |
|---|---|---|
| ast | Document \| Sequence[Block] | Document or sequence of Block nodes |
| source | str | Original source (for FencedCode). Default: '' |
| max_chars | int | Maximum length (SEO standard). Default: 160 |
Returns
str
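The two-step approach (over-collect, then trim to whole sentences) can be sketched on plain paragraph strings. The 2x buffer factor, the regex sentence splitter, and the name `meta_description_sketch` are assumptions for illustration; the real function works on the AST and may differ in each of these details.

```python
import re


def meta_description_sketch(paragraphs: list[str], *, max_chars: int = 160) -> str:
    """Gather a generous buffer of text, then keep the sentences that fit."""
    # Step 1: collect whole paragraphs into a buffer larger than max_chars.
    buffer_size = max_chars * 2  # assumed factor
    parts: list[str] = []
    total = 0
    for paragraph in paragraphs:
        parts.append(paragraph)
        total += len(paragraph)
        if total >= buffer_size:
            break
    # Meta descriptions are single-line, so collapse all whitespace runs.
    text = re.sub(r"\s+", " ", " ".join(parts)).strip()
    if len(text) <= max_chars:
        return text
    # Step 2: add complete sentences while the result still fits.
    kept = ""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        candidate = f"{kept} {sentence}".strip()
        if len(candidate) > max_chars:
            break
        kept = candidate
    # If even the first sentence is too long, fall back to a word boundary.
    return kept or text[:max_chars].rsplit(" ", 1)[0]
```

Collecting more than `max_chars` first gives the sentence trimmer room to find a clean boundary instead of being forced into a mid-sentence cut.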