Classes
HtmlClassifierMixin
Mixin providing HTML block classification.
Implements CommonMark 4.6 HTML block types 1-7.
Attributes
| Name | Type | Description |
|---|---|---|
| _mode | LexerMode | - |
| _html_block_type | int | - |
| _html_block_content | list[str] | - |
| _html_block_start | int | - |
| _html_block_indent | int | - |
| _pos | int | - |
| _source_len | int | - |
| _consumed_newline | bool | - |
Methods
Internal Methods
_location_from
def _location_from(self, start_pos: int, start_col: int | None = None, end_pos: int | None = None) -> SourceLocation
Get source location from saved position. Implemented by Lexer.
Parameters
| Name | Type | Description |
|---|---|---|
| start_pos | int | - |
| start_col | int \| None | Default: None |
| end_pos | int \| None | Default: None |
Returns
SourceLocation
_try_classify_html_block_start
def _try_classify_html_block_start(self, content: str, line_start: int, full_line: str, indent: int = 0) -> Iterator[Token] | None
Try to classify content as the start of an HTML block.
CommonMark 4.6 defines 7 types of HTML blocks, each with its own start condition.
Parameters
| Name | Type | Description |
|---|---|---|
| content | str | Line content with leading whitespace stripped |
| line_start | int | Position in source where the line starts |
| full_line | str | The full line, including leading whitespace |
| indent | int | Number of leading spaces (for line_indent). Default: 0 |
Returns
Iterator[Token] | None
Iterator yielding an HTML_BLOCK token, or None if the line does not start an HTML block.
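As a rough illustration of the start conditions the classifier has to tell apart, the sketch below maps a stripped line to block types 1 through 6. It is not the mixin's actual code: `classify_start` and `BLOCK_TAGS` are hypothetical names, the tag set is a deliberately abbreviated subset of the spec's type 6 list, and type 7 is left to a separate complete-tag check.

```python
from __future__ import annotations

import re

# Abbreviated, illustrative subset of the type 6 tag names (the spec lists many more).
BLOCK_TAGS = {"address", "blockquote", "div", "h1", "p", "table", "ul"}

# Type 1: raw-text elements, followed by space, tab, ">" or end of line.
# (Assumption: "textarea" is included, as in recent spec revisions.)
_TYPE1 = re.compile(r"<(?:pre|script|style|textarea)(?:[ \t>]|$)", re.IGNORECASE)

# Type 6: "<" or "</", a tag name, then space, tab, ">", "/>" or end of line.
_TYPE6 = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)(?:[ \t]|/?>|$)")


def classify_start(content: str) -> int | None:
    """Return the HTML block type (1-6) that `content` opens, or None."""
    if not content.startswith("<"):
        return None
    if _TYPE1.match(content):
        return 1
    if content.startswith("<!--"):
        return 2  # comment
    if content.startswith("<?"):
        return 3  # processing instruction
    if content.startswith("<![CDATA["):
        return 5  # CDATA section
    if re.match(r"<![A-Za-z]", content):
        return 4  # declaration such as <!DOCTYPE ...>
    m = _TYPE6.match(content)
    if m and m.group(1).lower() in BLOCK_TAGS:
        return 6
    return None
```

With this sketch, `classify_start("<!DOCTYPE html>")` returns 4 and `classify_start("<span>")` returns None, because `span` is not a block-level tag and would only qualify under the type 7 rules.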
_extract_html_tag_name
def _extract_html_tag_name(self, content: str) -> str | None
Extract tag name from HTML opening or closing tag.
Parameters
| Name | Type | Description |
|---|---|---|
| content | str | Line content starting with < |
Returns
str | None
Tag name if found, None otherwise.
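One plausible way to do this extraction is a single anchored regular expression. The sketch below is an assumption about the approach, not the library's code; `extract_tag_name` is a hypothetical standalone function, and lower-casing the result is a choice made here for convenience.

```python
from __future__ import annotations

import re

# A tag name is an ASCII letter followed by letters, digits, or hyphens,
# optionally preceded by "/" for a closing tag.
_TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9-]*)")


def extract_tag_name(content: str) -> str | None:
    """Return the lower-cased tag name at the start of `content`, or None."""
    m = _TAG_NAME.match(content)
    return m.group(1).lower() if m else None
```

For example, `extract_tag_name("</Section>")` gives `"section"`, while a comment opener such as `"<!-- c -->"` gives None because `!` cannot start a tag name.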
_is_complete_html_tag
def _is_complete_html_tag(self, content: str) -> bool
Check if content is a complete single HTML open/close tag.
Type 7 HTML blocks require a SINGLE complete tag that is the only content on the line.
This means:
- The tag name must also NOT be one of the type 6 block-level tags.
- The content must not match autolinks like http://... or email@domain.
CommonMark strict attribute validation:
- Attribute name: [a-zA-Z_:][a-zA-Z0-9_.:-]*
- Attribute value: unquoted (no special chars), 'single', or "double" quoted
- Space required between attributes (but not after final attribute before > or />)
Parameters
| Name | Type | Description |
|---|---|---|
| content | str | Line content |
Returns
bool
True if this is a complete HTML tag.
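The rules above can be approximated with two anchored patterns, one for open tags and one for closing tags. This is a minimal sketch under assumptions, not the mixin's implementation: `is_complete_tag`, `EXCLUDED`, and the `block_tags` parameter are hypothetical names, and the caller is assumed to pass in whatever set of type 6 tag names it uses.

```python
from __future__ import annotations

import re

# One attribute: mandatory leading whitespace, a name, and an optional value
# that is unquoted, 'single' quoted, or "double" quoted.
_ATTR = r"""\s+[A-Za-z_:][A-Za-z0-9_.:-]*(?:\s*=\s*(?:[^\s"'=<>`]+|'[^']*'|"[^"]*"))?"""

# A whole open or closing tag, followed only by spaces or tabs.
_OPEN = re.compile(rf"<([A-Za-z][A-Za-z0-9-]*)(?:{_ATTR})*\s*/?>[ \t]*$")
_CLOSE = re.compile(r"</([A-Za-z][A-Za-z0-9-]*)\s*>[ \t]*$")

# Raw-text elements that the spec excludes from type 7.
EXCLUDED = {"pre", "script", "style", "textarea"}


def is_complete_tag(content: str, block_tags: frozenset[str] = frozenset()) -> bool:
    """Return True if `content` is exactly one complete open or closing tag."""
    m = _OPEN.match(content) or _CLOSE.match(content)
    if m is None:
        return False  # trailing text, malformed attributes, or an autolink
    name = m.group(1).lower()
    return name not in EXCLUDED and name not in block_tags
```

Autolinks fall out of the pattern naturally: in `<http://example.com>` the character after the tag name is `:`, which is neither whitespace nor `>`, so the match fails. Likewise `<a b='1'c='2'>` is rejected because every attribute must be preceded by whitespace.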
_validate_html_attributes
def _validate_html_attributes(self, attrs_str: str) -> bool
Validate HTML attribute string per CommonMark spec.
Parameters
| Name | Type | Description |
|---|---|---|
| attrs_str | str | The portion after tag name and before > (without leading |
Returns
bool
True if attributes are valid per CommonMark 6.8.
_emit_html_block
def _emit_html_block(self) -> Iterator[Token]
Emit accumulated HTML block as a single token.
Returns
Iterator[Token]
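To make the accumulate-then-emit idea concrete, here is a small standalone sketch. Everything in it is an assumption for illustration: `Token` and `HtmlBlockState` are hypothetical stand-ins for the library's own types, the fields loosely mirror `_html_block_content` and `_html_block_start`, and the buffered lines are assumed to keep their trailing newlines so they can be joined directly.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Token:
    """Hypothetical stand-in for the lexer's token type."""
    type: str
    value: str
    start: int


@dataclass
class HtmlBlockState:
    """Hypothetical holder for the lines buffered while inside an HTML block."""
    content: list[str] = field(default_factory=list)
    start: int = 0

    def emit(self) -> Iterator[Token]:
        """Yield the buffered lines as one HTML_BLOCK token, then reset."""
        if not self.content:
            return  # nothing accumulated, so nothing to emit
        token = Token("HTML_BLOCK", "".join(self.content), self.start)
        self.content = []  # reset before yielding so the state is clean
        yield token
```

Resetting the buffer before the `yield` means the state is already clean when the consumer receives the token, so a second call to `emit()` produces nothing.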