Classes
TokenType
Semantic token types with Pygments-compatible CSS class names.
Each value is the CSS class suffix used by Pygments themes. This ensures drop-in compatibility with existing Pygments stylesheets.
Categories:
Keywords: KEYWORD, KEYWORD_CONSTANT, KEYWORD_DECLARATION, etc.
Names: NAME, NAME_FUNCTION, NAME_CLASS, NAME_BUILTIN, etc.
Literals: STRING, NUMBER, NUMBER_FLOAT, etc.
Operators: OPERATOR, OPERATOR_WORD
Punctuation: PUNCTUATION, PUNCTUATION_MARKER
Comments: COMMENT, COMMENT_SINGLE, COMMENT_MULTILINE, etc.
Generic: TEXT, WHITESPACE, ERROR (for diffs, errors, etc.)
Usage:
>>> from rosettes import TokenType
>>> TokenType.KEYWORD
<TokenType.KEYWORD: 'k'>
>>> TokenType.KEYWORD.value # CSS class suffix
'k'
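As a sketch of how the class suffix might be used when emitting HTML, the example below defines a small stand-in enum rather than importing rosettes. Only `KEYWORD = 'k'` is confirmed by the usage above; the `'nf'` value for `NAME_FUNCTION` is the short name Pygments uses for `Name.Function` and is assumed here, and `to_span` is a hypothetical helper, not part of the library.

```python
from enum import Enum
from html import escape


class TokenType(Enum):
    """Stand-in for rosettes.TokenType (illustration only)."""
    KEYWORD = "k"          # confirmed by the usage example above
    NAME_FUNCTION = "nf"   # assumed: Pygments' short name for Name.Function


def to_span(token_type: TokenType, text: str) -> str:
    """Hypothetical helper: wrap escaped text in a span whose class is the suffix."""
    return f'<span class="{token_type.value}">{escape(text)}</span>'


print(to_span(TokenType.KEYWORD, "def"))        # <span class="k">def</span>
print(to_span(TokenType.NAME_FUNCTION, "a<b"))  # <span class="nf">a&lt;b</span>
```

Because the class names match what Pygments themes expect, markup shaped like this could be styled by an existing Pygments stylesheet without changes.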
Token
Immutable token — thread-safe, minimal memory.
A Token represents a single lexical unit from source code. Tokens are immutable NamedTuples for thread-safety and memory efficiency.
Memory: Each Token uses ~64 bytes (NamedTuple overhead + references). A typical 100-line Python file produces ~500 tokens (~32KB).
Thread-Safety: Tokens are immutable and can be safely shared across threads. No defensive copying needed when passing tokens between workers.
Fast Path: When position info is not needed, use tokenize_fast(), which yields (TokenType, str) tuples instead of Token objects for a ~20% speedup.
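The immutability and sharing claims above can be illustrated with a stand-in `Token` defined locally as a NamedTuple with the four fields this class documents. The sketch does not import rosettes: the `type` field is a plain string to keep it self-contained, and `count_keywords` is a hypothetical worker function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple


class Token(NamedTuple):
    """Stand-in with the documented fields (illustration only)."""
    type: str      # the real class uses TokenType; a str keeps this sketch standalone
    value: str
    line: int
    column: int


tokens = (
    Token("k", "def", 1, 1),
    Token("nf", "main", 1, 5),
    Token("k", "return", 2, 5),
)

# NamedTuple fields are read-only: assignment raises AttributeError.
try:
    tokens[0].value = "class"
except AttributeError:
    mutated = False
else:
    mutated = True


def count_keywords(shared: tuple) -> int:
    """Hypothetical worker: reads the shared tokens without copying them."""
    return sum(1 for tok in shared if tok.type == "k")


# The same tuple of tokens is handed to every worker as-is.
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(count_keywords, [tokens] * 4))

print(mutated, counts)  # False [2, 2, 2, 2]
```

Since no worker can modify a token, there is nothing to lock and nothing to copy before handing the sequence to another thread.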
Attributes
| Name | Type | Description |
|---|---|---|
| type | TokenType | The semantic type of the token (e.g., TokenType.KEYWORD). |
| value | str | The actual text content of the token (e.g., "def"). |
| line | int | 1-based line number where token starts. |
| column | int | 1-based column number where token starts. |
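
Because both line and column are 1-based, the first character of a source string sits at line 1, column 1. A minimal sketch of that convention follows, using a hypothetical `position_of` helper that maps a 0-based string offset to a 1-based position. It assumes `\n` line endings and counts columns in characters, which may differ from how the library treats tabs or other line endings.

```python
def position_of(source: str, offset: int) -> tuple[int, int]:
    """Return the 1-based (line, column) of a 0-based offset in source."""
    if not 0 <= offset <= len(source):
        raise ValueError("offset out of range")
    # One line per newline before the offset, plus 1 for 1-based numbering.
    line = source.count("\n", 0, offset) + 1
    # Start of the current line; rfind returns -1 when there is no newline, giving 0.
    line_start = source.rfind("\n", 0, offset) + 1
    column = offset - line_start + 1
    return line, column


source = "def f():\n    return 1\n"
print(position_of(source, 0))                        # (1, 1)
print(position_of(source, source.index("return")))  # (2, 5)
```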