# workers

URL: /kida/api/utils/workers/
Section: utils
Description: Worker pool auto-tuning utilities for free-threaded Python.

Provides workload-aware worker count calculation for ThreadPoolExecutor usage.
Calibrated for Python 3.14t (free-threading) where CPU-bound template rendering
can achieve true parallelism without GIL contention.

Key Features:
- Environment detection (CI vs local vs production)
- Free-threading detection (GIL status)
- Workload type profiles calibrated for no-GIL execution
- Template complexity estimation for optimal scheduling

Example:
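    >>> from concurrent.futures import ThreadPoolExecutor
    >>> from kida.utils.workers import get_optimal_workers, should_parallelize
    >>> # `template` is assumed to be an already-loaded Kida template
    >>> contexts = [{"name": f"User {i}"} for i in range(100)]
    >>> if should_parallelize(len(contexts)):
    ...     workers = get_optimal_workers(len(contexts))
    ...     with ThreadPoolExecutor(max_workers=workers) as executor:
    ...         results = list(executor.map(template.render, contexts))
    ... else:
    ...     results = [template.render(ctx) for ctx in contexts]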

Note:
Profiles are calibrated for free-threaded Python (3.14t+).
On GIL-enabled Python, threads cannot run CPU-bound rendering in parallel, so extra workers yield little speedup.

---


## Classes

### `WorkloadType`
Workload characteristics for auto-tuning.

On free-threaded Python, CPU-bound work can now parallelize effectively.
This changes optimal worker counts compared to GIL-enabled Python.

#### Attributes

| Name | Type | Description |
| --- | --- | --- |
| `RENDER` | - | Template rendering (CPU-bound, string operations). Primary workload for Kida. Benefits significantly from free-threading. |
| `COMPILE` | - | Template compilation/parsing (CPU-bound, AST operations). Moderate parallelism benefit due to shared cache access. |
| `IO_BOUND` | - | File loading, network operations. Can use more workers as threads wait on I/O. |

### `Environment`

Execution environment for tuning profiles.

#### Attributes

| Name | Type | Description |
| --- | --- | --- |
| `CI` | - | Constrained CI runner (typically 2-4 vCPU). Use minimal workers to avoid resource contention. |
| `LOCAL` | - | Developer machine (typically 8-16 cores). Use moderate workers for good performance. |
| `PRODUCTION` | - | Server deployment (16+ cores). Can use more workers for high throughput. |

### `WorkloadProfile`

Tuning profile for a workload type.

#### Attributes

| Name | Type | Description |
| --- | --- | --- |
| `parallel_threshold` | `int` | Minimum tasks before parallelizing. Below this, thread overhead exceeds benefit. |
| `min_workers` | `int` | Floor for worker count. |
| `max_workers` | `int` | Ceiling for worker count. |
| `cpu_fraction` | `float` | Fraction of cores to use (0.0-1.0). |
| `free_threading_multiplier` | `float` | Extra scaling when GIL is disabled. |
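
To make the attributes concrete, here is a minimal sketch of how a profile like this might be turned into a worker budget. `ProfileSketch`, `worker_budget`, and the numeric values are hypothetical stand-ins, not Kida's implementation. The assumption is that the core count is scaled by `cpu_fraction`, scaled again by `free_threading_multiplier` when the GIL is disabled, and then clamped to the `min_workers`/`max_workers` range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical mirror of the WorkloadProfile attributes listed above."""

    parallel_threshold: int
    min_workers: int
    max_workers: int
    cpu_fraction: float
    free_threading_multiplier: float


def worker_budget(profile: ProfileSketch, cpu_count: int, gil_disabled: bool) -> int:
    """Scale the core count by the profile, then clamp to its floor and ceiling."""
    scaled = cpu_count * profile.cpu_fraction
    if gil_disabled:
        # Assumption: the multiplier applies only when the GIL is off.
        scaled *= profile.free_threading_multiplier
    return max(profile.min_workers, min(profile.max_workers, int(scaled)))


# Illustrative values only; the real profiles are not shown on this page.
example = ProfileSketch(
    parallel_threshold=10,
    min_workers=2,
    max_workers=8,
    cpu_fraction=0.5,
    free_threading_multiplier=1.5,
)
budget_with_gil = worker_budget(example, cpu_count=8, gil_disabled=False)
budget_without_gil = worker_budget(example, cpu_count=8, gil_disabled=True)
```

The real profiles can be inspected with `get_profile`, documented further down this page.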

## Functions

### `is_free_threading_enabled`

Check if Python is running with the GIL disabled.

`def is_free_threading_enabled() -> bool`

##### Returns

`bool`
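
This page does not show how the check is implemented. One way to perform it with the standard library alone, assuming CPython, is `sys._is_gil_enabled()`, which is available from Python 3.13. The names `gil_disabled` and `built_free_threaded` below are hypothetical and used only for illustration.

```python
import sys
import sysconfig


def gil_disabled() -> bool:
    """Return True when the running interpreter has the GIL turned off."""
    # sys._is_gil_enabled() exists on CPython 3.13+. Interpreters without it
    # are treated as GIL-enabled.
    probe = getattr(sys, "_is_gil_enabled", None)
    if probe is None:
        return False
    return not probe()


def built_free_threaded() -> bool:
    """Return True when the interpreter was built with free-threading support."""
    # Py_GIL_DISABLED is set on free-threaded builds; it may be 0 or None elsewhere.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


status = "disabled" if gil_disabled() else "enabled"
```

The two checks differ: a free-threaded build can still run with the GIL re-enabled at runtime, which is why the runtime probe is the more direct test.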

### `detect_environment`

`def detect_environment() -> Environment`

Auto-detect execution environment for tuning.

Detection order:

1. Explicit `KIDA_ENV` environment variable
2. CI environment variables (GitHub Actions, GitLab CI, etc.)
3. Default to `LOCAL`

##### Returns

`Environment`

### `get_optimal_workers`

`def get_optimal_workers(task_count: int, *, workload_type: WorkloadType = WorkloadType.RENDER, environment: Environment | None = None, config_override: int | None = None, task_weight: float = 1.0) -> int`

Calculate optimal worker count based on workload characteristics.

Auto-scales based on:

- Workload type (render vs compile vs I/O)

- Environment (CI vs local vs production)

- Free-threading status (GIL enabled/disabled)

- Available CPU cores (fraction based on workload)

- Task count (no point having more workers than tasks)

- Optional task weight for heavy/light work estimation

##### Parameters

| Name | Type | Description |
| --- | --- | --- |
| `task_count` | `int` | Number of tasks to process (e.g., contexts to render) |
| `workload_type` | `WorkloadType` | Type of work (RENDER, COMPILE, IO_BOUND). Default: `WorkloadType.RENDER` |
| `environment` | `Environment \| None` | Execution environment (auto-detected if None). Default: `None` |
| `config_override` | `int \| None` | User-configured value (bypasses auto-tune if > 0). Default: `None` |
| `task_weight` | `float` | Multiplier for task count (>1 for heavy templates). Default: `1.0` |

##### Returns

`int`

### `should_parallelize`

`def should_parallelize(task_count: int, *, workload_type: WorkloadType = WorkloadType.RENDER, environment: Environment | None = None, total_work_estimate: int | None = None) -> bool`

Determine if parallelization is worthwhile for this workload.

Thread pool overhead (~1-2ms per task) pays off only once the task count reaches the profile's `parallel_threshold`.
This function helps small workloads avoid that overhead.

##### Parameters

| Name | Type | Description |
| --- | --- | --- |
| `task_count` | `int` | Number of tasks to process |
| `workload_type` | `WorkloadType` | Type of work. Default: `WorkloadType.RENDER` |
| `environment` | `Environment \| None` | Execution environment (auto-detected if None). Default: `None` |
| `total_work_estimate` | `int \| None` | Optional size estimate (bytes of template output). Default: `None` |

##### Returns

`bool`

### `estimate_template_weight`

`def estimate_template_weight(template: Template) -> float`

Estimate relative complexity of a template for worker scheduling.

Heavy templates (many blocks, macros, filters) get higher weights,
so they can be scheduled earlier to avoid the straggler effect.

Weight factors:

- Source size: +0.5 per 5KB above 5KB threshold

- Block count: +0.1 per block above 3

- Macro count: +0.2 per macro

- Inheritance: +0.5 if extends another template

##### Parameters

| Name | Type | Description |
| --- | --- | --- |
| `template` | `Template` | Template instance to estimate |

##### Returns

`float`
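
The weight factors listed above can be written out as plain arithmetic. The sketch below works on raw numbers rather than a `Template` object, and several details are assumptions: a baseline weight of 1.0, "5KB" meaning 5 * 1024 bytes, and the size bonus being counted in whole 5 KB steps. `sketch_template_weight` is a hypothetical name.

```python
FIVE_KB = 5 * 1024  # assumption: "5KB" means 5 * 1024 bytes


def sketch_template_weight(
    source_bytes: int,
    block_count: int,
    macro_count: int,
    extends_parent: bool,
) -> float:
    """Combine the documented weight factors into a single number."""
    weight = 1.0  # assumption: baseline weight for a trivial template

    # Source size: +0.5 per whole 5 KB step above the 5 KB threshold.
    extra_steps = max(0, source_bytes - FIVE_KB) // FIVE_KB
    weight += 0.5 * extra_steps

    # Block count: +0.1 per block above 3.
    weight += 0.1 * max(0, block_count - 3)

    # Macro count: +0.2 per macro.
    weight += 0.2 * macro_count

    # Inheritance: +0.5 if the template extends another one.
    if extends_parent:
        weight += 0.5

    return weight


light = sketch_template_weight(1_000, block_count=2, macro_count=0, extends_parent=False)
heavy = sketch_template_weight(15 * 1024, block_count=5, macro_count=2, extends_parent=True)
```

Under these assumptions the light template keeps the baseline weight, while the heavy one picks up a bonus from every factor.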

### `order_by_complexity`

`def order_by_complexity(templates: list[Template], *, descending: bool = True) -> list[Template]`

Order templates by estimated complexity for optimal worker utilization.

Scheduling heavy templates first reduces the "straggler effect" where
one slow render delays overall completion.

##### Parameters

| Name | Type | Description |
| --- | --- | --- |
| `templates` | `list[Template]` | List of templates to order |
| `descending` | `bool` | If True, heaviest first (default for parallel execution). Default: `True` |

##### Returns

`list[Template]`
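
Ordering by weight amounts to a sort on an estimated cost. The generic sketch below shows the idea on `(name, weight)` pairs instead of real templates; `order_heaviest_first` and the sample data are hypothetical, and the weight function stands in for `estimate_template_weight`.

```python
from collections.abc import Callable, Sequence
from typing import TypeVar

T = TypeVar("T")


def order_heaviest_first(
    items: Sequence[T],
    weight: Callable[[T], float],
    *,
    descending: bool = True,
) -> list[T]:
    """Return a new list sorted by weight; heaviest first by default."""
    # sorted() is stable, so items with equal weights keep their input order.
    return sorted(items, key=weight, reverse=descending)


# Sample data: (template name, estimated weight).
jobs = [("footer", 1.0), ("dashboard", 3.1), ("article", 2.0)]
heaviest_first = [name for name, _ in order_heaviest_first(jobs, lambda job: job[1])]
lightest_first = [
    name for name, _ in order_heaviest_first(jobs, lambda job: job[1], descending=False)
]
```

Submitting the heaviest items first means the longest render starts as early as possible, so it is less likely to be the last task still running when the other workers are idle.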

### `get_profile`

Get the workload profile for inspection or testing.

`def get_profile(workload_type: WorkloadType, environment: Environment | None = None) -> WorkloadProfile`

##### Parameters

| Name | Type | Description |
| --- | --- | --- |
| `workload_type` | `WorkloadType` | Type of work |
| `environment` | `Environment \| None` | Execution environment (auto-detected if None). Default: `None` |

##### Returns

`WorkloadProfile`