Workers

Worker Modes

Pounce automatically selects the worker mode based on the Python runtime:

Runtime	Workers Are	Detection
Python 3.14t (free-threading)	Threads	`sys._is_gil_enabled()` returns `False`
Standard CPython (GIL)	Processes	`sys._is_gil_enabled()` returns `True`

You don't need to configure this — the supervisor detects it at startup.
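The detection described in the table can be sketched in a few lines. This is a minimal illustration, not Pounce's actual code: the function name `select_worker_mode` is hypothetical, and the fallback for interpreters that lack `sys._is_gil_enabled()` (the function was added in Python 3.13) is an assumption.

```python
import sys


def select_worker_mode() -> str:
    """Return "threads" on a free-threaded build, "processes" otherwise."""
    # sys._is_gil_enabled() exists on Python 3.13+. Older interpreters
    # always have a GIL, so a missing attribute is treated as "GIL enabled"
    # (an assumption made for this sketch).
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)
    return "processes" if is_gil_enabled() else "threads"


print(select_worker_mode())
```

Running this under a standard CPython build and under a free-threaded build is a quick way to see which mode a given interpreter would get.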

Configuring Worker Count

# Single worker (no supervisor, lowest overhead)
pounce myapp:app --workers 1

# Auto-detect from CPU cores
pounce myapp:app --workers 0

# Explicit count
pounce myapp:app --workers 4

Single Worker (workers=1)

The server skips the supervisor entirely and runs the worker directly in the main thread. This is the lowest-overhead mode — no supervisor thread, no health monitoring. Good for development and low-traffic services.

Auto-Detect (workers=0)

Calls `os.cpu_count()` and uses the result as the worker count (minimum 1). On an 8-core machine, this creates 8 workers.
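The auto-detect rule can be written out as a small helper. This is a sketch under stated assumptions: `resolve_worker_count` is a hypothetical name, and rejecting negative values is an assumption rather than documented Pounce behavior.

```python
import os


def resolve_worker_count(workers: int) -> int:
    """Map a --workers value to a concrete worker count."""
    if workers < 0:
        # Assumption: negative values are treated as invalid input.
        raise ValueError("workers must be 0 (auto-detect) or a positive integer")
    if workers == 0:
        # os.cpu_count() may return None when the count cannot be
        # determined, so fall back to one worker (the documented minimum).
        return max(1, os.cpu_count() or 1)
    return workers


print(resolve_worker_count(0))  # number of CPU cores, at least 1
print(resolve_worker_count(4))  # explicit counts pass through unchanged
```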

Multi-Worker (workers=2+)

The supervisor spawns N workers and monitors their health. Workers that crash are automatically restarted (max 5 restarts per 60-second window).

Thread vs Process Workers

flowchart LR
    subgraph ft ["Free-Threading (3.14t)"]
        direction TB
        P1["1 Process"] --> T1["Thread 1\nevent loop"]
        P1 --> T2["Thread 2\nevent loop"]
        P1 --> TN["Thread N\nevent loop"]
        T1 ~~~ Mem1["Shared Memory\nconfig, app, routes"]
        T2 ~~~ Mem1
        TN ~~~ Mem1
    end
    subgraph gil ["GIL Build"]
        direction TB
        Sup["Supervisor"] --> PR1["Process 1\nevent loop"]
        Sup --> PR2["Process 2\nevent loop"]
        Sup --> PRN["Process N\nevent loop"]
        PR1 ~~~ Mem2["Isolated Memory ×N"]
    end

Thread Workers (Free-Threading)

On Python 3.14t, each worker is a thread with its own asyncio event loop:

Shared memory — All workers see the same `ServerConfig`, app reference, and route tables
No IPC — Threads communicate through shared memory, not pipes
Lower memory — One copy of the application, not N copies
SO_REUSEPORT — Kernel distributes incoming connections across workers
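The thread-worker model above (one event loop per thread, one shared copy of state) can be illustrated with the standard library alone. This is a simplified sketch, not Pounce's implementation: `SHARED_CONFIG`, `run_thread_workers`, and the `serve` coroutine are hypothetical stand-ins, and no sockets are opened.

```python
import asyncio
import threading

# Hypothetical stand-in for shared state such as the server config.
SHARED_CONFIG = {"host": "127.0.0.1", "port": 8000}


def run_thread_workers(count: int, config: dict) -> dict:
    """Start `count` threads, each running its own asyncio event loop.

    Returns a mapping of worker id -> (event loop, id of the config object
    that worker saw).
    """
    results = {}
    lock = threading.Lock()

    async def serve(worker_id: int) -> None:
        loop = asyncio.get_running_loop()
        with lock:
            results[worker_id] = (loop, id(config))
        await asyncio.sleep(0)  # placeholder for accepting connections

    def worker(worker_id: int) -> None:
        # asyncio.run() creates a fresh event loop owned by this thread.
        asyncio.run(serve(worker_id))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


workers = run_thread_workers(3, SHARED_CONFIG)
print(len({id(loop) for loop, _ in workers.values()}))  # distinct loops
print(len({cfg_id for _, cfg_id in workers.values()}))  # shared config objects
```

Each worker reports a different event loop but the same config object, which is the property that removes the need for IPC and keeps memory use at one copy of the application.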

Process Workers (GIL)

On GIL builds, each worker is a separate process:

Isolated memory — Each process has its own copy of everything
Fork-based — Similar to Uvicorn/Gunicorn multi-process model
Higher memory — N copies of the application

Tuning Guidelines

Workload	Recommended Workers
CPU-bound (computation)	`os.cpu_count()` (workers=0)
I/O-bound (database, network)	`os.cpu_count() * 2`
Development	`1` (single worker)
Mixed	`os.cpu_count()` (start here, tune)

Tip

Start with `--workers 0` (auto-detect) and monitor. Adjust based on actual CPU utilization and latency metrics.
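The tuning table translates directly into a lookup, which can be handy when generating a `--workers` value in a deployment script. The helper name `recommended_workers` and the workload labels (`"cpu"`, `"io"`, `"dev"`, `"mixed"`) are hypothetical and chosen only for this sketch; the values are starting points to tune, as the table says.

```python
import os


def recommended_workers(workload: str) -> int:
    """Return a starting worker count for a workload type."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    table = {
        "cpu": cores,        # CPU-bound: one worker per core
        "io": cores * 2,     # I/O-bound: oversubscribe to hide wait time
        "dev": 1,            # development: single worker
        "mixed": cores,      # mixed: start at core count, then tune
    }
    if workload not in table:
        raise ValueError(f"unknown workload type: {workload!r}")
    return table[workload]


for kind in ("cpu", "io", "dev", "mixed"):
    print(kind, recommended_workers(kind))
```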

Health Monitoring

The supervisor runs a health-check loop for multi-worker mode:

Workers that crash are restarted automatically
Maximum 5 restarts per 60-second window (prevents restart loops)
Graceful shutdown signals all workers before exit
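The restart limit above behaves like a sliding-window budget. The following is one plausible way to implement such a limit, not Pounce's actual supervisor code: the `RestartBudget` class, its method names, and the injectable `clock` parameter are all hypothetical. What the supervisor does once the budget is exhausted is not specified here, so the sketch only reports whether a restart is allowed.

```python
import time
from collections import deque


class RestartBudget:
    """Allow at most `max_restarts` restarts per `window_seconds`."""

    def __init__(self, max_restarts=5, window_seconds=60.0, clock=time.monotonic):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._clock = clock
        self._restarts = deque()  # timestamps of recent restarts

    def try_restart(self) -> bool:
        """Record a restart and return True, or return False if over budget."""
        now = self._clock()
        # Drop restart timestamps that have aged out of the window.
        while self._restarts and now - self._restarts[0] >= self.window_seconds:
            self._restarts.popleft()
        if len(self._restarts) >= self.max_restarts:
            return False
        self._restarts.append(now)
        return True


budget = RestartBudget()
allowed = [budget.try_restart() for _ in range(6)]
print(allowed)  # the sixth attempt inside the same window is refused
```

A monotonic clock is used so that wall-clock adjustments cannot widen or shrink the window.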
