# Workers

URL: /docs/deployment/workers/
Section: deployment
Tags: workers, threading, processes, parallelism

---

## Worker Modes

Pounce automatically selects the worker mode based on the Python runtime:

| Runtime | Workers Are | Detection |
| --- | --- | --- |
| Python 3.14t (free-threading) | Threads | `sys._is_gil_enabled()` returns `False` |
| Standard CPython (GIL) | Processes | `sys._is_gil_enabled()` returns `True` |

You don't need to configure this; the supervisor detects the runtime at startup.
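In sketch form, the detection boils down to a single check. The helper name `detect_worker_mode` below is illustrative, not part of Pounce's API; `sys._is_gil_enabled()` exists on CPython 3.13+, and older builds always have the GIL:

```python
import sys

def detect_worker_mode() -> str:
    # Hypothetical sketch of the startup check described above.
    # Builds without sys._is_gil_enabled() always run with the GIL,
    # so default to process workers there.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)
    return "processes" if is_gil_enabled() else "threads"
```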
## Configuring Worker Count

```bash
# Single worker (no supervisor, lowest overhead)
pounce myapp:app --workers 1

# Auto-detect from CPU cores
pounce myapp:app --workers 0

# Explicit count
pounce myapp:app --workers 4
```

### Single Worker (workers=1)

The server skips the supervisor entirely and runs the worker directly in the main thread. This is the lowest-overhead mode: no supervisor thread, no health monitoring. Good for development and low-traffic services.

### Auto-Detect (workers=0)

Auto-detect calls `os.cpu_count()` and uses the result as the worker count (minimum 1). On an 8-core machine, this creates 8 workers.
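The resolution rule is simple enough to state as code. This sketch (with the hypothetical name `resolve_worker_count`) mirrors the behavior described above:

```python
import os

def resolve_worker_count(workers: int) -> int:
    # Hypothetical helper illustrating the --workers resolution rule.
    if workers == 0:
        # Auto-detect: one worker per CPU core, never fewer than one.
        # os.cpu_count() can return None, hence the fallback.
        return max(1, os.cpu_count() or 1)
    return workers
```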
### Multi-Worker (workers=2+)

The supervisor spawns N workers and monitors their health. Workers that crash are restarted automatically, with at most 5 restarts per 60-second window (see Health Monitoring below).

## Thread vs Process Workers

```mermaid
flowchart LR
    subgraph ft ["Free-Threading (3.14t)"]
        direction TB
        P1["1 Process"] --> T1["Thread 1\nevent loop"]
        P1 --> T2["Thread 2\nevent loop"]
        P1 --> TN["Thread N\nevent loop"]
        T1 ~~~ Mem1["Shared Memory\nconfig, app, routes"]
        T2 ~~~ Mem1
        TN ~~~ Mem1
    end
    subgraph gil ["GIL Build"]
        direction TB
        Sup["Supervisor"] --> PR1["Process 1\nevent loop"]
        Sup --> PR2["Process 2\nevent loop"]
        Sup --> PRN["Process N\nevent loop"]
        PR1 ~~~ Mem2["Isolated Memory ×N"]
    end
```

### Thread Workers (Free-Threading)

On Python 3.14t, each worker is a thread with its own asyncio event loop (sketched after this list):

- **Shared memory**: all workers see the same `ServerConfig`, app reference, and route tables
- **No IPC**: threads communicate through shared memory, not pipes
- **Lower memory**: one copy of the application, not N copies
- **SO_REUSEPORT**: the kernel distributes incoming connections across workers
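A minimal sketch of the thread-worker model, assuming a Linux/BSD kernel for `SO_REUSEPORT`; the helper names and the `serve` coroutine are illustrative, not Pounce's API:

```python
import asyncio
import socket
import threading

def make_reuseport_listener(host: str, port: int) -> socket.socket:
    # SO_REUSEPORT lets every worker bind the same address; the kernel
    # then spreads accepted connections across the listening sockets.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(1024)
    return sock

def start_thread_workers(n: int, serve) -> list[threading.Thread]:
    # Each worker is a plain thread running its own asyncio event loop
    # over the one shared application object: no pipes, no extra copies.
    threads = [
        threading.Thread(target=lambda: asyncio.run(serve()), name=f"worker-{i}")
        for i in range(n)
    ]
    for t in threads:
        t.start()
    return threads
```

Because each thread calls `asyncio.run()` itself, every worker owns a separate event loop while sharing the process's memory.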
### Process Workers (GIL)

On GIL builds, each worker is a separate process (see the sketch after this list):

- **Isolated memory**: each process has its own copy of everything
- **Fork-based**: similar to the Uvicorn/Gunicorn multi-process model
- **Higher memory**: N copies of the application
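Under the same assumptions (hypothetical helper, `serve` as the worker entry point), the process model looks roughly like this; note that the `fork` start method is POSIX-only:

```python
import multiprocessing as mp

def start_process_workers(n: int, serve) -> list[mp.Process]:
    # Each worker is a full OS process with its own interpreter and its
    # own copy of the application, so memory use scales roughly with N.
    ctx = mp.get_context("fork")  # fork-based, as in Uvicorn/Gunicorn
    procs = [ctx.Process(target=serve, name=f"worker-{i}") for i in range(n)]
    for p in procs:
        p.start()
    return procs
```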
## Tuning Guidelines

| Workload | Recommended Workers |
| --- | --- |
| CPU-bound (computation) | `os.cpu_count()` (`workers=0`) |
| I/O-bound (database, network) | `os.cpu_count() * 2` |
| Development | 1 (single worker) |
| Mixed | `os.cpu_count()` (start here, then tune) |

> **Tip:** Start with `--workers 0` (auto-detect) and monitor. Adjust based on actual CPU utilization and latency metrics.

## Health Monitoring

The supervisor runs a health-check loop in multi-worker mode (a sketch of the restart policy follows this list):

- Workers that crash are restarted automatically
- At most 5 restarts per 60-second window (prevents restart loops)
- Graceful shutdown signals all workers before exit
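The 5-restarts-per-60-seconds rule is a sliding-window budget. A minimal sketch, with the illustrative name `RestartBudget`:

```python
import time
from collections import deque

class RestartBudget:
    """Sketch of a restart-loop guard: allow at most max_restarts
    restarts inside a sliding window of window_s seconds."""

    def __init__(self, max_restarts: int = 5, window_s: float = 60.0):
        self.max_restarts = max_restarts
        self.window_s = window_s
        self._restarts: deque[float] = deque()

    def try_restart(self) -> bool:
        now = time.monotonic()
        # Drop restarts that have aged out of the window.
        while self._restarts and now - self._restarts[0] > self.window_s:
            self._restarts.popleft()
        if len(self._restarts) >= self.max_restarts:
            return False  # budget exhausted: stop restarting the worker
        self._restarts.append(now)
        return True
```

A supervisor would call `try_restart()` whenever a worker exits abnormally and give up (rather than loop) once it returns `False`.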
## See Also

- Thread Safety: shared state model
- Architecture: supervisor design
- Production: full production setup