Pounce handles three lifecycle signals for zero-downtime operations: SIGHUP (reload), SIGUSR1 (hot deploy), and SIGTERM (shutdown). All three drain in-flight requests before old workers exit; the two reload signals also start replacement workers first.
Graceful Reload (SIGHUP)
Send SIGHUP to perform a rolling restart with fresh code:
kill -HUP <pid>
# or with systemd:
systemctl reload pounce
What Happens
- Old workers continue handling existing requests
- App code is reimported and new workers spawn (generation N+1)
- Old workers enter drain mode (finish active requests, reject new ones)
- Once drained (or after reload_timeout), old workers exit
Time 0s: [Worker-0] [Worker-1] [Worker-2] [Worker-3] (Gen 0)
SIGHUP received
Time 0.1s: [Worker-0..3 draining] [Worker-4..7 accepting] (Gen 0+1)
Time 5s: [Worker-4] [Worker-5] [Worker-6] [Worker-7] (Gen 1 only)
If the reimport fails, pounce logs the error and continues with the old code -- no downtime from bad deploys.
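The fallback on a failed reimport can be sketched with the standard library's importlib. This is a minimal illustration, not pounce's actual implementation; `load_app` and its `current` parameter are hypothetical names chosen for the example.

```python
import importlib
import logging
import sys

log = logging.getLogger("reload-sketch")


def load_app(target: str, current=None):
    """Import "module:attr" fresh; on any failure return `current` instead.

    Hypothetical helper for illustration, not part of pounce's API.
    """
    module_name, _, attr = target.partition(":")
    try:
        if module_name in sys.modules:
            # Already imported once: re-execute the module's source.
            module = importlib.reload(sys.modules[module_name])
        else:
            module = importlib.import_module(module_name)
        return getattr(module, attr)
    except Exception:
        # Bad deploy: log the error and keep serving the previous version.
        log.exception("reimport of %s failed; keeping previous app", target)
        return current
```

A supervisor would call `app = load_app("myapp:app", current=app)` on each reload signal and only spawn new workers when the returned object is the freshly imported one.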
Configuration
config = ServerConfig(
reload_timeout=60.0, # Max drain time (default: 30s)
workers=4,
)
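To make the role of reload_timeout concrete, here is a rough sketch of a drain step: stop admitting requests, then wait until in-flight work reaches zero or the timeout expires. `DrainTracker` is a hypothetical class for illustration and does not reflect pounce's internals.

```python
import threading


class DrainTracker:
    """Counts in-flight requests and lets a supervisor wait for zero."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._active = 0
        self._draining = False

    def try_begin(self) -> bool:
        """Register a new request; refused once draining has started."""
        with self._cond:
            if self._draining:
                return False
            self._active += 1
            return True

    def end(self) -> None:
        """Mark one request as finished and wake any waiting drain()."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def drain(self, timeout: float) -> bool:
        """Stop admitting requests, then wait until idle or timeout.

        Returns True if the worker drained cleanly, False if the timeout
        (the analogue of reload_timeout) expired first.
        """
        with self._cond:
            self._draining = True
            return self._cond.wait_for(lambda: self._active == 0, timeout)
```

When `drain()` returns False, the caller would force the old worker to exit, which corresponds to the "or after reload_timeout" case in the list above.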
systemd
[Service]
Type=notify
ExecStart=/usr/bin/pounce serve myapp:app --workers=4
ExecReload=/bin/kill -HUP $MAINPID
Hot Deploy (SIGUSR1)
SIGUSR1 triggers the same rolling restart as SIGHUP. Use whichever signal fits your deployment tooling.
kill -USR1 <pid>
On Linux with SO_REUSEPORT, old and new workers bind to the same port simultaneously. On macOS and Windows, where SO_REUSEPORT is either missing or does not balance connections across listeners, the AcceptDistributor handles the handoff via a shared queue.
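The Linux behaviour relies on a plain socket option, which can be tried directly from Python. The sketch below is an illustration of SO_REUSEPORT itself, not of pounce's listener code; `bind_reuseport` is a hypothetical helper, and `socket.SO_REUSEPORT` is not defined on every platform (it is absent on Windows).

```python
import socket


def bind_reuseport(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Return a listening TCP socket with SO_REUSEPORT set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind(), and on every socket sharing the port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        sock.listen(128)
    except Exception:
        sock.close()
        raise
    return sock
```

Calling `bind_reuseport(8000)` twice on Linux yields two independent listeners on the same port, which is what lets a new generation start accepting before the old one has closed its socket.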
File Watching (Development)
For development, enable auto-reload on file changes:
config = ServerConfig(
reload=True,
reload_include=(".html", ".css"), # Extra extensions
reload_dirs=("templates",), # Extra directories
)
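For intuition, file watching can be approximated by comparing modification-time snapshots. This is a simplified polling sketch; it assumes nothing about how pounce actually detects changes, and `snapshot` and `changed_files` are hypothetical helpers.

```python
from pathlib import Path


def snapshot(dirs, extensions):
    """Map each matching file under `dirs` to its modification time (ns)."""
    state = {}
    for root in dirs:
        for path in Path(root).rglob("*"):
            if path.is_file() and path.suffix in extensions:
                state[str(path)] = path.stat().st_mtime_ns
    return state


def changed_files(before, after):
    """Return paths added, removed, or modified between two snapshots."""
    paths = set(before) | set(after)
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

A development loop would take a snapshot of `("templates",)` with `(".html", ".css")` every second or so and trigger a reload whenever `changed_files` returns a non-empty list.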
Graceful Shutdown (SIGTERM)
On SIGTERM or SIGINT, pounce drains connections then exits:
- Stops accepting new connections immediately
- Finishes active requests (up to shutdown_timeout)
- Force-terminates workers that exceed the timeout
- Exits with status 0
config = ServerConfig(
shutdown_timeout=30.0, # Per-worker drain time (default: 10s)
)
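The first step, reacting to SIGTERM or SIGINT, can be sketched with the standard signal module. The snippet is a generic pattern rather than pounce's own handler, and all names in it are hypothetical. The handler only records the request; draining happens in the main loop, outside signal context.

```python
import signal

# Mutable flag shared between the handler and the main loop.
_state = {"shutdown": False}


def _request_shutdown(signum, frame):
    # Keep the handler tiny: record the request and return.
    _state["shutdown"] = True


def install_shutdown_handlers() -> None:
    """Route SIGTERM and SIGINT to the same graceful-shutdown flag."""
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, _request_shutdown)


def shutdown_requested() -> bool:
    """True once either signal has been received."""
    return _state["shutdown"]
```

A serving loop would check `shutdown_requested()` between accepts, stop accepting once it turns true, drain for up to shutdown_timeout, and then exit with status 0. Note that `signal.signal` must be called from the main thread.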
Kubernetes
spec:
containers:
- name: app
lifecycle:
preStop:
exec:
command: ["sh", "-c", "sleep 5"] # LB de-registration delay
readinessProbe:
httpGet:
path: /health
port: 8000
terminationGracePeriodSeconds: 40 # > shutdown_timeout + preStop
Key: terminationGracePeriodSeconds must exceed shutdown_timeout + preStop delay, or Kubernetes sends SIGKILL before drain completes.
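The rule is simple arithmetic and can be checked before deploying. The helper below is a hypothetical sanity check using the values from the examples above (a 5-second preStop sleep, shutdown_timeout=30.0, and terminationGracePeriodSeconds: 40).

```python
def grace_period_ok(termination_grace: float,
                    shutdown_timeout: float,
                    pre_stop_delay: float) -> bool:
    """True if Kubernetes waits longer than preStop plus the drain window."""
    return termination_grace > shutdown_timeout + pre_stop_delay


# Values from the examples above: 40 > 30 + 5, so the drain can finish.
print(grace_period_ok(40, 30.0, 5))  # True
# With the Kubernetes default of 30 seconds, SIGKILL would arrive too early.
print(grace_period_ok(30, 30.0, 5))  # False
```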
Docker
Use exec form so signals reach pounce directly:
CMD ["pounce", "serve", "myapp:app", "--host", "0.0.0.0"]
systemd
[Service]
Type=notify
KillSignal=SIGTERM
KillMode=mixed
TimeoutStopSec=40s
Thread Mode vs Process Mode
| | Thread Mode (3.14t) | Process Mode (GIL) |
|---|---|---|
| Reload | True zero-downtime (old + new overlap) | Brief downtime (~100-500ms) |
| Shutdown | Drain per-thread | Drain per-process |
| Recommendation | Production | Acceptable for dev |
Thread mode requires Python 3.14t (free-threading). Process mode falls back to stop-all-then-start.
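To check which mode an interpreter could support, you can ask Python whether it is a free-threaded build with the GIL actually disabled. This is only a sketch of one possible check; how pounce itself selects the mode is not shown here, and `free_threading_active` and `pick_mode` are hypothetical names.

```python
import sys
import sysconfig


def free_threading_active() -> bool:
    """True on Python 3.14+ free-threaded builds running with the GIL off."""
    if sys.version_info < (3, 14):
        return False
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on 3.13+; assume the GIL is on if missing.
    gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    return built_free_threaded and not gil_enabled


def pick_mode() -> str:
    """Return "thread" when free-threading is usable, else "process"."""
    return "thread" if free_threading_active() else "process"
```

The runtime check matters because a free-threaded build can still re-enable the GIL, for example when an extension module that does not support free-threading is imported.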
Troubleshooting
Workers not draining: Increase reload_timeout or shutdown_timeout. Check for long-lived connections (WebSocket, streaming). Set request_timeout to cap individual requests.
SIGKILL before drain complete (Kubernetes): Increase terminationGracePeriodSeconds to exceed shutdown_timeout + preStop delay.
Module reload failures: Pounce logs the import error and continues with the previous version.