Pounce handles SIGHUP for graceful reload on supported multi-worker paths and SIGTERM / SIGINT for graceful shutdown. Both paths use connection draining: active requests get time to finish, while workers that are leaving service reject new connections.
Graceful Reload (SIGHUP)
On supported multi-worker thread and subinterpreter paths, send SIGHUP to perform a rolling restart with fresh code:
kill -HUP <pid>
# or with systemd:
systemctl reload pounce
What Happens
- Old workers continue handling existing requests
- App code is reimported and new workers spawn (generation N+1)
- Old workers enter drain mode (finish active requests, reject new ones)
- Once drained (or after reload_timeout), old workers exit
Time 0s: [Worker-0] [Worker-1] [Worker-2] [Worker-3] (Gen 0)
SIGHUP received
Time 0.1s: [Worker-0..3 draining] [Worker-4..7 accepting] (Gen 0+1)
Time 5s: [Worker-4] [Worker-5] [Worker-6] [Worker-7] (Gen 1 only)
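The generation swap in the timeline above can be modeled in a few lines. This is a toy sketch, not pounce's implementation: the Worker class, rolling_reload, and reap are hypothetical names invented for illustration, and request handling is reduced to a counter.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Toy stand-in for a server worker (hypothetical, not a pounce type)."""
    name: str
    generation: int
    active_requests: int = 0
    draining: bool = False

    def accepts_new(self) -> bool:
        # A draining worker finishes in-flight work but takes nothing new.
        return not self.draining


def rolling_reload(workers: list[Worker]) -> list[Worker]:
    """Spawn generation N+1 and mark every existing worker as draining."""
    next_gen = max(w.generation for w in workers) + 1
    start = len(workers)
    fresh = [Worker(f"Worker-{start + i}", next_gen) for i in range(len(workers))]
    for w in workers:
        w.draining = True
    return workers + fresh


def reap(workers: list[Worker], timed_out: bool = False) -> list[Worker]:
    """Drop draining workers that are idle, or all of them once the timeout hits."""
    return [
        w for w in workers
        if not w.draining or (w.active_requests > 0 and not timed_out)
    ]


# Worker-0 still has one request in flight when the reload signal arrives.
pool = [Worker(f"Worker-{i}", 0, active_requests=1 if i == 0 else 0) for i in range(4)]
pool = rolling_reload(pool)      # Gen 0 starts draining, Gen 1 accepts
pool = reap(pool)                # idle Gen 0 workers exit right away
still_draining = [w.name for w in pool if w.draining]
pool[0].active_requests = 0      # Worker-0 finishes its last request
pool = reap(pool)                # only Gen 1 remains
```

Passing timed_out=True to reap plays the role of reload_timeout expiring: draining workers are dropped even if they still hold requests.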
If the reimport fails, pounce logs the error and continues with the old code instead of swapping to the failed generation.
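The fallback described above follows a common pattern: attempt a fresh import and keep the previous module if anything goes wrong. The sketch below uses only the standard library. load_or_keep and the logger name are hypothetical, and pounce's actual reload code may differ.

```python
import importlib
import logging
import sys
from types import ModuleType
from typing import Optional

log = logging.getLogger("reload-sketch")  # hypothetical logger name


def load_or_keep(module_name: str, current: Optional[ModuleType]) -> Optional[ModuleType]:
    """Try a fresh import; on failure, keep the module that is already serving."""
    cached = sys.modules.pop(module_name, None)  # drop the cache to force a real re-import
    importlib.invalidate_caches()
    try:
        return importlib.import_module(module_name)
    except Exception:
        log.exception("reimport of %s failed; keeping previous code", module_name)
        if cached is not None:
            sys.modules[module_name] = cached  # restore the old module object
        return current
```

Only the named module is re-imported here. Submodules that were already loaded stay cached, which a real reloader would also need to handle.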
HTTP/3 uses a separate UDP/QUIC listener. Treat H3 reload/drain as limited until the protocol proof ledger records parity for that path.
Current subprocess proof covers SIGTERM clean exit and SIGHUP recovery to serving traffic. It does not yet prove mixed active-request drain behavior under load, so avoid describing reload as lossless across all modes and protocols.
Configuration
config = ServerConfig(
    reload_timeout=60.0,  # Max drain time (default: 30s)
    workers=4,
)
systemd
[Service]
Type=notify
ExecStart=/usr/bin/pounce serve myapp:app --workers=4
ExecReload=/bin/kill -HUP $MAINPID
File Watching (Development)
For development, enable auto-reload on file changes:
config = ServerConfig(
    reload=True,
    reload_include=(".html", ".css"),  # Extra extensions
    reload_dirs=("templates",),  # Extra directories
)
Graceful Shutdown (SIGTERM)
On SIGTERM or SIGINT, pounce drains connections then exits:
- Stops accepting new connections immediately
- Finishes active requests (up to shutdown_timeout)
- Force-terminates workers that exceed the timeout
- Exits with status 0
config = ServerConfig(
    shutdown_timeout=30.0,  # Per-worker drain time (default: 10s)
)
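The drain-then-force-terminate sequence can be sketched with asyncio. This is an illustration under stated assumptions, not pounce's code: in-flight requests are modeled as asyncio tasks, cancellation stands in for force-terminating a worker, and drain and demo are hypothetical names.

```python
import asyncio


async def drain(active: set[asyncio.Task], shutdown_timeout: float) -> tuple[int, int]:
    """Wait for in-flight request tasks, then cancel whatever is still running.

    Returns (finished, cancelled) counts.
    """
    if not active:
        return 0, 0
    done, pending = await asyncio.wait(active, timeout=shutdown_timeout)
    for task in pending:
        task.cancel()  # stand-in for force-terminating a stuck worker
    # Let cancelled tasks unwind so nothing is left dangling.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def demo() -> tuple[int, int]:
    async def request(seconds: float) -> None:
        await asyncio.sleep(seconds)

    # Two short requests and one that outlives the timeout.
    tasks = {asyncio.create_task(request(s)) for s in (0.01, 0.02, 5.0)}
    return await drain(tasks, shutdown_timeout=0.2)


result = asyncio.run(demo())
```

In the demo, the two short requests finish inside the timeout and the five-second request is cancelled, so shutdown takes roughly shutdown_timeout rather than the length of the slowest request.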
Kubernetes
spec:
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 5"]  # LB de-registration delay
      readinessProbe:
        httpGet:
          path: /health
          port: 8000
  terminationGracePeriodSeconds: 40  # > shutdown_timeout + preStop
Key: terminationGracePeriodSeconds must exceed shutdown_timeout + preStop delay, or Kubernetes sends SIGKILL before the drain completes.
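The timing rule is simple arithmetic, and a small check can catch a misconfigured manifest before deploy. The helpers below are hypothetical and not part of pounce or Kubernetes. The 5-second default margin is an assumption chosen for illustration.

```python
def min_termination_grace(shutdown_timeout: float, pre_stop_delay: float, margin: float = 5.0) -> float:
    """Suggest a terminationGracePeriodSeconds that leaves room for the full drain."""
    return shutdown_timeout + pre_stop_delay + margin


def is_safe(grace: float, shutdown_timeout: float, pre_stop_delay: float) -> bool:
    # The preStop hook runs inside the grace period, so the two durations add up.
    return grace > shutdown_timeout + pre_stop_delay
```

With the values used on this page (shutdown_timeout=30.0, a 5-second preStop sleep, and a 40-second grace period), is_safe(40, 30.0, 5) holds because 30 + 5 = 35 is below 40. A grace period of 30 would fail the check.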
Docker
Use exec form so signals reach pounce directly:
CMD ["pounce", "serve", "myapp:app", "--host", "0.0.0.0"]
systemd
[Service]
Type=notify
KillSignal=SIGTERM
KillMode=mixed
TimeoutStopSec=40s
Thread Mode vs Process Mode
| | Thread Mode (3.14t) | Process Mode (GIL) |
|---|---|---|
| Reload | Rolling generation swap with old + new overlap | Stop/start fallback may have a brief gap |
| Shutdown | Drain per-thread | Drain per-process |
| Recommendation | Production | Acceptable for dev |
Thread mode requires Python 3.14t (free-threading). Process mode falls back to stop-all-then-start. Subinterpreter reload is explicit and beta-scoped; validate dependency compatibility before relying on it for production deploys.
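Before relying on thread mode, it can help to confirm that the interpreter really is a free-threaded build. The sketch below uses standard-library checks only. The function names are hypothetical, and it says nothing about how pounce itself selects a mode.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    """True when this interpreter was compiled with the GIL made optional."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_is_active() -> bool:
    """True when the GIL is currently enabled (always True on older builds)."""
    check = getattr(sys, "_is_gil_enabled", None)  # added in Python 3.13
    return True if check is None else bool(check())
```

A free-threaded build can still run with the GIL re-enabled at runtime, so checking both values gives a fuller picture than either one alone.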
Troubleshooting
Workers not draining: Increase reload_timeout or shutdown_timeout. Check for long-lived connections (WebSocket, streaming). Set request_timeout to cap individual requests.
SIGKILL before drain complete (Kubernetes): Increase terminationGracePeriodSeconds to exceed shutdown_timeout + preStop delay.
Module reload failures: Pounce logs the import error and continues with the previous version.