# Server Lifecycle

URL: /docs/deployment/lifecycle/
Section: deployment
Tags: deployment, reload, shutdown, zero-downtime

--------------------------------------------------------------------------------

Pounce handles three lifecycle signals for zero-downtime operations: SIGHUP (reload), SIGUSR1 (hot deploy), and SIGTERM (shutdown). All three use the same drain-then-replace pattern.

## Graceful Reload (SIGHUP)

Send SIGHUP to perform a rolling restart with fresh code:

```bash
kill -HUP <pid>
# or with systemd:
systemctl reload pounce
```

### What Happens

1. Old workers continue handling existing requests.
2. App code is reimported and new workers spawn (generation N+1).
3. Old workers enter drain mode (finish active requests, reject new ones).
4. Once drained (or after `reload_timeout`), old workers exit.

```
Time 0s:    [Worker-0] [Worker-1] [Worker-2] [Worker-3]       (Gen 0)
            SIGHUP received
Time 0.1s:  [Worker-0..3 draining] [Worker-4..7 accepting]    (Gen 0 + 1)
Time 5s:    [Worker-4] [Worker-5] [Worker-6] [Worker-7]       (Gen 1 only)
```

If the reimport fails, pounce logs the error and continues with the old code -- no downtime from bad deploys.

### Configuration

```python
config = ServerConfig(
    reload_timeout=60.0,  # Max drain time (default: 30s)
    workers=4,
)
```

### systemd

```ini
[Service]
Type=notify
ExecStart=/usr/bin/pounce serve myapp:app --workers=4
ExecReload=/bin/kill -HUP $MAINPID
```

## Hot Deploy (SIGUSR1)

SIGUSR1 triggers the same rolling restart as SIGHUP. Use whichever signal fits your deployment tooling.

```bash
kill -SIGUSR1 <pid>
```

On Linux with SO_REUSEPORT, old and new workers bind to the same port simultaneously. On macOS/Windows (no SO_REUSEPORT), the AcceptDistributor handles the handoff via a shared queue.
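The overlapping bind that makes this work can be illustrated with plain socket code. This is a generic sketch of the SO_REUSEPORT mechanism on Linux, not pounce's internals -- `bind_listener` is a hypothetical helper for illustration:

```python
import socket

def bind_listener(port: int) -> socket.socket:
    """Create a TCP listener with SO_REUSEPORT set, so another worker
    generation can bind the same address concurrently (Linux)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(128)
    return sock

# The old generation binds first (port 0 = let the OS pick one)...
old_gen = bind_listener(0)
port = old_gen.getsockname()[1]

# ...and the new generation binds the *same* port while the old one is
# still accepting. Without SO_REUSEPORT this would raise EADDRINUSE.
new_gen = bind_listener(port)
```

The kernel load-balances incoming connections across both listeners, which is why the Gen 0 and Gen 1 workers in the timeline above can accept traffic at the same time during the overlap window.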
## File Watching (Development)

For development, enable auto-reload on file changes:

```python
config = ServerConfig(
    reload=True,
    reload_include=(".html", ".css"),  # Extra extensions
    reload_dirs=("templates",),        # Extra directories
)
```

## Graceful Shutdown (SIGTERM)

On SIGTERM or SIGINT, pounce drains connections then exits:

1. Stops accepting new connections immediately.
2. Finishes active requests (up to `shutdown_timeout`).
3. Force-terminates workers that exceed the timeout.
4. Exits with status 0.

```python
config = ServerConfig(
    shutdown_timeout=30.0,  # Per-worker drain time (default: 10s)
)
```

### Kubernetes

```yaml
spec:
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 5"]  # LB de-registration delay
      readinessProbe:
        httpGet:
          path: /health
          port: 8000
  terminationGracePeriodSeconds: 40  # > shutdown_timeout + preStop
```

Key: `terminationGracePeriodSeconds` must exceed `shutdown_timeout` plus the preStop delay, or Kubernetes sends SIGKILL before the drain completes.

### Docker

Use exec form so signals reach pounce directly:

```dockerfile
CMD ["pounce", "serve", "myapp:app", "--host", "0.0.0.0"]
```

### systemd

```ini
[Service]
Type=notify
KillSignal=SIGTERM
KillMode=mixed
TimeoutStopSec=40s
```

## Thread Mode vs Process Mode

|                | Thread Mode (3.14t)                    | Process Mode (GIL)          |
|----------------|----------------------------------------|-----------------------------|
| Reload         | True zero-downtime (old + new overlap) | Brief downtime (~100-500ms) |
| Shutdown       | Drain per-thread                       | Drain per-process           |
| Recommendation | Production                             | Acceptable for dev          |

Thread mode requires Python 3.14t (free-threading). Process mode falls back to stop-all-then-start.

## Troubleshooting

**Workers not draining:** Increase `reload_timeout` or `shutdown_timeout`. Check for long-lived connections (WebSocket, streaming). Set `request_timeout` to cap individual requests.

**SIGKILL before drain complete (Kubernetes):** Increase `terminationGracePeriodSeconds` to exceed `shutdown_timeout` plus the preStop delay.

**Module reload failures:** Pounce logs the import error and continues with the previous version.
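The drain step of the shutdown sequence can be sketched with a small request tracker. This `Drainer` class is a hypothetical illustration of the pattern (not pounce's actual worker code): once draining starts, new requests are rejected, and the worker waits up to `shutdown_timeout` for in-flight requests to finish.

```python
import threading

class Drainer:
    """Minimal sketch of drain-then-exit request tracking."""

    def __init__(self, shutdown_timeout: float) -> None:
        self.shutdown_timeout = shutdown_timeout
        self._lock = threading.Lock()
        self._active = 0
        self._accepting = True
        self._idle = threading.Event()
        self._idle.set()  # no requests in flight yet

    def start_request(self) -> bool:
        """Admit a request, or reject it once draining has begun."""
        with self._lock:
            if not self._accepting:
                return False
            self._active += 1
            self._idle.clear()
            return True

    def finish_request(self) -> None:
        """Mark one in-flight request as complete."""
        with self._lock:
            self._active -= 1
            if self._active == 0:
                self._idle.set()

    def drain(self) -> bool:
        """Stop accepting, then wait up to shutdown_timeout for idle.
        True means all requests finished; False means the timeout hit
        and the supervisor would force-terminate this worker."""
        with self._lock:
            self._accepting = False
        return self._idle.wait(timeout=self.shutdown_timeout)
```

A worker whose `drain()` returns `False` corresponds to step 3 above: the supervisor force-terminates it rather than waiting indefinitely.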