Module

_rate_limiter

Rate limiting and backpressure for pounce.

Implements per-client-IP token bucket rate limiting, with request queuing and load shedding, to protect against overload in production.

Classes

TokenBucket 2

Token bucket rate limiter for a single client.

Classic token bucket algorithm:

  • Tokens refill at a constant rate (requests per second)
  • Bucket has a maximum capacity (burst size)
  • Each request consumes one token
  • Requests are denied when bucket is empty

Thread-safe for free-threading mode.
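
The algorithm described above can be sketched roughly as follows. This is a minimal illustration, not pounce's actual implementation: the class name `SketchTokenBucket`, the injectable `clock` argument, and the choice to start with a full bucket are all assumptions made for the example.

```python
import threading
import time


class SketchTokenBucket:
    """Illustrative token bucket; not pounce's actual implementation."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic) -> None:
        self.rate = rate              # tokens added per second
        self.burst = burst            # maximum tokens the bucket can hold
        self._clock = clock           # injectable clock (assumption, eases testing)
        self._tokens = float(burst)   # assumption: start full so a burst is allowed
        self._last = clock()
        self._lock = threading.Lock()

    def consume(self) -> bool:
        # The lock makes refill-and-consume a single atomic step across threads.
        with self._lock:
            now = self._clock()
            # Refill in proportion to elapsed time, capped at the burst size.
            elapsed = now - self._last
            self._tokens = min(self.burst, self._tokens + elapsed * self.rate)
            self._last = now
            if self._tokens >= 1.0:
                self._tokens -= 1.0
                return True
            return False
```

Refilling lazily inside `consume` avoids a background timer: the token count is only brought up to date when a request actually arrives.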

Methods

consume 0 bool
Try to consume one token.
def consume(self) -> bool
Returns
bool True if a token was available, False if the request is rate limited
Internal Methods 1
__init__ 2
Initialize token bucket.
def __init__(self, rate: float, burst: int) -> None
Parameters
Name   Type   Description
rate   float  Tokens per second to refill
burst  int    Maximum tokens (burst capacity)

RateLimiter 3

Per-IP rate limiter with token buckets.

Tracks rate limits per client IP address using the token bucket algorithm. Automatically cleans up stale buckets to prevent memory leaks.

Thread-safe for concurrent worker threads.
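
One way to picture per-IP tracking is a dictionary that maps each client IP to its own bucket state. The sketch below is hypothetical and is not pounce's implementation: the class name `SketchRateLimiter`, the `[tokens, last_refill_time]` state layout, and the `clock` argument are assumptions for illustration.

```python
import threading
import time


class SketchRateLimiter:
    """Illustrative per-IP limiter; not pounce's actual implementation."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self._clock = clock            # injectable clock (assumption)
        self._lock = threading.Lock()  # guards the shared dictionary
        # Maps client IP -> [tokens, last_refill_time].
        self._buckets = {}

    def check_rate_limit(self, client_ip: str) -> bool:
        now = self._clock()
        with self._lock:
            # Assumption: a client seen for the first time gets a full bucket.
            state = self._buckets.setdefault(client_ip, [float(self.burst), now])
            tokens = min(self.burst, state[0] + (now - state[1]) * self.rate)
            allowed = tokens >= 1.0
            state[0] = tokens - 1.0 if allowed else tokens
            state[1] = now
            return allowed
```

Because each IP has independent state, one client exhausting its bucket does not affect requests from other addresses.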

Methods

check_rate_limit 1 bool
Check if request is rate limited.
def check_rate_limit(self, client_ip: str) -> bool
Parameters
Name       Type  Description
client_ip  str   Client IP address

Returns
bool True if request is allowed, False if rate limited
Internal Methods 2
__init__ 2
Initialize rate limiter.
def __init__(self, rate: float, burst: int) -> None
Parameters
Name   Type   Description
rate   float  Requests per second allowed per IP
burst  int    Maximum burst size per IP

_maybe_cleanup 0
def _maybe_cleanup(self) -> None

Clean up stale buckets to prevent memory leaks.

Removes buckets that are full, since a full bucket means the client has had no recent activity.
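
The cleanup rule can be sketched as a pass over the bucket table. The helper below is hypothetical: the function name `cleanup_full_buckets`, the `(tokens, last_refill_time)` state layout, and the explicit `now` argument are assumptions, and how often pounce actually triggers the cleanup is not stated here.

```python
def cleanup_full_buckets(buckets, rate, burst, now):
    """Drop entries whose bucket would be full again at time `now`.

    `buckets` maps client IP -> (tokens, last_refill_time). Returns the
    number of entries removed.
    """
    # Collect keys first so the dictionary is not mutated while iterating.
    stale = [
        ip
        for ip, (tokens, last) in buckets.items()
        if tokens + (now - last) * rate >= burst
    ]
    for ip in stale:
        del buckets[ip]
    return len(stale)
```

Assuming a new client starts with a full bucket, dropping a full bucket loses no information: if that client returns, recreating its bucket puts it in the same state.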

Functions

create_rate_limit_wrapper 2 Callable
def create_rate_limit_wrapper(app: Callable, rate_limiter: RateLimiter) -> Callable

Wrap an ASGI app with rate limiting.

Intercepts requests and applies rate limiting before passing them to the app. Returns 429 Too Many Requests when the rate limit is exceeded.

Parameters
Name          Type         Description
app           Callable     Original ASGI app
rate_limiter  RateLimiter  RateLimiter instance

Returns
Callable
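
A wrapper of this kind might look roughly like the sketch below. It is not pounce's implementation: the name `sketch_rate_limit_wrapper`, the pass-through for non-HTTP scopes, the `"unknown"` fallback when no client address is present, and the plain-text response body are all assumptions. It relies only on the limiter exposing `check_rate_limit(client_ip) -> bool` as documented above.

```python
def sketch_rate_limit_wrapper(app, rate_limiter):
    """Illustrative ASGI wrapper; not pounce's actual implementation."""

    async def wrapped(scope, receive, send):
        # Assumption: only HTTP requests are limited; other scope types
        # (lifespan, websocket) pass straight through.
        if scope["type"] != "http":
            await app(scope, receive, send)
            return

        # In ASGI, scope["client"] is a (host, port) pair or None.
        client = scope.get("client")
        client_ip = client[0] if client else "unknown"  # fallback is an assumption

        if rate_limiter.check_rate_limit(client_ip):
            await app(scope, receive, send)
            return

        # Rate limited: answer directly without calling the wrapped app.
        body = b"Too Many Requests"
        await send({
            "type": "http.response.start",
            "status": 429,
            "headers": [
                (b"content-type", b"text/plain"),
                (b"content-length", str(len(body)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": body})

    return wrapped
```

Rejecting before the app is invoked keeps the cost of a limited request small, which matters most when the server is already under load.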