APIlimitering
APIlimitering is the practice of restricting the rate at which clients can make API calls to a backend service. The goal is to protect resources, prevent abuse, ensure predictable performance, and maintain service availability under load.
Core concepts include the rate limit (the maximum number of requests allowed in a given time window),
Limiter can be enforced at different layers, such as API gateways, reverse proxies, service meshes, or within
When limits are exceeded, servers typically respond with HTTP 429 Too Many Requests, sometimes accompanied by
Key design considerations include choosing the granularity (per API key, per user, per IP, or per endpoint),
Common patterns include per-key limits, per-tenant quotas, burst allowances, and adaptive throttling based on observed traffic.