Rate Limits
All APIs should have rate limits that govern how much of a digital resource or capability a consumer can access, and those limits should be well communicated, consistent, and enforced. Rate limits give API producers control over their digital resources and are a fundamental part of making any type of API publicly available.
Also known as: Throttling, Quotas, Usage Limits
Example
Standards
- IETF RateLimit header fields for HTTP (draft-ietf-httpapi-ratelimit-headers)
- IETF RFC 6585 — Additional HTTP Status Codes (429 Too Many Requests)
- IETF RFC 9110 — HTTP Semantics (Retry-After §10.2.3)
- IETF RFC 9111 — HTTP Caching
- IETF HTTP APIs Working Group
- X-RateLimit-* (de facto industry convention)
HTTP Headers
| Header | Direction | Spec | Description |
|---|---|---|---|
| RateLimit | response | draft-ietf-httpapi-ratelimit-headers | Structured field conveying remaining quota and the reset interval for the current policy. |
| RateLimit-Policy | response | draft-ietf-httpapi-ratelimit-headers | Advertises one or more quota policies (limit and window) that apply to the request. |
| Retry-After | response | RFC 9110 §10.2.3 | Seconds (or HTTP-date) the client should wait before retrying after a 429 or 503. |
| X-RateLimit-Limit | response | De facto | Maximum number of requests permitted in the current window. |
| X-RateLimit-Remaining | response | De facto | Requests remaining in the current window. |
| X-RateLimit-Reset | response | De facto | Time at which the current window resets, usually as a Unix timestamp or seconds remaining. |
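Because Retry-After may carry either a number of seconds or an HTTP-date, and X-RateLimit-Reset may be a Unix timestamp or a seconds-remaining value, clients need to normalise both. The following is a minimal sketch of that normalisation, not a reference implementation. The function names are hypothetical, and the threshold used to tell a timestamp from a delta is an assumption made for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the delay in seconds encoded by a Retry-After value, or None if unparseable."""
    value = value.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "120"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def reset_delay(value: str, now_ts: float) -> float:
    """Interpret an X-RateLimit-Reset value as seconds until the window resets."""
    n = float(value)
    # Assumption: a value this large is a Unix timestamp rather than a delta.
    if n > 1_000_000_000:
        return max(0.0, n - now_ts)
    return n
```

Providers differ on the meaning of X-RateLimit-Reset, so the provider's own documentation should take precedence over the heuristic shown here.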
Status Codes
- 429 Too Many Requests — RFC 6585 §4 — Client has sent too many requests within a given time window.
- 503 Service Unavailable — RFC 9110 §15.6.4 — Server-side throttling or overload; pair with Retry-After.
Media Types
- application/problem+json — RFC 9457 — Recommended payload for explaining quota errors.
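As an illustration of how the 429 status code, Retry-After, and application/problem+json fit together, the sketch below builds a throttling response as a plain (status, headers, body) tuple. The function name and the `type` URI are hypothetical placeholders, and a real service would return this through its own web framework.

```python
import json
from typing import Dict, Tuple


def too_many_requests(retry_after: int, detail: str) -> Tuple[int, Dict[str, str], str]:
    """Build a 429 response with a problem details body and a Retry-After header."""
    body = {
        # Hypothetical problem type URI; replace with one the API actually documents.
        "type": "https://example.com/problems/rate-limit-exceeded",
        "title": "Too Many Requests",
        "status": 429,
        "detail": detail,
    }
    headers = {
        "Content-Type": "application/problem+json",
        "Retry-After": str(retry_after),
    }
    return 429, headers, json.dumps(body)
```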
OpenAPI Expression
- responses.'429' (OpenAPI 3.x) — Declare a 429 response with headers for RateLimit, RateLimit-Policy, and Retry-After.
- components.headers (OpenAPI 3.x) — Define reusable RateLimit / X-RateLimit-* header objects.
- x-ratelimit (Vendor extension) — Common provider extension for declaring tier-based quotas at the operation or document level.
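One way to combine reusable header objects with a 429 response is sketched below as a Python dictionary that can be serialised to JSON or YAML. The component name `TooManyRequests`, the descriptions, and the integer schema for Retry-After (which assumes the API always sends the delay-seconds form) are illustrative choices, not taken from any particular provider.

```python
import json

RATE_LIMIT_HEADERS = ("RateLimit", "RateLimit-Policy", "Retry-After")

openapi_fragment = {
    "components": {
        "headers": {
            "RateLimit": {
                "description": "Remaining quota and reset interval for the current policy.",
                "schema": {"type": "string"},
            },
            "RateLimit-Policy": {
                "description": "Quota policies (limit and window) that apply to the request.",
                "schema": {"type": "string"},
            },
            "Retry-After": {
                # Assumption: this API only emits the delay-seconds form.
                "description": "Seconds the client should wait before retrying.",
                "schema": {"type": "integer", "minimum": 0},
            },
        },
        "responses": {
            "TooManyRequests": {
                "description": "The client has exceeded its rate limit.",
                # Each response header points at the reusable definition above.
                "headers": {
                    name: {"$ref": f"#/components/headers/{name}"}
                    for name in RATE_LIMIT_HEADERS
                },
                "content": {
                    "application/problem+json": {"schema": {"type": "object"}}
                },
            }
        },
    }
}

# Each operation then references the shared response under its "responses" key.
operation_responses = {"429": {"$ref": "#/components/responses/TooManyRequests"}}

if __name__ == "__main__":
    print(json.dumps(openapi_fragment, indent=2))
```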
Governance Rules
- naftiko-rate-limits (Naftiko Sandbox (rate-limits/*.yml)) — Rules that check operations declare 429 responses and standard RateLimit headers.
- oas-operation-4xx-response (Spectral built-in) — Operations should document client-error responses, including 429.
Risk & Compliance
OWASP:
- OWASP API Security Top 10: API4:2023 Unrestricted Resource Consumption
Compliance:
- SOC 2 CC7.2 — system monitoring for abnormal usage
- PCI DSS v4 Req. 6.4.2 — protect public-facing applications against attacks
Security: Without enforced rate limits, APIs are vulnerable to credential stuffing, scraping, denial-of-wallet (for metered backends), and DoS. Apply per-key, per-IP, and per-tenant limits; surface quota state via standard headers; degrade gracefully with 429 + Retry-After rather than dropping connections.
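The per-key enforcement described above can be sketched with a token bucket, one common choice among several algorithms. This is a minimal in-memory illustration under stated assumptions: the class and method names are hypothetical, state lives in a single process, and a production deployment would typically keep counters in a shared store and apply separate limiters per key, per IP, and per tenant.

```python
import math
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-key token bucket: each key may burst up to `capacity` requests."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if capacity < 1 or refill_per_second <= 0:
            raise ValueError("capacity must be >= 1 and refill rate must be > 0")
        self.capacity = capacity
        self.rate = refill_per_second
        self._clock = clock
        # key -> (tokens available, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def check(self, key: str) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request from `key`."""
        now = self._clock()
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never beyond capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[key] = (tokens - 1.0, now)
            return True, 0
        self._buckets[key] = (tokens, now)
        # Whole seconds until one full token is available, for Retry-After.
        return False, math.ceil((1.0 - tokens) / self.rate)
```

When `check` returns `False`, the second value can be sent as the Retry-After header on a 429 response instead of dropping the connection.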
Tools
- Kong Rate Limiting — Gateway plugin
- Envoy Rate Limit Service — Proxy (Apache-2.0)
- Redis Cell / GCRA — Algorithm (MIT)
- NGINX limit_req — Proxy
- Cloudflare Rate Limiting — Edge
Suggested Metrics
- 429_rate — Fraction of responses returning 429; spikes signal under-provisioned quota or abusive clients.
- quota_utilization_p95 — 95th-percentile fraction of quota consumed per window, per key.
- throttled_clients_unique — Distinct clients hitting limits in a period; informs tier design.
- retry_after_compliance — Share of retrying clients that honour Retry-After before re-issuing requests.
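A rough sketch of how three of these metrics might be computed from access-log records follows. The `Record` shape, the function names, and the nearest-rank percentile method are assumptions made for illustration; a metrics pipeline would normally compute these in its own query language.

```python
import math
from typing import Iterable, List, NamedTuple


class Record(NamedTuple):
    client_id: str
    status: int


def rate_429(records: Iterable[Record]) -> float:
    """Fraction of responses that returned 429."""
    rows: List[Record] = list(records)
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.status == 429) / len(rows)


def throttled_clients_unique(records: Iterable[Record]) -> int:
    """Number of distinct clients that received at least one 429."""
    return len({r.client_id for r in records if r.status == 429})


def quota_utilization_p95(utilizations: Iterable[float]) -> float:
    """95th percentile (nearest-rank) of per-key quota utilisation fractions."""
    ordered = sorted(utilizations)
    if not ordered:
        raise ValueError("no utilisation samples")
    rank = max(1, math.ceil(95 * len(ordered) / 100))
    return ordered[rank - 1]
```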
Example Implementations
- GitHub — Primary and secondary rate limits exposed via X-RateLimit-* headers.
- Stripe — Per-account request limits with 429 responses and exponential backoff guidance.
- Twilio — Concurrency and queue-based limits across messaging APIs.
- Discord — Global and per-route buckets surfaced via X-RateLimit-Bucket and reset headers.
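On the consumer side, providers such as those above generally expect clients to back off when throttled. The following is a minimal sketch of a retry loop that prefers the server's Retry-After hint and otherwise falls back to exponential backoff with full jitter. The `send` callable, the parameter defaults, and the function name are hypothetical and do not reflect any specific provider's SDK.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def call_with_backoff(send: Callable[[], Response],
                      max_attempts: int = 5,
                      base: float = 0.5,
                      cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> Response:
    """Call `send` until it stops returning 429/503 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status not in (429, 503):
            break
        if attempt + 1 < max_attempts:
            retry_after = headers.get("Retry-After", "")
            if retry_after.isdigit():
                # The server's hint takes priority over the local schedule.
                delay = float(retry_after)
            else:
                # Full jitter: random delay up to the capped exponential bound.
                delay = rng() * min(cap, base * 2 ** attempt)
            sleep(delay)
    return status, headers
```

This sketch only handles the delay-seconds form of Retry-After and looks the header up by its exact name; an HTTP client library with case-insensitive headers would make that lookup more robust.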
Related Properties
Tags
- Rate Limits
- Usage
- Constraints