What is API Rate Limiting?
A strategy for limiting the number of API requests a client can make within a specified time window to protect server resources.
API rate limiting prevents abuse and ensures fair resource sharing. Common algorithms include fixed window (100 requests/minute), sliding window (smoother distribution), token bucket (allows bursts), and leaky bucket (constant rate).
Rate limits are communicated via HTTP headers: X-RateLimit-Limit, X-RateLimit-Remaining, and Retry-After. Clients should implement exponential backoff when rate limited. API keys or JWT tokens identify clients for per-user limits.