What is Rate Limiting?
A technique that controls the number of requests a client can make to a server within a specified time period.
Rate limiting protects APIs and services from abuse, DDoS attacks, and excessive usage. Common approaches include fixed window (X requests per minute), sliding window (smoother distribution), and token bucket (burst-friendly) algorithms.
Implementation can be per-user, per-IP, or per-API-key. HTTP response headers (X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After) communicate limits to clients. Redis is commonly used to track request counts.