Rate Limiter
- Modified from ByteByteGo courses
Questions
Who
- Who are we throttling? (IP, user ID, other properties)
What
- What kind of rate limiter are we designing? (Client-side or server-side)
- What is the scale of the system? (Startup or large company)
- What should the rate limiter support? (Different throttle rules)
When
- When will users be throttled? (Based on throttle rules)
Where
- Where will the rate limiter be deployed? (Distributed environment)
How
- How should the rate limiter be implemented? (Separate service or in application code)
- How will users be informed? (Informing throttled users)
Overview
Rate Limiter in Network Systems
- Purpose: Controls traffic rate sent by clients/services. In HTTP, it limits client requests over a specified period. Excess requests are blocked.
Benefits of API Rate Limiter:
- Prevents resource starvation from DoS attacks.
- Reduces costs by limiting excess requests, requiring fewer servers.
- Prevents server overload.
Rate Limiter Placement:
- Client-side Implementation:
- Less reliable due to potential forgery by malicious actors.
- Limited control over client implementation.
- Server-side Implementation:
- More reliable and secure.
- Middleware Rate Limiter:
- Positioned between clients and API servers.
Microservices and API Gateway:
- Rate limiting often implemented in API gateways.
- API gateway functions:
- Supports rate limiting.
- SSL termination.
- Authentication.
- IP whitelisting.
- Servicing static content.
Rate Limiting Process Overview
- Core Concept:
- Use a counter to track how many requests are sent from the same user, IP address, etc.
- Disallow the request if the counter exceeds the limit.
Counter Storage:
- Database:
- Not ideal due to slow disk access.
- In-Memory Cache:
- Preferred for its speed and support for time-based expiration.
- Redis is a popular option for implementing rate limiting.
Rate Limiting Rules
- Creation and Storage:
- Rules are typically written in configuration files.
- These files are saved on disk.
- Examples:
- Maximum of 5 marketing messages per day.
- Maximum of 5 login attempts per minute.
- Loading:
- Workers frequently load these rules into the cache for quick access during request processing.
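The two example rules could be written as a configuration file along these lines (a hedged sketch in the style of Lyft's open-source ratelimit component; the exact field names depend on the tool used):

```yaml
domain: messaging
descriptors:
  - key: message_type
    value: marketing
    rate_limit:          # max 5 marketing messages per day
      unit: day
      requests_per_unit: 5
---
domain: auth
descriptors:
  - key: auth_type
    value: login
    rate_limit:          # max 5 login attempts per minute
      unit: minute
      requests_per_unit: 5
```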
Handling Rate Limited Requests:
- HTTP Response:
- If a request is rate limited, APIs return HTTP status code 429 (Too Many Requests).
- May include the X-Ratelimit-Retry-After header indicating when the client can retry.
- Enqueueing:
- Depending on use cases, rate-limited requests may be enqueued for later processing.
Client Notifications:
- HTTP Response Headers:
- Clients receive the 429 status code and the X-Ratelimit-Retry-After header.
- Headers like X-Ratelimit-Remaining can indicate the number of remaining allowed requests before throttling.
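A throttled response could be assembled as follows (a minimal sketch; the function name is hypothetical, and exact header names vary between APIs):

```python
def throttled_response(retry_after_seconds: int, remaining: int = 0) -> tuple[int, dict]:
    """Build the status code and headers returned to a throttled client,
    using the X-Ratelimit-* header convention described above."""
    headers = {
        "X-Ratelimit-Remaining": str(remaining),          # requests left before throttling
        "X-Ratelimit-Retry-After": str(retry_after_seconds),  # when the client can retry
    }
    return 429, headers  # 429 Too Many Requests
```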
Request Handling Workflow:
- Client Request:
- Client sends a request to the server.
- The request is routed to the rate limiter middleware.
- Rate Limiter Middleware:
- Loads rules from the cache.
- Fetches counters and last request timestamps from Redis cache.
- Decision Making:
- Request Not Rate Limited:
- Forwarded to API servers.
- Request Rate Limited:
- Returns HTTP status code 429 (Too Many Requests) to the client.
- The request is either dropped or forwarded to a queue.
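The workflow can be sketched end to end. All names here are hypothetical stand-ins: `rules` plays the role of the rules cache, `counters` of the Redis counters keyed by (client, window), and `queue` of the optional queue for later processing:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    limit: int           # max requests allowed
    window_seconds: int  # per this time window

def handle_request(client_id, rules, counters, now, queue=None):
    """Middleware sketch: load the rule, update the counter, then decide."""
    rule = rules[client_id]                      # 1. load rule from cache
    key = (client_id, int(now) // rule.window_seconds)
    counters[key] = counters.get(key, 0) + 1     # 2. update counter in store
    if counters[key] <= rule.limit:
        return "forwarded"                       # 3a. not rate limited -> API servers
    if queue is not None:
        queue.append(client_id)                  # 3b. enqueue for later processing
    return 429                                   # 3b. otherwise dropped with 429
```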
Design Considerations for Rate Limiter:
- Technology Stack:
- Evaluate compatibility with current stack (e.g., programming language, cache service).
- Ensure efficiency in server-side rate limiting.
- Rate Limiting Algorithm:
- Server-side implementation offers full control.
- Third-party gateways may limit algorithm choices.
- Microservices Architecture:
- If using an API gateway for other functions, consider adding rate limiting.
- Engineering Resources:
- Building a custom rate limiter requires time and resources.
- Opt for commercial API gateways if resources are insufficient.
Rate limit algorithms
- Common choices include token bucket, leaking bucket, fixed window counter, sliding window log, and sliding window counter.
Rate limiter in a distributed environment
- Race conditions arise when multiple rate limiter instances read and update the same counter concurrently; locks would prevent this but will significantly slow down the system.
- Two strategies are commonly used to solve the problem: Lua scripts and the sorted sets data structure in Redis.
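The sorted-sets approach can be illustrated with a sliding window log. This is a sketch with an ordinary sorted Python list standing in for the Redis sorted set; in Redis, ZREMRANGEBYSCORE drops timestamps older than the window, ZCARD counts the remainder, and ZADD records the new request, with a Lua script wrapping the three steps so they run atomically:

```python
import bisect

class SlidingWindowLog:
    """Sliding-window-log limiter: keep a sorted log of request
    timestamps per key and allow a request only if fewer than `limit`
    timestamps fall inside the last `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.logs: dict[str, list[float]] = {}

    def allow(self, key: str, now: float) -> bool:
        log = self.logs.setdefault(key, [])
        # drop timestamps outside the window (ZREMRANGEBYSCORE equivalent)
        del log[:bisect.bisect_left(log, now - self.window)]
        if len(log) >= self.limit:   # count what remains (ZCARD equivalent)
            return False
        bisect.insort(log, now)      # record this request (ZADD equivalent)
        return True
```

Unlike a fixed window, the window here slides continuously, so bursts straddling a window boundary cannot exceed the limit.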
Other consideration: Monitoring Rate Limiting
- Purpose:
- Ensure the effectiveness of the rate limiting algorithm and rules.
Key Metrics:
- Algorithm Effectiveness:
- Assess if the chosen algorithm is managing traffic as intended.
- Rule Effectiveness:
- Check if the rules are appropriate for current traffic patterns.
Adjustments Based on Analytics:
- Strict Rules:
- If many valid requests are dropped, consider relaxing the rules.
- Sudden Traffic Increases:
- If the rate limiter is ineffective during traffic spikes (e.g., flash sales), consider switching to an algorithm that supports burst traffic.
- Example: Token bucket algorithm is suitable for handling bursts.
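A token bucket handles bursts because tokens accumulate during quiet periods. A minimal sketch (class and parameter names are illustrative, not a specific library's API):

```python
class TokenBucket:
    """Tokens refill at `rate` per second up to `capacity`; each request
    consumes one token, so short bursts up to `capacity` pass while the
    long-run rate stays bounded by `rate`."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)  # bucket starts full
        self.last = 0.0                # time of the last refill

    def allow(self, now: float) -> bool:
        # refill tokens for the elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `capacity=3, rate=1.0`, a burst of three requests at the same instant is allowed, the fourth is rejected, and one more is allowed after a one-second refill.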
Monitoring Process:
- Collect analytics data on the rate limiting system’s performance.
- Evaluate the data to identify any issues with the current setup.
- Make necessary adjustments to the rules or algorithm based on the findings.