Rate Limiter
- Modified from ByteByteGo courses
Questions
Who
- Who are we throttling? (IP, user ID, other properties)
What
- What kind of rate limiter are we designing? (Client-side or server-side)
- What is the scale of the system? (Startup or large company)
- What should the rate limiter support? (Different throttle rules)
When
- When will users be throttled? (Based on throttle rules)
Where
- Where will the rate limiter be deployed? (Distributed environment)
How
- How should the rate limiter be implemented? (Separate service or in application code)
- How will users be informed? (Informing throttled users)
Overview
Rate Limiter in Network Systems
- Purpose: Controls traffic rate sent by clients/services. In HTTP, it limits client requests over a specified period. Excess requests are blocked.
Benefits of API Rate Limiter:
- Prevents resource starvation from DoS attacks.
- Reduces costs by limiting excess requests, requiring fewer servers.
- Prevents server overload.
Rate Limiter Placement:
- Client-side Implementation:
- Less reliable due to potential forgery by malicious actors.
- Limited control over client implementation.
- Server-side Implementation:
- More reliable and secure.
- Middleware Rate Limiter:
- Positioned between clients and API servers.
Microservices and API Gateway:
- Rate limiting often implemented in API gateways.
- API gateway functions:
- Supports rate limiting.
- SSL termination.
- Authentication.
- IP whitelisting.
- Servicing static content.
Rate Limiting Process Overview
- Core Concept:
- Use a counter to track how many requests are sent from the same user, IP address, etc.
- Disallow the request if the counter exceeds the limit.
Counter Storage:
- Database:
- Not ideal due to slow disk access.
- In-Memory Cache:
- Preferred for its speed and support for time-based expiration.
- Redis is a popular option for implementing rate limiting.
Rate Limiting Rules
- Creation and Storage:
- Rules are typically written in configuration files.
- These files are saved on disk.
- Examples:
- Maximum of 5 marketing messages per day.
- Maximum of 5 login attempts per minute.
- Loading:
- Workers frequently load these rules into the cache for quick access during request processing.
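The two example rules could be written as a configuration file along these lines (a hedged sketch in the style of Lyft's open-source ratelimit component; the exact field names depend on the tool used):

```yaml
domain: messaging
descriptors:
  - key: message_type
    value: marketing
    rate_limit:          # max 5 marketing messages per day
      unit: day
      requests_per_unit: 5
---
domain: auth
descriptors:
  - key: auth_type
    value: login
    rate_limit:          # max 5 login attempts per minute
      unit: minute
      requests_per_unit: 5
```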
Handling Rate Limited Requests:
- HTTP Response:
- If a request is rate limited, APIs return HTTP status code 429 (Too Many Requests).
- May include the X-Ratelimit-Retry-After header indicating when the client can retry.
- Enqueueing:
- Depending on use cases, rate-limited requests may be enqueued for later processing.
Client Notifications:
- HTTP Response Headers:
- Clients receive the 429 status code and the X-Ratelimit-Retry-After header.
- Headers like X-Ratelimit-Remaining can indicate the number of remaining allowed requests before throttling.
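A throttled response could be assembled as follows (a minimal sketch; the function name is hypothetical, and exact header names vary between APIs):

```python
def throttled_response(retry_after_seconds: int, remaining: int = 0) -> tuple[int, dict]:
    """Build the status code and headers returned to a throttled client,
    using the X-Ratelimit-* header convention described above."""
    headers = {
        "X-Ratelimit-Remaining": str(remaining),          # requests left before throttling
        "X-Ratelimit-Retry-After": str(retry_after_seconds),  # when the client can retry
    }
    return 429, headers  # 429 Too Many Requests
```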
Request Handling Workflow:
- Client Request:
- Client sends a request to the server.
- The request is routed to the rate limiter middleware.
- Rate Limiter Middleware:
- Loads rules from the cache.
- Fetches counters and last request timestamps from Redis cache.
- Decision Making:
- Request Not Rate Limited:
- Forwarded to API servers.
- Request Rate Limited:
- Returns HTTP status code 429 (Too Many Requests) to the client.
- The request is either dropped or forwarded to a queue.
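The workflow can be sketched end to end. All names here are hypothetical stand-ins: `rules` plays the role of the rules cache, `counters` of the Redis counters keyed by (client, window), and `queue` of the optional queue for later processing:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    limit: int           # max requests allowed
    window_seconds: int  # per this time window

def handle_request(client_id, rules, counters, now, queue=None):
    """Middleware sketch: load the rule, update the counter, then decide."""
    rule = rules[client_id]                      # 1. load rule from cache
    key = (client_id, int(now) // rule.window_seconds)
    counters[key] = counters.get(key, 0) + 1     # 2. update counter in store
    if counters[key] <= rule.limit:
        return "forwarded"                       # 3a. not rate limited -> API servers
    if queue is not None:
        queue.append(client_id)                  # 3b. enqueue for later processing
    return 429                                   # 3b. otherwise dropped with 429
```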
Design Considerations for Rate Limiter:
- Technology Stack:
- Evaluate compatibility with current stack (e.g., programming language, cache service).
- Ensure efficiency in server-side rate limiting.
- Rate Limiting Algorithm:
- Server-side implementation offers full control.
- Third-party gateways may limit algorithm choices.
- Microservices Architecture:
- If using an API gateway for other functions, consider adding rate limiting.
- Engineering Resources:
- Building a custom rate limiter requires time and resources.
- Opt for commercial API gateways if resources are insufficient.
Rate limit algorithms
- Common choices include token bucket, leaking bucket, fixed window counter, sliding window log, and sliding window counter.
Rate limiter in a distributed environment
- Race conditions arise when multiple rate limiter instances read and update the same counter concurrently; locks would prevent this but will significantly slow down the system.
- Two strategies are commonly used to solve the problem: Lua scripts and the sorted sets data structure in Redis.
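The sorted-sets approach can be illustrated with a sliding window log. This is a sketch with an ordinary sorted Python list standing in for the Redis sorted set; in Redis, ZREMRANGEBYSCORE drops timestamps older than the window, ZCARD counts the remainder, and ZADD records the new request, with a Lua script wrapping the three steps so they run atomically:

```python
import bisect

class SlidingWindowLog:
    """Sliding-window-log limiter: keep a sorted log of request
    timestamps per key and allow a request only if fewer than `limit`
    timestamps fall inside the last `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.logs: dict[str, list[float]] = {}

    def allow(self, key: str, now: float) -> bool:
        log = self.logs.setdefault(key, [])
        # drop timestamps outside the window (ZREMRANGEBYSCORE equivalent)
        del log[:bisect.bisect_left(log, now - self.window)]
        if len(log) >= self.limit:   # count what remains (ZCARD equivalent)
            return False
        bisect.insort(log, now)      # record this request (ZADD equivalent)
        return True
```

Unlike a fixed window, the window here slides continuously, so bursts straddling a window boundary cannot exceed the limit.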
Other consideration: Monitoring Rate Limiting
- Purpose:
- Ensure the effectiveness of the rate limiting algorithm and rules.
Key Metrics:
- Algorithm Effectiveness:
- Assess if the chosen algorithm is managing traffic as intended.
- Rule Effectiveness:
- Check if the rules are appropriate for current traffic patterns.
Adjustments Based on Analytics:
- Strict Rules:
- If many valid requests are dropped, consider relaxing the rules.
- Sudden Traffic Increases:
- If the rate limiter is ineffective during traffic spikes (e.g., flash sales), consider switching to an algorithm that supports burst traffic.
- Example: Token bucket algorithm is suitable for handling bursts.
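A token bucket handles bursts because tokens accumulate during quiet periods. A minimal sketch (class and parameter names are illustrative, not a specific library's API):

```python
class TokenBucket:
    """Tokens refill at `rate` per second up to `capacity`; each request
    consumes one token, so short bursts up to `capacity` pass while the
    long-run rate stays bounded by `rate`."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)  # bucket starts full
        self.last = 0.0                # time of the last refill

    def allow(self, now: float) -> bool:
        # refill tokens for the elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `capacity=3, rate=1.0`, a burst of three requests at the same instant is allowed, the fourth is rejected, and one more is allowed after a one-second refill.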
Monitoring Process:
- Collect analytics data on the rate limiting system’s performance.
- Evaluate the data to identify any issues with the current setup.
- Make necessary adjustments to the rules or algorithm based on the findings.