In-memory rate limiting doesn't work with multiple server instances #7

@AdamEXu

Problem

Rate limiting is implemented using in-memory data structures (defaultdict(deque)) in utils/rate_limiter.py:

self._requests = defaultdict(deque)  # api_key -> deque of timestamps

Issues

  1. Lost on restart - All rate limit data is cleared when server restarts
  2. No shared state - Won't work with multiple server instances (load balancer, horizontal scaling)
  3. Easy bypass - Restart the server to reset rate limits
  4. No persistence - Can't audit historical rate limit violations
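Issue 2 can be reproduced without any network setup. The sketch below uses a hypothetical stand-in for the class in utils/rate_limiter.py (the real class is not shown in this issue, so the name `InMemoryRateLimiter`, the constructor arguments, and the limit of 2 are assumptions). Two objects play the role of two server processes behind a load balancer:

```python
import time
from collections import defaultdict, deque


class InMemoryRateLimiter:
    """Hypothetical stand-in for the limiter in utils/rate_limiter.py."""

    def __init__(self, limit: int = 2, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._requests = defaultdict(deque)  # api_key -> deque of timestamps

    def is_allowed(self, api_key: str) -> bool:
        now = time.time()
        hits = self._requests[api_key]
        # Drop timestamps that have fallen out of the sliding window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Each "server instance" keeps its own private counters.
instance_a = InMemoryRateLimiter(limit=2)
instance_b = InMemoryRateLimiter(limit=2)

# Three requests land on each instance for the same key.
served = sum(
    inst.is_allowed("key-1")
    for inst in (instance_a, instance_b)
    for _ in range(3)
)
# served == 4: each instance allows 2, so the effective limit doubles.
```

With N instances the effective limit becomes roughly N times the configured one, which is why this only matters once we scale out.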

Current Impact

Low for the current single-instance deployment, but critical if we scale horizontally.

Recommendations

Option 1: Redis-backed rate limiting

import time
from typing import Dict

import redis

redis_client = redis.Redis(host='localhost', port=6379)

def is_allowed(self, api_key: str) -> tuple[bool, Dict]:
    # Use a Redis sorted set (score = request timestamp) as a sliding window
    key = f'ratelimit:{api_key}'
    current_time = time.time()
    window_start = current_time - 60  # 60-second window

    # Remove entries that have fallen out of the window
    redis_client.zremrangebyscore(key, 0, window_start)

    # Count requests still inside the window
    current_count = redis_client.zcard(key)

    # Check limit...
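One possible way to fill in the rest of the check, as a rough sketch rather than a finished implementation. Assumptions: the limit value (`MAX_REQUESTS = 100`) is made up since the issue does not state one, the client is passed in as a parameter instead of using a module-level global, and the shape of the returned dict is a guess. The sketch records the request first and counts afterwards, so concurrent callers err on the side of rejecting rather than over-admitting. It uses only `zremrangebyscore`, `zadd`, `zcard`, `expire`, and `zrem`.

```python
import time
import uuid
from typing import Any, Dict, Tuple

WINDOW_SECONDS = 60  # matches `current_time - 60` in the snippet above
MAX_REQUESTS = 100   # hypothetical limit; not specified in the issue


def is_allowed(
    client: Any,
    api_key: str,
    limit: int = MAX_REQUESTS,
    window: int = WINDOW_SECONDS,
) -> Tuple[bool, Dict[str, int]]:
    """Sliding-window check backed by a Redis sorted set."""
    key = f"ratelimit:{api_key}"
    now = time.time()
    # Members must be unique, so append a random suffix to the timestamp.
    member = f"{now}:{uuid.uuid4().hex}"

    # Remove entries that have fallen out of the window.
    client.zremrangebyscore(key, 0, now - window)
    # Record this request, then count everything inside the window.
    client.zadd(key, {member: now})
    count = client.zcard(key)
    # Let idle keys expire so abandoned API keys do not pile up.
    client.expire(key, window)

    if count > limit:
        # Over the limit: undo our own entry so rejected calls are not counted.
        client.zrem(key, member)
        return False, {"limit": limit, "remaining": 0}
    return True, {"limit": limit, "remaining": limit - count}
```

Usage would look like `is_allowed(redis.Redis(host='localhost', port=6379), api_key)`. The calls above are separate round trips and are not atomic as a group; wrapping them in a MULTI/EXEC pipeline or a Lua script would likely be needed before relying on this under heavy concurrency.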

Option 2: Database-backed (SQLite/Teable)

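Use an ephemeral SQLite table whose rows expire after a TTL. Note that SQLite only gives shared state if every instance opens the same database file.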
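A minimal sketch of what Option 2 could look like with the standard-library `sqlite3` module. The table name `rate_limit_hits`, the limit of 100, and the `now` parameter (added only to make the function easy to test) are assumptions, not anything that exists in the repo. The "TTL" here is simply a DELETE of rows older than the window on every check. Teable is not covered because its API is not described in this issue.

```python
import sqlite3
import time
from typing import Optional

WINDOW_SECONDS = 60
MAX_REQUESTS = 100  # hypothetical limit; not specified in the issue


def init_db(conn: sqlite3.Connection) -> None:
    """Create the hit table and an index for per-key lookups."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rate_limit_hits ("
        "api_key TEXT NOT NULL, ts REAL NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_hits_key_ts "
        "ON rate_limit_hits (api_key, ts)"
    )
    conn.commit()


def is_allowed(
    conn: sqlite3.Connection,
    api_key: str,
    limit: int = MAX_REQUESTS,
    window: float = WINDOW_SECONDS,
    now: Optional[float] = None,
) -> bool:
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on an exception
        # TTL: purge every row that has aged out of the window.
        conn.execute(
            "DELETE FROM rate_limit_hits WHERE ts <= ?", (now - window,)
        )
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM rate_limit_hits WHERE api_key = ?",
            (api_key,),
        ).fetchone()
        if count >= limit:
            return False
        conn.execute(
            "INSERT INTO rate_limit_hits (api_key, ts) VALUES (?, ?)",
            (api_key, now),
        )
    return True
```

A caller would open the shared file once (`sqlite3.connect(path)`), call `init_db`, and then call `is_allowed` per request. With several processes writing to the same file, lock contention and the gap between the SELECT and the INSERT would need more care than this sketch takes.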

Option 3: Hybrid approach

  • Use in-memory for performance
  • Sync to Redis/DB periodically
  • Fall back to the DB on startup
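The three bullets above could be sketched roughly as follows. Everything here is illustrative: `SnapshotStore` is a hypothetical interface that might wrap Redis or a DB table, and the class name, the default limit, the sync interval, and the injected `clock` are all assumptions. Between syncs each instance still counts on its own, so the limit is only approximate across instances.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Protocol


class SnapshotStore(Protocol):
    """Hypothetical persistence interface (Redis, DB table, ...)."""

    def load(self) -> Dict[str, List[float]]: ...
    def save(self, snapshot: Dict[str, List[float]]) -> None: ...


class HybridRateLimiter:
    def __init__(
        self,
        store: SnapshotStore,
        limit: int = 100,            # hypothetical limit
        window: float = 60.0,
        sync_interval: float = 5.0,  # hypothetical sync period
        clock: Callable[[], float] = time.time,
    ):
        self._store = store
        self._limit = limit
        self._window = window
        self._sync_interval = sync_interval
        self._clock = clock
        self._requests: Dict[str, Deque[float]] = defaultdict(deque)
        # Fall back to the persisted snapshot on startup.
        for key, stamps in store.load().items():
            self._requests[key] = deque(sorted(stamps))
        self._last_sync = clock()

    def is_allowed(self, api_key: str) -> bool:
        now = self._clock()
        hits = self._requests[api_key]
        # In-memory sliding window, same idea as the current limiter.
        while hits and hits[0] <= now - self._window:
            hits.popleft()
        allowed = len(hits) < self._limit
        if allowed:
            hits.append(now)
        # Periodic sync, piggybacked on request handling.
        if now - self._last_sync >= self._sync_interval:
            self.sync()
        return allowed

    def sync(self) -> None:
        """Write the current non-empty windows to the backing store."""
        snapshot = {k: list(v) for k, v in self._requests.items() if v}
        self._store.save(snapshot)
        self._last_sync = self._clock()
```

Syncing inside `is_allowed` keeps the sketch free of background threads; a real deployment might prefer a timer or a shutdown hook so that the last few seconds of state are not lost on restart.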

Labels

enhancement, scalability, production
