feat: add rate limiting middleware for production stability #46
base: main
Conversation
Pull request overview
This PR adds a rate limiting middleware to protect the Bindu server from abuse and resource exhaustion. The implementation provides three backend options (Memory, Redis, Memcached) with configurable limits and a fail-open design that allows requests to proceed if the rate limiting backend encounters errors.
Key Changes:
- Core rate limiting middleware with pluggable backend support
- Configuration settings for enabling/disabling rate limiting and selecting backends
- Unit tests covering backend implementations and middleware behavior
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| bindu/server/middleware/rate_limit.py | Implements rate limiting middleware with Memory, Redis, and Memcached backend support |
| bindu/settings.py | Adds RateLimitSettings configuration class with backend selection and limits |
| bindu/server/middleware/__init__.py | Exports RateLimitMiddleware for use in applications |
| bindu/server/applications.py | Integrates rate limit middleware into the application middleware stack |
| pyproject.toml | Adds aiomcache dependency for Memcached support |
| uv.lock | Updates lock file with aiomcache package and dependencies |
| tests/unit/server/middleware/test_rate_limit.py | Provides unit tests for backends and middleware functionality |
Comments suppressed due to low confidence (1)
- `bindu/server/middleware/rate_limit.py:104`: except block directly handles `BaseException` (`except:`).
```python
async def increment(self, key: str, window: int) -> int:
    now = int(time.time())
    if key not in self._counts:
        self._counts[key] = []

    # Filter out old requests
    self._counts[key] = [t for t in self._counts[key] if t > now - window]
    self._counts[key].append(now)

    return len(self._counts[key])
```
Copilot AI, Dec 12, 2025
The MemoryBackend is not thread-safe or process-safe. Concurrent requests could lead to race conditions when filtering old timestamps and appending new ones. Consider using threading locks or noting this limitation in the documentation.
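Within a single event loop, one way to address this is to serialize the read-modify-write behind an `asyncio.Lock`. A minimal sketch (the class name is hypothetical, and this still does not help across worker processes):

```python
import asyncio
import time


class LockedMemoryBackend:
    """In-memory sliding-window counter guarded by an asyncio.Lock.

    Illustrative only: safe against concurrent tasks in one event loop,
    but still not shared across processes.
    """

    def __init__(self):
        self._counts: dict[str, list[int]] = {}
        self._lock = asyncio.Lock()

    async def increment(self, key: str, window: int) -> int:
        now = int(time.time())
        async with self._lock:
            # Drop timestamps outside the window, then record this request
            timestamps = [t for t in self._counts.get(key, []) if t > now - window]
            timestamps.append(now)
            self._counts[key] = timestamps
            return len(timestamps)
```

For multi-process deployments the Redis or Memcached backends remain the right choice; the lock only closes the in-process race.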
```python
# Identify client (simple IP based for now, can be extended to API key)
client_ip = request.client.host if request.client else "unknown"
key = f"ratelimit:{client_ip}"
```
Copilot AI, Dec 12, 2025
Using only client IP for rate limiting can be easily bypassed by attackers using proxies or VPNs. Additionally, this could unfairly limit legitimate users behind shared NATs or corporate proxies. Consider adding support for rate limiting based on API keys, authentication tokens, or implementing a more sophisticated key generation strategy that combines multiple factors.
Suggested change:

```diff
-# Identify client (simple IP based for now, can be extended to API key)
-client_ip = request.client.host if request.client else "unknown"
-key = f"ratelimit:{client_ip}"
+# Identify client using API key, auth token, or IP address
+api_key = request.headers.get("X-API-Key")
+auth_header = request.headers.get("Authorization")
+if api_key:
+    identifier = f"apikey:{api_key}"
+elif auth_header:
+    identifier = f"auth:{auth_header}"
+else:
+    client_ip = request.client.host if request.client else "unknown"
+    identifier = f"ip:{client_ip}"
+key = f"ratelimit:{identifier}"
```
```python
try:
    val = await self._client.incr(b_key, 1)
    return val or 1  # Fallback
except:
```
Copilot AI, Dec 12, 2025
Bare except clause will catch all exceptions including SystemExit and KeyboardInterrupt. This should specify the exception type (e.g., 'except Exception:') to avoid catching exceptions that should not be suppressed.
Suggested change:

```diff
-except:
+except Exception:
```
```python
if current_count > self.limit:
    logger.warning(f"Rate limit exceeded for {client_ip}")
    return JSONResponse(
        {"error": "Too Many Requests", "retry_after": self.window},
        status_code=429,
        headers=headers
    )
```
Copilot AI, Dec 12, 2025
The rate limiting check 'current_count > self.limit' runs after the count has already been incremented, so exactly `limit` requests succeed per window, and the rejected request still increments the count. If the intent is to reject the request that reaches the limit, use '>=' instead, or perform the check before incrementing.
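A quick simulation of the increment-then-check pattern (a hypothetical helper, not project code) makes the semantics concrete: `>` admits exactly `limit` requests per window, while `>=` admits `limit - 1`.

```python
def allowed_requests(limit: int, attempts: int, use_gte: bool = False) -> int:
    """Count how many of `attempts` requests succeed in one window.

    Mirrors the middleware's increment-then-check ordering: the counter
    is bumped first, then compared against the limit.
    """
    count = 0
    allowed = 0
    for _ in range(attempts):
        count += 1  # the backend increments before the check
        blocked = (count >= limit) if use_gte else (count > limit)
        if not blocked:
            allowed += 1
    return allowed
```

With `limit=5` and 10 attempts, `>` allows 5 requests and `>=` allows 4, so the choice of operator is really a choice of off-by-one semantics, not a bug fix on its own.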
```python
class MemoryBackend(RateLimitBackend):
    """In-memory rate limit backend (not shared across processes)."""

    def __init__(self):
        self._counts: dict[str, list[int]] = {}

    async def increment(self, key: str, window: int) -> int:
        now = int(time.time())
        if key not in self._counts:
            self._counts[key] = []

        # Filter out old requests
        self._counts[key] = [t for t in self._counts[key] if t > now - window]
        self._counts[key].append(now)

        return len(self._counts[key])


class RedisBackend(RateLimitBackend):
    """Redis-based rate limit backend."""

    def __init__(self, redis_url: str):
        import redis.asyncio as redis
        self._redis = redis.from_url(redis_url, encoding="utf-8", decode_responses=True)

    async def increment(self, key: str, window: int) -> int:
        pipeline = self._redis.pipeline()
        pipeline.incr(key)
        pipeline.expire(key, window)
        result = await pipeline.execute()
        return result[0]


class MemcachedBackend(RateLimitBackend):
    """Memcached-based rate limit backend."""

    def __init__(self, host: str, port: int):
        import aiomcache
        self._client = aiomcache.Client(host, port)

    async def increment(self, key: str, window: int) -> int:
        # Memcached incr doesn't create key if not exists, so we need add/incr logic
        # However, aiomcache implementation of incr is a bit specific
        # We'll use a simple strategy: try incr, if fails (returns None), set to 1

        # Note: aiomcache key must be bytes
        b_key = key.encode('utf-8')

        try:
            val = await self._client.incr(b_key, 1)
        except Exception:
            val = None

        if val is None:
            # Key might not exist, try adding it
            try:
                await self._client.add(b_key, b"1", exptime=window)
                return 1
            except Exception:
                # If add fails, it might have been created concurrently, try incr again
                try:
                    val = await self._client.incr(b_key, 1)
                    return val or 1  # Fallback
                except:
                    return 1

        return val
```
Copilot AI, Dec 12, 2025
Missing documentation for the MemoryBackend, RedisBackend, and MemcachedBackend classes. These public classes should have docstrings explaining their purpose, limitations (e.g., MemoryBackend is not process-safe), and usage examples to help users understand which backend to choose.
```python
response = await middleware.dispatch(request, call_next)
assert response.status_code == 429
assert response.headers["X-RateLimit-Remaining"] == "0"
import json
```
Copilot AI, Dec 12, 2025
The import of 'json' is placed in the middle of test code. It should be moved to the top of the file with other imports following PEP 8 conventions.
```python
pipeline = self._redis.pipeline()
pipeline.incr(key)
pipeline.expire(key, window)
result = await pipeline.execute()
return result[0]
```
Copilot AI, Dec 12, 2025
The RedisBackend.increment method does not handle the race condition where the key might expire between the incr and expire commands. If the key expires naturally between these two operations, the expire command will fail silently. Consider using INCR with EXPIRE in an atomic transaction or using a different approach with GETEX or other atomic operations.
Suggested change:

```diff
-pipeline = self._redis.pipeline()
-pipeline.incr(key)
-pipeline.expire(key, window)
-result = await pipeline.execute()
-return result[0]
+# Use a Lua script to atomically increment and set expiry
+lua_script = """
+local current
+current = redis.call('INCR', KEYS[1])
+if redis.call('TTL', KEYS[1]) == -1 then
+    redis.call('EXPIRE', KEYS[1], ARGV[1])
+end
+return current
+"""
+result = await self._redis.eval(lua_script, 1, key, window)
+return int(result)
```
```python
app_settings.rate_limit.enabled = False
middleware = RateLimitMiddleware(mock_app)

scope = {"type": "http", "client": ("127.0.0.1", 1234)}
```
Copilot AI, Dec 12, 2025
Variable scope is not used.
Suggested change:

```diff
-scope = {"type": "http", "client": ("127.0.0.1", 1234)}
```
```python
import time
from abc import ABC, abstractmethod
from typing import Any
```
Copilot AI, Dec 12, 2025
Import of 'Any' is not used.
Suggested change:

```diff
-from typing import Any
```
```python
from unittest.mock import AsyncMock, MagicMock
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.types import ASGIApp, Receive, Scope, Send
```
Copilot AI, Dec 12, 2025
Import of 'ASGIApp' is not used.
Suggested change:

```diff
-from starlette.types import ASGIApp, Receive, Scope, Send
+from starlette.types import Receive, Scope, Send
```
raahulrahl left a comment
Tushar, great progress, but let's keep it simple first.
PR Description
Title: feat: add rate limiting middleware for production stability
Context
Adding a Rate Limiting Middleware to protect the agent server from abuse, brute-force attacks on auth endpoints, and resource exhaustion.
The API currently lacks visible rate limiting. This PR implements a production-ready middleware that supports:
Memcached (via aiomcache for high performance)
Redis (leveraging existing bindu infrastructure)
In-Memory fallback (for development)
This aligns with the goal of making bindu robust and "production-ready" out of the box.
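The backend selection this implies can be sketched as a small factory. The function name and option keys below are assumptions for illustration; in the PR the backends are wired up inside the middleware itself.

```python
def make_backend(name: str, **opts):
    """Select a rate-limit backend by configured name (sketch).

    Falls back to an in-memory store for any unrecognized name,
    mirroring the "in-memory fallback for development" behaviour.
    The option keys (redis_url, host, port) are assumptions.
    """
    if name == "redis":
        import redis.asyncio as aioredis  # deferred: optional dependency
        return aioredis.from_url(opts["redis_url"])
    if name == "memcached":
        import aiomcache  # deferred: optional dependency
        return aiomcache.Client(opts.get("host", "localhost"),
                                opts.get("port", 11211))
    return {}  # stand-in for the in-memory backend's state dict
```

Deferring the imports keeps `redis` and `aiomcache` optional: a development install that only uses the memory backend never touches them.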
Changes
bindu/server/middleware/rate_limit.py : Core middleware logic with pluggable backends.
bindu/settings.py : Added RateLimitSettings configuration.
bindu/server/applications.py : Integrated middleware at the top of the chain (protected endpoints).
pyproject.toml : Added aiomcache dependency.
tests/unit/server/middleware/test_rate_limit.py : Comprehensive unit tests.
Testing Notes
✅ Unit Tests: Added
tests/unit/server/middleware/test_rate_limit.py
covering Memory, Redis (mock), Memcached (mock) backends and middleware logic.
✅ Manual Verification: Verified X-RateLimit-* headers are returned correctly and 429 status code is triggered when limit is exceeded.
✅ Fail-Open: Verified that backend failures do not crash the request (the error is logged and the request is allowed through).
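The fail-open behaviour boils down to wrapping the backend call, logging on failure, and letting the request through. A minimal sketch under those assumptions (the helper name is hypothetical, not the PR's exact code):

```python
import logging

logger = logging.getLogger("rate_limit")


async def count_or_fail_open(backend_increment, key: str, window: int):
    """Return the current request count, or None if the backend errored.

    None tells the caller to skip rate limiting for this request
    (fail-open), so a Redis or Memcached outage never turns into a 500.
    """
    try:
        return await backend_increment(key, window)
    except Exception:
        logger.exception("rate limit backend failed; allowing request")
        return None
```

The trade-off is deliberate: during a backend outage the server is temporarily unprotected, but availability is preserved, which matches the "logs error and allows request" behaviour verified above.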