Add the ARES proxy #84

Merged
joshgreaves merged 6 commits into main from ares-proxy on Feb 3, 2026

Conversation

Contributor

joshgreaves commented Feb 3, 2026

User description

This PR adds a lightweight reverse proxy. It is part of our larger design to let agents run inside sandboxes while still allowing interventions on every LLM request.


Generated description

Below is a concise technical summary of the changes proposed in this PR:

graph LR
main_("main"):::added
NewBroker_("NewBroker"):::added
handleChatCompletion_("handleChatCompletion"):::added
SubmitRequest_("SubmitRequest"):::added
handlePoll_("handlePoll"):::added
PollRequests_("PollRequests"):::added
handleRespond_("handleRespond"):::added
RespondToRequest_("RespondToRequest"):::added
main_ -- "Creates Broker with configured timeout for coordinating requests." --> NewBroker_
handleChatCompletion_ -- "Enqueues client JSON body, waits for JSON response or timeout." --> SubmitRequest_
handlePoll_ -- "Returns pending requests (id,request,timestamp) and clears queue." --> PollRequests_
handleRespond_ -- "Delivers provided JSON response to request by ID." --> RespondToRequest_
classDef added stroke:#15AA7A
classDef removed stroke:#CD5270
classDef modified stroke:#EDAC4C
linkStyle default stroke:#CBD5E1,font-size:13px

Implements a lightweight reverse proxy that intercepts OpenAI-compatible chat completion requests and manages them through a queue-mediated polling system. This service enables external controllers to intercept and respond to LLM requests from sandboxed agents, facilitating RL-based interventions.
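As a rough illustration of that flow, here is a hypothetical controller-side loop in Go. The JSON field names, the HTTP methods, and the address are assumptions inferred from this summary, not types defined in the PR:

package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// pending mirrors the (id, request, timestamp) tuples the summary says /poll
// returns; the field names are assumptions.
type pending struct {
	ID        string          `json:"id"`
	Request   json.RawMessage `json:"request"`
	Timestamp string          `json:"timestamp"`
}

// pollOnce drains the proxy's queue and answers each intercepted request.
func pollOnce(base string) error {
	resp, err := http.Get(base + "/poll")
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var reqs []pending
	if err := json.NewDecoder(resp.Body).Decode(&reqs); err != nil {
		return err
	}
	for _, p := range reqs {
		// Deliver a canned response by ID; a real controller would inspect
		// p.Request and craft an intervention here.
		body, _ := json.Marshal(map[string]any{
			"id":       p.ID,
			"response": json.RawMessage(`{"choices":[]}`),
		})
		r, err := http.Post(base+"/respond", "application/json", bytes.NewReader(body))
		if err != nil {
			return err
		}
		r.Body.Close()
	}
	return nil
}

func main() {
	_ = pollOnce("http://localhost:8080") // address is an assumption
}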

Topic: Infrastructure
Details: Establishes the supporting infrastructure, including environment-based configuration, a unit-testing suite, and a GitHub Actions CI workflow.
Modified files (7)
  • .github/workflows/go-tests.yml
  • .gitignore
  • ares-proxy/Makefile
  • ares-proxy/broker_test.go
  • ares-proxy/config.go
  • ares-proxy/go.mod
  • ares-proxy/go.sum
Latest Contributors (2)
User | Commit | Date
joshua.greaves@gmail.com | Add-example-1-and-hand... | January 09, 2026
mwilliammyers@gmail.com | Initial-commit | December 17, 2025
Topic: Proxy Core Logic
Details: Develops the core request-response coordination logic and HTTP endpoints for queuing and polling LLM requests.
Modified files (4)
  • ares-proxy/README.md
  • ares-proxy/broker.go
  • ares-proxy/main.go
  • ares-proxy/types.go
Latest Contributors (0)
This pull request is reviewed by Baz.

Comment on lines +19 to +24
// Set up HTTP routes
http.HandleFunc("/v1/chat/completions", handleChatCompletion(broker))
http.HandleFunc("/poll", handlePoll(broker))
http.HandleFunc("/respond", handleRespond(broker))

// Start the server
Contributor

/poll and /respond are registered with no auth middleware, so any network client can enumerate queued requests or inject responses and bypass the controller; can we enforce an API key/mTLS check or otherwise gate these handlers?

Prompt for AI Agents:

In ares-proxy/main.go around lines 19 to 24 where the HTTP routes are registered (the
logic that sets up /v1/chat/completions, /poll, and /respond), the /poll and /respond
handlers are exposed with no authentication. Add an authentication middleware and apply
it to only /poll and /respond: implement a middleware that checks either (a) a
configured API key header (e.g. X-API-Key matched against config.APIKey) or (b) verifies
client TLS certificates when TLS is enabled; return 401 for missing/invalid credentials
and 403 for unauthorized. Wire the middleware by replacing http.HandleFunc("/poll",
handlePoll(broker)) and http.HandleFunc("/respond", handleRespond(broker)) with
http.Handle("/poll", authMiddleware(handlePoll(broker))) and similar for /respond, and
add configuration loading for the API key (or a flag to enable mTLS) so the behavior is
testable; keep /v1/chat/completions unchanged. Ensure responses reveal no sensitive
details in error messages.
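For concreteness, a minimal sketch of the API-key variant this prompt describes; the header name, the hard-coded key, and the wiring are assumptions, not code in this PR:

package main

import (
	"crypto/subtle"
	"net/http"
)

// apiKey stands in for a config.APIKey value loaded at startup.
var apiKey = "change-me"

// authMiddleware returns 401 when credentials are missing and 403 when they
// do not match, without leaking details in the error messages.
func authMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := r.Header.Get("X-API-Key")
		if key == "" {
			http.Error(w, "missing credentials", http.StatusUnauthorized)
			return
		}
		if subtle.ConstantTimeCompare([]byte(key), []byte(apiKey)) != 1 {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Placeholder handler standing in for handlePoll(broker).
	poll := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("[]"))
	})
	http.Handle("/poll", authMiddleware(poll))
	http.ListenAndServe(":8080", nil)
}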

Finding type: Basic Security Patterns

Contributor Author

This is fine; auth is handled at a layer above

Contributor

Got it, thanks for clarifying that auth is handled upstream—I'll remember this assumption for similar findings and save it to memory after the PR is merged.

Comment on lines +41 to +45
// Read the request body
body, err := io.ReadAll(r.Body)
if err != nil {
http.Error(w, fmt.Sprintf("Failed to read request: %v", err), http.StatusBadRequest)
return
Contributor

handleChatCompletion calls io.ReadAll(r.Body) without capping the reader, so an attacker can send an arbitrarily large /v1/chat/completions payload and exhaust server memory; can we wrap r.Body with http.MaxBytesReader or otherwise enforce a max body size?

Suggested change
- // Read the request body
- body, err := io.ReadAll(r.Body)
+ // Read the request body with a size limit to prevent excessive memory usage
+ const maxChatRequestSize = 1 << 20 // 1MB
+ body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, maxChatRequestSize))
  if err != nil {
      http.Error(w, fmt.Sprintf("Failed to read request: %v", err), http.StatusBadRequest)
      return

Finding type: Basic Security Patterns

Contributor Author

Not necessary.

Contributor

Understood, I’ll take this into account and remember it for similar findings after the PR is merged.

Comment on lines +92 to +96
// Parse the request body
var req RespondRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, fmt.Sprintf("Invalid JSON: %v", err), http.StatusBadRequest)
return
Contributor

handleRespond decodes JSON directly from r.Body with no size guard, so an attacker can POST an unbounded body to /respond and force the proxy to allocate arbitrarily large memory; can we enforce a max body size before decoding?

Suggested change
- // Parse the request body
- var req RespondRequest
- if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
-     http.Error(w, fmt.Sprintf("Invalid JSON: %v", err), http.StatusBadRequest)
-     return
+ // Parse the request body with a size limit (e.g., 1MB)
+ const maxRespondBodySize = 1 << 20
+ var req RespondRequest
+ limitedBody := http.MaxBytesReader(w, r.Body, maxRespondBodySize)
+ if err := json.NewDecoder(limitedBody).Decode(&req); err != nil {
+     http.Error(w, fmt.Sprintf("Invalid JSON: %v", err), http.StatusBadRequest)
+     return

Finding type: Basic Security Patterns

Contributor Author

Not necessary.

joshgreaves requested a review from rsmith49 on February 3, 2026 at 22:49
Comment on lines +20 to +22
http.HandleFunc("/v1/chat/completions", handleChatCompletion(broker))
http.HandleFunc("/poll", handlePoll(broker))
http.HandleFunc("/respond", handleRespond(broker))


Critical

[Security] All three HTTP endpoints are exposed without any authentication or authorization checks, meaning any process able to reach the proxy can read queued LLM prompts via /poll or inject arbitrary responses via /respond. That’s a high‑risk data exfiltration and command‑injection vector for the sandboxed agents this proxy is meant to protect. Please gate these handlers behind at least a shared secret/API key that’s checked before servicing the request (e.g., compare an X-ARES-Token header to a secret loaded from config) so only the trusted controller can poll/respond.

File: ares-proxy/main.go
Line: 22

💡 Applied from Instructions File: CLAUDE.md

Contributor Author

This is fine; auth is handled at a layer above

Comment on lines +41 to +45
// Read the request body
body, err := io.ReadAll(r.Body)
if err != nil {
http.Error(w, fmt.Sprintf("Failed to read request: %v", err), http.StatusBadRequest)
return


Important

[Security] Both handleChatCompletion (lines 41–47) and handleRespond (lines 92–97) read request bodies without any size limit. An attacker can send an arbitrarily large payload and force the proxy to allocate unbounded memory (DoS). Before calling io.ReadAll/json.NewDecoder, wrap r.Body with http.MaxBytesReader (pick a sane upper bound for chat payloads, e.g., a few MB) so incoming bodies are rejected once they exceed the limit. This keeps the proxy responsive under untrusted network conditions.

File: ares-proxy/main.go
Line: 45

💡 Applied from Instructions File: CLAUDE.md

Contributor Author

Not necessary.

Comment on lines +88 to +101
// PollRequests returns all pending requests and clears the queue
// This is called by the /poll endpoint
func (b *Broker) PollRequests() []PendingRequest {
b.mu.Lock()
defer b.mu.Unlock()

// Return a copy of the queue
requests := make([]PendingRequest, len(b.requestQueue))
copy(requests, b.requestQueue)

// Clear the queue
b.requestQueue = make([]PendingRequest, 0)

return requests
Contributor

PollRequests clears requestQueue each call and there is no path to re-enqueue pending entries, so any request that is polled but never /responded (controller crash) is permanently hidden from other controllers while its client waits for timeout; can we keep/requeue unresponded requests so they can be picked up again?

Prompt for AI Agents:

In ares-proxy/broker.go around lines 88 to 101, the PollRequests method clears
b.requestQueue on each call which permanently hides requests that were polled but never
responded to. Change PollRequests to implement a leasing pattern: instead of clearing
the queue, move polled requests into a new in-flight map (e.g.,
map[string]InFlight{request, leaseExpiry}) while returning the copy to the caller. Also
add logic to re-enqueue any in-flight entries whose leaseExpiry has passed (either in
PollRequests before returning or via a small background goroutine), and protect the new
in-flight map with the existing mutex. This ensures requests that were polled but never
responded become available to other controllers after the lease expires.
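Sketched out, the leasing variant the prompt describes could look like the following; the InFlight type, the inFlight map on Broker, and the ID field on PendingRequest are illustrative assumptions layered on the PR's types:

// InFlight tracks a polled-but-unanswered request and when its lease lapses.
type InFlight struct {
	Req         PendingRequest
	LeaseExpiry time.Time
}

func (b *Broker) PollRequestsWithLease(lease time.Duration) []PendingRequest {
	b.mu.Lock()
	defer b.mu.Unlock()

	now := time.Now()
	// Re-enqueue in-flight entries whose lease has expired.
	for id, inf := range b.inFlight {
		if now.After(inf.LeaseExpiry) {
			b.requestQueue = append(b.requestQueue, inf.Req)
			delete(b.inFlight, id)
		}
	}

	// Hand out the queue, but remember each entry until it is answered.
	requests := make([]PendingRequest, len(b.requestQueue))
	copy(requests, b.requestQueue)
	for _, req := range requests {
		b.inFlight[req.ID] = InFlight{Req: req, LeaseExpiry: now.Add(lease)}
	}
	b.requestQueue = b.requestQueue[:0]

	return requests
}

// RespondToRequest would additionally delete the answered entry from b.inFlight.

(As the author notes below, the PR deliberately keeps the simpler exactly-once behavior instead.)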

Finding type: Logical Bugs

Contributor Author

Let's not worry about this. We will do exactly-once delivery. If the client fails to respond it will lead to shut down of the sandbox from the ARES side anyway.

Contributor

Got it, I’ll treat this behavior as intentional given the exactly-once delivery and sandbox shutdown semantics, and I’ll remember this after the PR is merged.

Contributor
rsmith49 left a comment

Read through the README + live discussion and overall looks good! Should probably wait on a full look-through from @mwilliammyers to make sure nothing obviously bad is happening though (especially since I'm also not sure how good LLMs are at golang in general)

@@ -0,0 +1,224 @@
# ares-proxy

A lightweight HTTP proxy server that intercepts OpenAI-compatible chat completion requests and routes them through a queue-mediated polling system. Designed for use with the ARES (Agentic Research and Evaluation Suite) framework to enable RL-based control of LLM interactions.
Contributor

OpenAI-compatible

We should specify that this proxy server captures requests in any format

Contributor Author

Done.

The core coordination engine that manages:
- **Request queue**: Holds pending LLM requests
- **Response channels**: Maps request IDs to response delivery channels
- **Timeout handling**: Cleans up stale requests after a configurable timeout
Contributor

What does this stale request cleanup look like from the Code Agent's perspective?

Contributor Author

It just removes them; for now I don't have a clear idea of what this should look like. I'm hoping as I start integrating things this will be more clear.
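For orientation, the three broker responsibilities quoted above map roughly onto a struct like this fragment; mu and requestQueue appear in broker.go, while the remaining field names and types are assumptions:

type Broker struct {
	mu           sync.Mutex
	requestQueue []PendingRequest                // pending LLM requests awaiting /poll
	responses    map[string]chan json.RawMessage // request ID -> response delivery channel
	timeout      time.Duration                   // stale requests are removed after this elapses
}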


#### HTTP Endpoints (`main.go`)

1. **`POST /v1/chat/completions`**
Contributor

Do we need messages and responses as well?

Contributor Author

Yes, eventually. I'm going to start with the simplest.

1. **`POST /v1/chat/completions`**
- OpenAI-compatible endpoint for LLM clients
- Accepts standard chat completion requests
- Blocks until a response is available or timeout occurs
Contributor

  • Where is this timeout configured from the user perspective?
  • Does it align with the timeout from the Code Agent?
  • Can this be adjusted at runtime based on average model response time? (not now, but potentially in future)

Contributor Author

At the moment it's set to be longer than the timeout from the ARES agent. I'll revisit this later as I come to integrating it into ARES.

Contributor Author

Currently it can be set inline: TIMEOUT_MINUTES=30 ./ares-proxy
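(For reference, a minimal sketch of how a TIMEOUT_MINUTES override like that is typically read; the default value and the function name are assumptions, and the PR's actual logic lives in ares-proxy/config.go.)

package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// loadTimeout reads TIMEOUT_MINUTES from the environment, falling back to a
// default when unset or invalid.
func loadTimeout() time.Duration {
	minutes := 30 // assumed default
	if v := os.Getenv("TIMEOUT_MINUTES"); v != "" {
		if n, err := strconv.Atoi(v); err == nil && n > 0 {
			minutes = n
		}
	}
	return time.Duration(minutes) * time.Minute
}

func main() {
	fmt.Println("request timeout:", loadTimeout())
}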

joshgreaves merged commit 519e893 into main on Feb 3, 2026
4 of 5 checks passed
joshgreaves deleted the ares-proxy branch on February 3, 2026 at 23:18