A high-performance proxy for the OpenAI API that provides preemptive prioritization for requests.
- Priority Queuing: Assign different priorities to different endpoints
- Preemptive Scheduling: Higher priority requests preempt lower priority ones
- Transparent Retry: Preempted requests are automatically retried
- Metrics Collection: Detailed request metrics sent to InfluxDB
- Multiple Ports: Each port represents a different priority level
The proxy is configured via a `config.json` file:
```json
{
  "influxdb_url": "http://localhost:8086",
  "influx_token": "your-influx-token",
  "influx_org": "openaiorg",
  "influx_bucket": "proxybucket",
  "openai_api_url": "https://api.openai.com/v1",
  "openai_api_key": "your-openai-api-key",
  "endpoints": [
    {
      "port": 8080,
      "priority": 1,
      "preemptive": true
    },
    {
      "port": 8081,
      "priority": 2,
      "preemptive": true
    },
    {
      "port": 8082,
      "priority": 3,
      "preemptive": false
    }
  ]
}
```

- `influxdb_url`: URL of your InfluxDB instance
- `influx_token`: Authentication token for InfluxDB
- `influx_org`: Organization name in InfluxDB
- `influx_bucket`: Bucket name for metrics in InfluxDB
- `openai_api_url`: Base URL of the OpenAI API
- `openai_api_key`: Your OpenAI API key
- `endpoints`: Array of endpoint configurations:
  - `port`: Port to listen on for this endpoint (each port represents a different priority)
  - `priority`: Priority level (lower number = higher priority)
  - `preemptive`: Whether requests on this port can preempt lower priority ones
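For reference, the sketch below shows one way this file could be decoded in Go. The struct and field names here are illustrative assumptions, not necessarily the types used in `cmd/main.go`:

```go
package config

import (
	"encoding/json"
	"fmt"
	"os"
)

// EndpointConfig mirrors one entry in the "endpoints" array.
type EndpointConfig struct {
	Port       int  `json:"port"`
	Priority   int  `json:"priority"`
	Preemptive bool `json:"preemptive"`
}

// Config mirrors the top-level fields of config.json.
type Config struct {
	InfluxDBURL  string           `json:"influxdb_url"`
	InfluxToken  string           `json:"influx_token"`
	InfluxOrg    string           `json:"influx_org"`
	InfluxBucket string           `json:"influx_bucket"`
	OpenAIAPIURL string           `json:"openai_api_url"`
	OpenAIAPIKey string           `json:"openai_api_key"`
	Endpoints    []EndpointConfig `json:"endpoints"`
}

// Load reads and decodes config.json from the given path.
func Load(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config: %w", err)
	}
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	return &cfg, nil
}
```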
To use the proxy:

- Configure your `config.json` file
- Run the proxy: `go run cmd/main.go`
- Send OpenAI API requests to the configured ports (each port serves all OpenAI API endpoints)
Example requests:

```bash
# High priority request
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'

# Medium priority request
curl -X POST http://localhost:8081/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "text-davinci-003", "prompt": "Hello"}'
```
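The same call can be made programmatically. This Go sketch sends the high-priority request above using only the standard library; it assumes nothing beyond what the curl examples show:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	// Port 8080 is the high-priority endpoint in the example config.
	body := []byte(`{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}`)
	req, err := http.NewRequest("POST", "http://localhost:8080/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.StatusCode, string(out))
}
```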
The proxy collects and sends the following metrics to InfluxDB:
- Model being requested
- Input token count (estimated)
- Processing time
- Number of retries due to preemption
- API endpoint path
- Queue priority level
- Whether the request was preempted
- HTTP status code of the response
- Tools requested in the API call (if any)
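As an illustration of what such a write could look like with the official InfluxDB Go client (`github.com/influxdata/influxdb-client-go/v2`): the measurement name, tag names, and field names below are assumed for the example and are not taken from the proxy's source; the connection details come from the example `config.json` above.

```go
package main

import (
	"context"
	"time"

	influxdb2 "github.com/influxdata/influxdb-client-go/v2"
)

func main() {
	// Connection details come from config.json in the real proxy.
	client := influxdb2.NewClient("http://localhost:8086", "your-influx-token")
	defer client.Close()
	writeAPI := client.WriteAPIBlocking("openaiorg", "proxybucket")

	// One point per proxied request. "openai_request" and the names
	// below are assumptions for this sketch, not the proxy's schema.
	point := influxdb2.NewPoint(
		"openai_request",
		map[string]string{ // tags: low-cardinality dimensions
			"model":    "gpt-4",
			"endpoint": "/v1/chat/completions",
		},
		map[string]interface{}{ // fields: per-request values (sample data)
			"priority":     1,
			"status_code":  200,
			"retries":      0,
			"input_tokens": 9,
			"preempted":    false,
			"duration_ms":  842,
		},
		time.Now(),
	)
	if err := writeAPI.WritePoint(context.Background(), point); err != nil {
		panic(err)
	}
}
```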
Requirements:

- Go 1.24 or later
- InfluxDB (for metrics collection)
To build a standalone binary:

```bash
go build -o openai-proxy cmd/main.go
```
To run the tests:

```bash
go test ./... -v
```
The proxy is built around a port-based priority queue system:
- Each port has its own priority queue
- Higher priority queues (lower priority numbers) are always processed before lower priority ones
- All ports serve the full OpenAI API (chat completions, completions, embeddings, etc.)
- Preemptive queues can interrupt processing of lower priority requests
- Interrupted requests are automatically requeued and retried transparently
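To illustrate the idea (this is not the proxy's actual implementation), the sketch below shows a minimal preemptive scheduler built on Go's `container/heap`. It collapses the per-port queues into one shared heap keyed by priority, which is equivalent for scheduling purposes: requests with lower priority numbers pop first, and a newly submitted higher-priority request from a preemptive endpoint cancels the one in flight and requeues it.

```go
package scheduler

import (
	"container/heap"
	"context"
	"sync"
)

// queuedRequest is one proxied request waiting to be forwarded upstream.
type queuedRequest struct {
	priority int                // lower number = higher priority
	cancel   context.CancelFunc // cancelling in-flight work lets it be requeued
	index    int                // maintained by the heap
}

// requestHeap pops the request with the lowest priority number first.
type requestHeap []*queuedRequest

func (h requestHeap) Len() int           { return len(h) }
func (h requestHeap) Less(i, j int) bool { return h[i].priority < h[j].priority }
func (h requestHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
	h[i].index, h[j].index = i, j
}
func (h *requestHeap) Push(x any) {
	r := x.(*queuedRequest)
	r.index = len(*h)
	*h = append(*h, r)
}
func (h *requestHeap) Pop() any {
	old := *h
	n := len(old)
	r := old[n-1]
	*h = old[:n-1]
	return r
}

// Scheduler holds the shared queue plus the request currently in flight.
type Scheduler struct {
	mu      sync.Mutex
	pending requestHeap
	current *queuedRequest
}

// Submit enqueues a request. If the submitting endpoint is preemptive and
// the new request outranks the one in flight, the in-flight request is
// cancelled and requeued so it can be retried transparently.
func (s *Scheduler) Submit(r *queuedRequest, preemptive bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	heap.Push(&s.pending, r)
	if preemptive && s.current != nil && r.priority < s.current.priority {
		s.current.cancel()               // interrupt the lower priority request
		heap.Push(&s.pending, s.current) // requeue it for a transparent retry
		s.current = nil
	}
}

// Next pops the highest priority pending request, or nil if the queue is empty.
func (s *Scheduler) Next() *queuedRequest {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.pending.Len() == 0 {
		return nil
	}
	s.current = heap.Pop(&s.pending).(*queuedRequest)
	return s.current
}
```

Context cancellation lets the worker streaming the upstream response abort cleanly; each requeue would then increment the retry count that feeds the "number of retries due to preemption" metric above.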