# Pokémon Red, fully controlled by an LLM 🤖🎮
This repo contains a minimal, hackable agent that teaches large language models to play Pokémon Red inside the PyBoy Game Boy emulator.

Forked from the excellent portalcorp/ClaudePlaysPokemon and extended with the OpenAI Responses API, so it can run the o3 and o4-mini models alongside Anthropic Claude. Anthropic remains the default provider (see the `--provider` flag below).

Project by Lander Media / Steve Moraco. Initial agent code by o4-mini.
- Declarative function-calling interface – the model calls the tools `press_buttons` and `navigate_to` (a path-finding helper, enabled by default).
- Screenshot-based gameplay – what the model “sees” is precisely what is on the screen, delivered as a PNG each step (hex-encoded over WebSocket).
- FastAPI + WebSockets live UI – watch the game, pause, resume, load save states, and inspect the model’s thoughts in real time at http://localhost:<port>.
- Automatic log folder per run (frames, model messages, structured game log).
- Context summarisation to keep the conversation within token limits.
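The two tools are exposed to the model as ordinary function-calling declarations. A minimal sketch of what such schemas could look like — the tool names come from the list above, but the exact field layout and button vocabulary used in this repo are assumptions:

```python
# Hedged sketch: JSON-schema-style declarations for the two tools named
# above. The field layout follows the common function-calling convention;
# the repo's actual definitions may differ.
TOOLS = [
    {
        "name": "press_buttons",
        "description": "Press a sequence of Game Boy buttons.",
        "input_schema": {
            "type": "object",
            "properties": {
                "buttons": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        # Assumed button vocabulary for illustration only.
                        "enum": ["a", "b", "start", "select",
                                 "up", "down", "left", "right"],
                    },
                    "description": "Buttons to press, in order.",
                }
            },
            "required": ["buttons"],
        },
    },
    {
        "name": "navigate_to",
        "description": "Walk to a target tile using the path-finding helper.",
        "input_schema": {
            "type": "object",
            "properties": {
                "row": {"type": "integer"},
                "col": {"type": "integer"},
            },
            "required": ["row", "col"],
        },
    },
]
```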
- Clone this repository:

  ```
  git clone <repo-url>
  cd <repo-directory>
  ```

- Install Python dependencies (Python ≥3.10 recommended):

  ```
  pip install -r requirements.txt
  ```

- Provide an API key for your preferred provider:

  - Anthropic (default):

    ```
    export ANTHROPIC_API_KEY="sk-ant-…"
    ```

  - OpenAI (when running with `--provider openai`):

    ```
    export OPENAI_API_KEY="sk-openai-…"
    ```

- Place a Pokémon Red ROM (`pokemon.gb`) in the project root (or point to it with `--rom`).
The entry point is `main.py`. It both spins up a FastAPI server and starts the agent. All interaction happens through the web UI – no separate headless mode is needed.
```
# Quick start – Anthropic Sonnet playing 1 000 000 steps (~10 weeks), UI on port 3000
python main.py --rom pokemon.gb --steps 1000000

# Use OpenAI o4-mini instead
python main.py --provider openai --model o4-mini
```

Key flags:

- `--rom <file.gb>` – path to the Pokémon Red ROM (default: `pokemon.gb`)
- `--steps <N>` – maximum steps to execute (the agent can be paused / resumed); default `1_000_000` (~10 weeks)
- `--port <N>` – port for the FastAPI server / web UI (default: 3000)
- `--save-state <file.state>` – load a PyBoy save state at startup
- `--overlay` – draw a walkable-tile overlay inside the game feed
- `--provider anthropic|openai` – choose the LLM backend (default: `anthropic`)
- `--model <name>` – override the default model for the chosen provider
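The flag list above maps naturally onto a standard `argparse` setup. A sketch, with flag names and defaults taken from the list and everything else illustrative — the real `main.py` may be organised differently:

```python
import argparse

# Hedged sketch of the CLI described above; flag names and defaults are
# taken from the docs, the rest is illustrative.
def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="LLM plays Pokémon Red")
    p.add_argument("--rom", default="pokemon.gb",
                   help="path to the Pokémon Red ROM")
    p.add_argument("--steps", type=int, default=1_000_000,
                   help="maximum steps to execute")
    p.add_argument("--port", type=int, default=3000,
                   help="port for the FastAPI server / web UI")
    p.add_argument("--save-state",
                   help="PyBoy save state to load at startup")
    p.add_argument("--overlay", action="store_true",
                   help="draw a walkable-tile overlay in the game feed")
    p.add_argument("--provider", choices=["anthropic", "openai"],
                   default="anthropic", help="LLM backend")
    p.add_argument("--model",
                   help="override the provider's default model")
    return p

# Mirrors the second quick-start command above.
args = build_parser().parse_args(["--provider", "openai", "--model", "o4-mini"])
```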
Open http://localhost:<port> in a browser to see:
- Game Screen – live 30 FPS video
- Assistant Messages – the model’s tool calls & high‑level reasoning
- Context History – compressed conversation so far
- Controls – Run, Pause, Stop, Load Save State
Each run writes to `logs/run_<timestamp>/`:

- `frames/` – PNG screenshots per step
- `claude_messages.log` – model response logs
- `game.log` – emulator and agent logs
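Since run folders are timestamped, the newest one is easy to locate programmatically; a sketch that assumes only the `logs/run_<timestamp>/` layout described above:

```python
from pathlib import Path

def latest_run(logs_dir="logs"):
    """Return the newest logs/run_<timestamp> folder, or None if none exist."""
    # Timestamped names sort lexically, so the last entry is the newest run.
    runs = sorted(Path(logs_dir).glob("run_*"))
    return runs[-1] if runs else None
```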
Inside each run folder you will also find `history_saves/`, containing periodic PyBoy `.state` snapshots. These are written automatically:
- Whenever the agent summarises the running conversation (~every 50 steps).
- Immediately after the player transitions between major areas (e.g. moves to another floor or map).
You can resume from any snapshot by either:

- supplying `--save-state <file>` on the command line, or
- clicking Load Save in the web UI and selecting a `.state` file.
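To pick a snapshot to resume from, you can list a run's `.state` files oldest-to-newest; a sketch assuming the `history_saves/` layout above:

```python
from pathlib import Path

def list_snapshots(run_dir):
    """Return .state snapshots under <run_dir>/history_saves, oldest first."""
    saves = Path(run_dir) / "history_saves"
    # Sort by modification time so the last entry is the most recent snapshot.
    return sorted(saves.glob("*.state"), key=lambda p: p.stat().st_mtime)
```

Pass the chosen path to `--save-state` when relaunching `main.py`.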
Global defaults live in `config.py`:

- `MODEL_NAME` – default Anthropic model (the CLI `--model` flag overrides it)
- `TEMPERATURE` – sampling temperature passed to the LLM
- `MAX_TOKENS` – hard limit for the response size
- `USE_NAVIGATOR` – toggle the higher-level `navigate_to` tool (default: `True`)
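Concretely, `config.py` is presumably just a handful of module-level constants. A sketch where only the four names come from the list above — every value shown is an assumed placeholder, not the repo's actual setting:

```python
# Illustrative sketch of config.py – names from the docs, values assumed.
MODEL_NAME = "claude-3-7-sonnet-latest"  # default Anthropic model (assumed value)
TEMPERATURE = 1.0                        # sampling temperature (assumed value)
MAX_TOKENS = 4096                        # response-size cap (assumed value)
USE_NAVIGATOR = True                     # enable navigate_to (default per docs)
```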
PRs welcome! Please open issues or pull requests 😊