3D Go Variant – AlphaZero + Heuristic AIs

[Screenshot: Neural Fish Tank]

A 3D generalization of the classic game of Go, featuring an interactive pygame-based 3D board visualization and multiple AI opponents.

Status: Experimental / research project – great for playing, visualizing, and tinkering with game AI.


Features

  • 3D Board: Play Go on a 3D grid (default 5×5×5)
  • Interactive 3D View: Rotate, zoom, and navigate through layers with perspective projection
  • Territory Visualization: See which areas are controlled by each player
  • Multiple AI Types: Play against heuristic AI, neural AI (REINFORCE), or AlphaZero-style AI (MCTS + ResNet)
  • Real-time Scoring: Track stones and territory control
  • AI vs AI Mode: Watch two AIs play against each other step-by-step

Start Screen & Modes

When you launch python main.py, a start screen lets you configure:

  • Human vs AI: Play against a computer opponent
  • Human vs Friend: Two-player local game
  • AI vs AI: Watch two AIs play (step through moves with Space)
  • Who plays Black (Black moves first)
  • AI Type (Heuristic, Neural, or AlphaZero) when playing against the AI
  • Which AI weight file to load (.json for Neural, .pth for AlphaZero, placed in models/)

In AI vs AI mode, you can select different AI types and weight files for each player.

Use ↑/↓ to navigate, ←/→ to change settings, and Enter to begin.

Controls

Camera & Navigation

  • A / D or Left Arrow / Right Arrow: Rotate board horizontally (yaw)
  • W / S or Up Arrow / Down Arrow: Tilt board up/down (pitch)
  • + / - or Mouse Wheel: Zoom in/out
  • Page Up / Page Down or [ / ] or 9 / 0: Change active layer

Gameplay

  • Left Click: Place a stone on the active layer
  • P or Space: Pass turn
  • G: Toggle territory visualization (shows which areas favor black/white with vibrant colors)
  • H: Toggle grid visibility on/off
  • T: Cycle through grid display modes (X-Y planes / Z-Y planes; see Grid Display Modes below)
  • F: Toggle axis helper (shows X/Y/Z axes with numeric labels)
  • J: Toggle stone pillars (depth cues to show stone height)
  • O: Toggle perspective transform (ON: depth-based scaling, OFF: orthographic projection)
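
To make the O toggle concrete: with perspective ON, each point is scaled by its depth; with it OFF, depth is ignored and the projection is orthographic. A minimal sketch of the idea in Python, with illustrative names (this is not the project's actual renderer in view.py):

    import math

    def project(point, yaw, pitch, focal=600.0, perspective=True):
        """Rotate a board point by yaw/pitch, then map it to 2D."""
        x, y, z = point
        # Yaw: rotate around the vertical axis
        x, z = (x * math.cos(yaw) + z * math.sin(yaw),
                -x * math.sin(yaw) + z * math.cos(yaw))
        # Pitch: rotate around the horizontal axis
        y, z = (y * math.cos(pitch) - z * math.sin(pitch),
                y * math.sin(pitch) + z * math.cos(pitch))
        # Perspective: nearer points appear larger; orthographic: flat scale
        scale = focal / (focal + z) if perspective else 1.0
        return x * scale, y * scale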

Other

  • Q or Esc: Quit game

AI Training

Neural AI Training (REINFORCE)

Train a neural AI using REINFORCE against the heuristic baseline:

python neural_training.py --board-size 5 --episodes 100 --save-path models/neural_ai.json

This script:

  • Plays the neural policy against the heuristic AI
  • Logs per-episode rewards, capture differentials, and running win-rate summaries (--log-interval)
  • Uses reward & penalty functions based on win/loss, territory, and captures
  • Updates a simple Elo rating (auto-saved to {save_path}_elo.json; see the sketch after the options below)
  • Runs 100 evaluation games after training completes
  • Saves updated weights that can be selected from the start screen
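
At its core, REINFORCE scales the log-probabilities of the moves actually played by the episode's final shaped reward. A minimal sketch, assuming a PyTorch policy and illustrative names (not the exact code in neural_training.py):

    import torch

    def reinforce_update(optimizer, log_probs, reward):
        """One policy-gradient step: log_probs is a list of scalar tensors
        (log-probabilities of the chosen moves); reward is the episode's
        shaped return from win/loss, territory, and captures."""
        loss = -torch.stack(log_probs).sum() * reward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()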

Weight Naming: You can name your trained models by specifying a custom --save-path. For example:

python neural_training.py --save-path models/my_neural_v1.json --episodes 200

Options:

  • --log-interval: how often to print win-rate/reward summaries (default 10)
  • --load-path: resume from saved weights
  • --elo-path: set custom Elo tracking file (default: derived from save-path)
  • --profile: enable detailed timing per episode
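
For reference, a simple Elo update such as the one tracked in {save_path}_elo.json generally takes this form (a generic formula; the script's exact K-factor and bookkeeping may differ):

    def elo_update(rating_a, rating_b, score_a, k=32):
        """score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss."""
        expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
        return rating_a + k * (score_a - expected_a)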

AlphaZero Training (Self-Play)

Train an AlphaZero-style AI using self-play with MCTS:

python alpha_zero_training.py --num-games 1000 --num-simulations 50 --save-path models/alpha_zero.pth

This implements a full AlphaZero architecture:

  • 3D ResNet Network: Convolutional neural network with residual blocks
    • Input: [batch, 3, 5, 5, 5] (black, white, empty channels)
    • Output: Policy [batch, 126] (125 positions + pass), Value [batch, 1]
  • MCTS with PUCT: Monte Carlo Tree Search using PUCT algorithm
    • Dirichlet noise at root for exploration
    • Configurable simulations per move (default: 50)
    • Temperature schedule (τ=1 early, τ→0 later)
  • Self-Play Training: Games stored as (state, π, z) tuples
    • Replay buffer for experience replay
    • Loss: (z - v)² - πᵀ log(p) + L2 regularization (see the sketch after this list)
  • Curriculum Learning: Starts training against heuristic baseline, then switches to self-play once win rate exceeds threshold (default: 55%)
  • Elo Evaluation: Automatically runs 100 evaluation games against baseline after training, calculates Elo ratings
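
For reference, the PUCT selection score is the standard Q(s,a) + c_puct · P(s,a) · √N(s) / (1 + N(s,a)). The training loss above translates directly into PyTorch; a minimal sketch under the shapes listed, assuming the network emits log-probabilities and the optimizer's weight decay supplies the L2 term (names are illustrative, not the exact code):

    import torch
    import torch.nn.functional as F

    def alphazero_loss(log_policy, value, target_pi, target_z):
        """log_policy: [batch, 126] log-probs; value: [batch, 1];
        target_pi: MCTS visit distribution; target_z: outcome in {-1, +1}."""
        value_loss = F.mse_loss(value.squeeze(-1), target_z)         # (z - v)^2
        policy_loss = -(target_pi * log_policy).sum(dim=1).mean()    # -pi^T log p
        return value_loss + policy_loss   # + L2 via optimizer weight_decay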

Options:

  • --num-games: Number of self-play games (default: 1000)
  • --num-simulations: MCTS simulations per move (default: 50)
  • --num-residual-blocks: ResNet blocks (default: 5)
  • --channels: Network channels (default: 64)
  • --learning-rate: Learning rate (default: 0.001)
  • --batch-size: Training batch size (default: 32)
  • --train-interval: Train every N games (default: 10)
  • --save-path: Path to save weights (can include custom name, e.g., models/my_alphazero_v1.pth)
  • --load-path: Path to load weights from
  • --no-curriculum: Disable curriculum learning (start with self-play immediately)
  • --win-rate-threshold: Win rate threshold to switch to self-play (default: 0.55)
  • --eval-games: Number of games to evaluate before checking win rate (default: 20)
  • --elo-path: Custom Elo file path (default: derived from save-path with _elo.json suffix)
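
For example, to resume training from previously saved weights (using the flags listed above):

python alpha_zero_training.py --load-path models/alpha_zero.pth --save-path models/alpha_zero.pth --num-games 500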

Weight Naming: You can name your trained models by specifying a custom --save-path. For example:

python alpha_zero_training.py --save-path models/alphazero_v2.pth --num-games 500

This creates:

  • models/alphazero_v2.pth (network weights)
  • models/alphazero_v2_elo.json (Elo ratings)

Trained AlphaZero models can be selected from the start screen when choosing "AlphaZero" as the AI type.

Game Rules

  • Board: 3D grid where each cell can be empty (0), black stone (1), or white stone (-1)
  • Adjacency: Stones are connected via 6 orthogonal neighbors (x±1, y±1, z±1)
  • Groups: Connected stones of the same color form a group
  • Liberties: Empty cells adjacent to a group
  • Capture: Groups with no liberties are captured and removed
  • Suicide Rule: You cannot place a stone that would create a group with no liberties unless it captures opponent stones
  • Game End: Game ends when both players pass consecutively
  • Scoring: Based on stones on board plus estimated territory control
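
Groups, liberties, and captures all reduce to a flood fill over the six orthogonal neighbors. A minimal sketch, assuming board[x][y][z] holds 0/1/-1 as above (illustrative only; game_state.py may structure this differently):

    def group_and_liberties(board, start, size):
        """Flood-fill the group containing `start`; return (group, liberties)."""
        color = board[start[0]][start[1]][start[2]]
        stack, group, liberties = [start], set(), set()
        while stack:
            x, y, z = stack.pop()
            if (x, y, z) in group:
                continue
            group.add((x, y, z))
            for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                nx, ny, nz = x + dx, y + dy, z + dz
                if 0 <= nx < size and 0 <= ny < size and 0 <= nz < size:
                    v = board[nx][ny][nz]
                    if v == 0:
                        liberties.add((nx, ny, nz))    # empty neighbor
                    elif v == color:
                        stack.append((nx, ny, nz))     # same-color stone
        return group, liberties

A group whose liberties set comes back empty is captured and removed.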

Territory Visualization

Press G to toggle territory view. When enabled:

  • Bright Blue: Areas where Black has positional advantage
  • Bright Pink: Areas where White has positional advantage
  • Territory is calculated from proximity to stones (closer stones exert more influence); see the sketch below
  • Colors are deliberately vivid so controlled areas stand out at a glance
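
As a rough illustration of distance-based influence (the exact weighting in game_state.py may differ), each empty cell can be scored by summing stone contributions that decay with distance:

    def influence_at(board, cell, size):
        """Positive result favors Black (1), negative favors White (-1)."""
        cx, cy, cz = cell
        score = 0.0
        for x in range(size):
            for y in range(size):
                for z in range(size):
                    stone = board[x][y][z]
                    if stone != 0:
                        # Manhattan distance from the stone to the cell
                        dist = abs(x - cx) + abs(y - cy) + abs(z - cz)
                        score += stone / (1.0 + dist)
        return score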

Grid Display Modes

Press H to toggle grid visibility, and T to cycle through grid modes:

  • Mode 0 - X-Y Planes:
    • Gray grids show complete X-Y grid patterns for all Z layers
    • Green grid highlights the active Z layer's X-Y grid
    • Use Page Up/Down to change active Z layer
  • Mode 1 - Z-Y Planes:
    • Gray grids show complete Z-Y grid patterns for all X positions
    • Green grid highlights the active X position's Z-Y grid
    • Use Page Up/Down to change active X position
    • When switching modes, the active selection automatically changes (Z layer ↔ X position)
  • Axis Helper: Press F to display colored X (red), Y (green), and Z (blue) axes with numeric labels for easier orientation
  • Stone Pillars: Press J to toggle per-stone depth pillars that connect stones to the base layer for height cues

Installation

pip install -r requirements.txt

Running

python main.py

Command Line Options

  • --size N: Set board size (default: 5)
  • --depth N: AI search depth (default: 1)
  • --samples N: AI move samples per ply (default: 60)
  • --ai-color 1|-1: AI color (1=black, -1=white, default: -1)

Example:

python main.py --size 5 --depth 2 --samples 100
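
These flags map onto a standard argparse setup; a minimal sketch of how main.py might declare them (illustrative, not the actual source):

    import argparse

    parser = argparse.ArgumentParser(description="3D Go variant")
    parser.add_argument("--size", type=int, default=5,
                        help="board size N (N x N x N grid)")
    parser.add_argument("--depth", type=int, default=1,
                        help="AI search depth")
    parser.add_argument("--samples", type=int, default=60,
                        help="AI move samples per ply")
    parser.add_argument("--ai-color", type=int, default=-1, choices=(1, -1),
                        help="AI color: 1 = black, -1 = white")
    args = parser.parse_args()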

Project Structure

Core Game

  • game_state.py: Game rules, board state, group/liberty logic, territory estimation, Ko/Superko rules
  • ai.py: Heuristic AI move selection with shallow search
  • view.py: 3D rendering, camera controls, input handling, perspective projection
  • main.py: Main game loop and mode selection
  • start_screen.py: Start menu for game configuration

Neural AI (REINFORCE)

  • neural_ai.py: Policy network using PyTorch (MLP architecture)
  • neural_training.py: Training script using REINFORCE algorithm

AlphaZero AI

  • resnet3d_network.py: 3D ResNet architecture for policy and value estimation
  • mcts.py: Monte Carlo Tree Search with PUCT algorithm
  • alpha_zero_ai.py: AlphaZero AI wrapper combining ResNet3D and MCTS
  • alpha_zero_training.py: Self-play training loop with replay buffer, curriculum learning, and Elo evaluation

Development Notes

  • Uses simple 3D projection (yaw/pitch camera) for rendering with optional perspective transform
  • Multiple AI architectures: Heuristic (rule-based), Neural (REINFORCE), and AlphaZero (MCTS + ResNet)
  • Territory estimation uses distance-based influence calculation
  • Ko and Superko rules implemented to prevent infinite game cycles
  • Optimized for modest hardware with configurable board sizes
  • GPU acceleration available for neural network training (auto-detects CUDA)
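
The CUDA auto-detection is the usual PyTorch idiom; a one-line sketch:

    import torch

    # Use the GPU when available, otherwise fall back to the CPU;
    # the training scripts then move their networks to this device.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")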
