Input Challenge: Programming and Machine Learning

Credit: RL section is adapted from tetris-ai.

Project Overview

  • Train a bot to play Tetris using deep reinforcement learning (RL) or imitation learning (IL).
  • Play Tetris using trained AI or as a human.
  • Collect demonstration data from both human and RL agents to improve imitation learning.

Project Structure

./
├── assets/         # Images, gifs, diagrams
├── models/         # Saved Keras models
├── data/           # Collected demonstration data (human and RL)
├── src/            # Scripts for training and inference
│   ├── run.py              # Main training script for DQN agent
│   ├── run_model.py        # Script to run inference, collect RL demos, or test imitation policy
│   ├── behav_clone.py      # Train a policy network from demonstration data
│   ├── play_human.py       # Play Tetris as a human and collect data
│   ├── play_human_vs_ai.py # Play Tetris: human vs AI
│   ├── tetris.py           # Tetris game logic (used by scripts)
│   ├── dqn_agent.py        # DQN agent implementation
│   └── logs.py             # Custom logging utilities
├── tetris-ai/      # Original code from tetris-ai
├── logs/           # Training and evaluation logs
├── environment.yml # Conda environment file
├── requirements.txt
├── LICENSE
└── README.md

Setup

If you are using a Windows or Linux device, please follow this link to set up the virtual environment.

If you are using an M1 MacBook, you may find the information in this repo useful for setting up your virtual environment.

Afterwards, follow the steps below to set up the environment:

  1. Clone the repository
  2. Create the conda environment:
    conda env create -f environment.yml
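  3. Activate the environment before running any scripts. The name tetris below is an assumption; use whatever appears in the name: field of environment.yml:
    conda activate tetris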

Demo

First 10000 points, after some training.

[Demo GIF: first 10000 points]

Usage

1. Imitation Learning

We will use imitation learning to train an AI player. This AI player uses behavior cloning to mimic demonstrations from human players (you and your friends). To train it, follow these steps (a behavior-cloning sketch appears after the steps):

  1. Enter human play mode and use your controller to demonstrate your strategies for playing Tetris. This collects a set of demonstrations as the training dataset.
    python src/play_human.py
  • Each session is saved as a separate file in the data/ directory (e.g., human_demo_YYYYMMDD_HHMMSS.npy).
  • Play multiple games to collect more data.
  • To make your controller compatible with the Tetris controls in play_human.py, re-program its buttons by adding the following line to the controller firmware:
buttons.append(setup_button(board.GP5, Keycode.DOWN_ARROW))

The down arrow speeds up the drop of the falling block.
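To sanity-check a collected session, you can load a demo file with NumPy. This is a minimal sketch; the filename is only an example, and it assumes each .npy file stores an array of (state, action) pairs in the format described under State and Action Format below:

import numpy as np

# Load one recorded session (allow_pickle is needed for object arrays of pairs).
demo = np.load("data/human_demo_20240101_120000.npy", allow_pickle=True)
print(f"{len(demo)} recorded steps")
state, action = demo[0]
print("state:", state, "action:", action)  # e.g. [lines, holes, bumpiness, height], 2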

  2. Train an imitation policy from the human demonstrations.
python src/behav_clone.py
  • This script loads all human_demo_*.npy files from data/ and trains a policy network.
  • The trained policy is saved to models/policy_bc.keras.
  3. Test the trained policy.
python src/run_model.py models/policy_bc.keras

The script auto-detects the model type and runs in the appropriate mode, letting you visualize the performance of your trained AI player.
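For reference, the core of behavior cloning looks roughly like the sketch below. This is an illustration, not the exact contents of behav_clone.py; the network size and training settings are assumptions:

import glob
import numpy as np
import tensorflow as tf

# Gather all human demonstration sessions into one training set.
states, actions = [], []
for path in glob.glob("data/human_demo_*.npy"):
    for state, action in np.load(path, allow_pickle=True):
        states.append(state)
        actions.append(action)
X = np.array(states, dtype=np.float32)   # shape (N, 4): the board features
y = np.array(actions, dtype=np.int64)    # integer actions 0-3

# Small MLP mapping the 4-feature state to a distribution over the 4 actions.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=20, batch_size=64, validation_split=0.1)
model.save("models/policy_bc.keras")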

2. Reinforcement Learning

At first, the agent plays random moves, saving the states and the rewards received in a limited queue (the replay memory). At the end of each episode (game), the agent trains itself (using a neural network) on a random sample of the replay memory. As more games are played, the agent becomes smarter and achieves higher scores.

Since an RL agent tends to stick with a good 'path' once it discovers one, an exploration variable (which decreases over time) is also used, so that the agent sometimes picks a random action instead of the one it considers best. This way, it can discover new 'paths' that lead to higher scores.
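In rough outline, the loop looks like the sketch below. This illustrates the replay-memory and epsilon-greedy ideas only; it is not the exact code in dqn_agent.py, and the names and hyperparameters are assumptions:

import random
from collections import deque

import numpy as np

memory = deque(maxlen=20000)   # replay memory of (state, action, reward, next_state, done)
epsilon, epsilon_min, epsilon_decay = 1.0, 0.05, 0.995
gamma = 0.95                   # discount factor for future rewards

def choose_action(model, state, n_actions=4):
    # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    q_values = model.predict(state[np.newaxis, :], verbose=0)[0]
    return int(np.argmax(q_values))

def replay(model, batch_size=512):
    # At the end of each episode, train on a random sample of the replay memory.
    if len(memory) < batch_size:
        return
    batch = random.sample(memory, batch_size)
    states = np.array([t[0] for t in batch])
    next_states = np.array([t[3] for t in batch])
    targets = model.predict(states, verbose=0)
    next_q = model.predict(next_states, verbose=0).max(axis=1)
    for i, (_, action, reward, _, done) in enumerate(batch):
        # Q-learning target: immediate reward plus discounted best future value.
        targets[i][action] = reward if done else reward + gamma * next_q[i]
    model.fit(states, targets, epochs=1, verbose=0)

# After each episode, decay exploration:
# epsilon = max(epsilon_min, epsilon * epsilon_decay)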

State and Action Format

  • State: [lines_cleared, holes, total_bumpiness, sum_height] (4 features from the board)
  • Action: Integer encoding:
    • 0 = left
    • 1 = right
    • 2 = down
    • 3 = rotate
  • RL data is automatically converted to this format for imitation learning.
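To make the four features concrete, here is one way they can be computed from a board grid (0 = empty, 1 = filled). This is an illustrative sketch; tetris.py may compute them differently:

import numpy as np

def board_features(board, lines_cleared=0):
    # board: 2D array of 0/1 with shape (rows, cols); row 0 is the top.
    # lines_cleared comes from the game step itself, not the static board.
    rows, _ = board.shape
    # Column height = distance from the topmost filled cell down to the floor.
    heights = np.where(board.any(axis=0), rows - board.argmax(axis=0), 0)
    # Holes: empty cells with at least one filled cell somewhere above them.
    holes = sum(int(np.sum(board[rows - h:, c] == 0)) for c, h in enumerate(heights))
    # Bumpiness: total absolute height difference between adjacent columns.
    bumpiness = int(np.sum(np.abs(np.diff(heights))))
    return [lines_cleared, holes, bumpiness, int(heights.sum())]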

Training

Training is based on the Q-learning algorithm for RL and on supervised learning for imitation. To train the RL agent, run

python src/run.py

(Hyperparameters can be changed in src/run.py.)

Once training completes, you can test the performance of the RL agent with

python src/run_model.py models/best.keras
  • In RL mode, the script saves RL agent demonstrations in data/ (e.g., rl_demo_YYYYMMDD_HHMMSS.npy). The data generated by the RL agent can then be used for imitation learning (see the Imitation Learning section).
  • You can interrupt with Ctrl+C to save partial data.

Environment

Main dependencies (see environment.yml for full list):

  • python 3.10
  • tensorflow
  • keras
  • numpy
  • tqdm
  • matplotlib
  • scikit-learn
  • opencv-python

Tips and Troubleshooting

  • Data Quality: The more diverse and skillful your demonstrations, the better your imitation policy will be.
  • Ctrl+C: You can safely interrupt data collection or RL runs with Ctrl+C; data will be saved.
  • Action Format: All actions are converted to integers for training, even if RL data originally used tuples.
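The interrupt-safe saving works along these lines (a sketch only; the function names here are hypothetical, not the actual API in src/):

import numpy as np

def run_and_save(demo_path, play_episode):
    transitions = []
    try:
        while True:
            transitions.extend(play_episode())  # one game's (state, action) pairs
    except KeyboardInterrupt:
        pass  # fall through and save whatever was collected so far
    finally:
        np.save(demo_path, np.array(transitions, dtype=object))
        print(f"Saved {len(transitions)} steps to {demo_path}")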

Useful Links

Deep Q Learning

Tetris
