This repository provides an implementation of learning-based Monte-Carlo Tree Search variants in the Pommerman environment. Our approaches leverage opponent models (planning agents) to transform the multiplayer game into single- and two-player games depending on the provided settings.
The simplest way to get started and execute runs is to build a docker image and run it as a container.
Available backends:
- TensorRT (NVIDIA GPU required): Tested with TensorRT 8.0.1 and PyTorch 1.9.0.
To use NVIDIA GPUs in docker containers, you have to install docker and nvidia-docker2. Have a look at the installation guide https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html.
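Once installed, you can verify that containers can access your GPUs, for example with a throwaway CUDA base image (the exact image tag below is only an example; any available `nvidia/cuda` base image works):

```
# Optional sanity check that GPUs are reachable from inside a container.
$ docker run --rm --gpus all nvidia/cuda:11.1.1-base-ubuntu20.04 nvidia-smi
```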
We provide small scripts to facilitate building the image and running experiments.
- Build the image:

  ```
  $ bash docker/build.sh
  ```

  This automatically caches the dependencies. If you run it again, only the code is rebuilt. If you want to rebuild the whole image, just call `bash docker/build.sh --no-cache`.
- Specify where you want to store the data generated by the experiments via the environment variable `$POMMER_DATA_DIR`. You can `export POMMER_DATA_DIR=/some/dir` or just add `POMMER_DATA_DIR=/some/dir` as a prefix to the command in the following step.
- Create a container and run the training loop (replace `--help` with the arguments of your choice; a full example invocation is shown after this list):

  ```
  $ bash docker/run.sh --help
  ```

  - Note that `--dir` and `--exec` are already specified correctly by `docker/run.sh`.
  - All GPUs are visible in the container and GPU 0 is used by default. You can specify the GPU to be used with `--gpu 4`.
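As a concrete illustration of the steps above (the mode, GPU index and data directory are example values, not required ones), a full invocation could look like this:

```
# Store all generated data under /data/pommer and run the ffa_mcts mode on GPU 1.
$ POMMER_DATA_DIR=/data/pommer bash docker/run.sh --mode=ffa_mcts --gpu 1
```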
Of course, you can also build and run the image manually. Have a closer look at the scripts from the previous section for details.
Additional notes:

- You can limit the GPU access of a container with `--gpus device=4`. However, PommerLearn has a `--gpu` argument that can be used instead.
- Warning: If you use rootless docker, the container will probably run out of memory. Adding `--ipc=host` or `--shm-size=32g` to the `docker run` command helps. This is also done by default in `docker/run.sh` (see the manual sketch below).
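If you prefer to start the container manually instead of via `docker/run.sh`, the notes above translate into flags on `docker run`. The following is only a sketch; the image name `pommerlearn` and the mount target are assumptions, so check `docker/run.sh` for the values the scripts actually use:

```
# Hypothetical manual invocation; adjust the image name, mount and arguments to match docker/run.sh.
$ docker run --rm -it --ipc=host --gpus device=4 \
    -v "$POMMER_DATA_DIR":/data \
    pommerlearn --mode=ffa_mcts
```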
- Generate an SL dataset with 1 million samples with

  ```
  $POMMER_EXEC --mode=ffa_sl --max-games=-1 --chunk-size=1000 --chunk-count=1000 --log --file-prefix=./1M_simple
  ```

  where `$POMMER_EXEC` can be your `PommerLearn` executable or `MODE=exec bash docker/run.sh`.
- Train the SL model: Run `pommerlearn/training/train_cnn.py` with the following modified arguments (see the bottom of the file): `"dataset_path": "1M_simple_0.zr", "test_size": 0.01, "output_dir": "./model-sl"`. Save the resulting model as `$POMMER_DATA_DIR/model-sl` (a command sketch follows after this list).
- Generate a dummy model by running `pommerlearn/debug/create_dummy_model.py` and save it as `$POMMER_DATA_DIR/model-dummy`.
- You can now perform search experiments with both models. Use `POMMER_1VS1=false MODE=exec bash run.sh` for the single-player search and `POMMER_1VS1=true MODE=exec bash run.sh` for the two-player search.
- To reproduce our results, generate 5 SL and dummy models labeled with the respective suffixes `-0` to `-4`. Navigate into the docker directory and run the search experiments with

  ```
  ./docker $ bash search_experiments.sh
  ```

  The results will be recorded in a single CSV file.
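For the SL-model step above, a minimal command sequence could look like the sketch below. The exact way `train_cnn.py` writes its output is not spelled out here, so treat the copy step and its paths as assumptions based on the `"output_dir"` value:

```
# Edit the arguments at the bottom of train_cnn.py as described above, then run it:
(pommer) $ python pommerlearn/training/train_cnn.py
# Make the trained model available to the search experiments (path assumption based on "output_dir"):
$ cp -r ./model-sl "$POMMER_DATA_DIR/model-sl"
```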
Navigate into the docker directory and run the RL experiments with

```
./docker $ bash rl_experiments.sh
```

This will create a new directory in your working directory to store the training logs. You will find the results in `$POMMER_DATA_DIR/archive` and the tensorboard runs in `$POMMER_DATA_DIR/runs`.
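To inspect the training progress, you can point tensorboard at that runs directory (assuming `tensorboard` is installed in your environment), for example:

```
# View the recorded runs in the browser (default: http://localhost:6006).
(pommer) $ tensorboard --logdir "$POMMER_DATA_DIR/runs"
```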
To perform experiments in the team mode, you can collect samples with the option `--mode=team_sl` and otherwise proceed like in the FFA mode, e.g.
```
$POMMER_EXEC --mode=team_sl --max-games=-1 --chunk-size=1000 --chunk-count=1000 --log --file-prefix=./1M_simple_team
```
You can then run `pommerlearn/training/train_cnn.py` on the generated dataset; it will automatically use the value targets for the team mode due to the meta information in the dataset.
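Assuming the chunked team dataset gets the same `_0.zr` suffix as in the FFA example above (an inference from that naming pattern, not stated explicitly here), the team-mode training run only differs in the dataset path:

```
# Point "dataset_path" at the team dataset, e.g. "1M_simple_team_0.zr" (name inferred from the FFA example),
# then start the training exactly as before:
(pommer) $ python pommerlearn/training/train_cnn.py
```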
For the Python side:

- `python 3.7` and `pip`. It is recommended to use virtual environments. This guide will use Anaconda. Create an environment named `pommer` with

  ```
  $ conda create -n pommer python=3.7
  ```
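Activate the environment afterwards so that the following installs and commands use it:

```
$ conda activate pommer
```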
For the C++ side:

- Essential build tools: `gcc`, `make`, `cmake`

  ```
  $ sudo apt install build-essential cmake
  ```

- The dependencies z5, xtensor, boost and json by nlohmann can be installed directly with conda in the pommer environment:

  ```
  (pommer) $ conda install -c conda-forge z5py xtensor boost nlohmann_json blosc
  ```

- Blaze needs to be installed manually. Note that it can be unpacked anywhere; it does not have to be `/usr/local`. For further information, you can refer to the installation guide or the Dockerfiles in this repository (an expanded sketch follows after this list).

  ```
  cmake -DCMAKE_INSTALL_PREFIX=/usr/local/
  sudo make install
  export BLAZE_PATH=/usr/local/include/
  ```

- Manual installation of TensorRT (not Torch-TensorRT), including CUDA and cuDNN. Please refer to the installation guide by NVIDIA: https://developer.nvidia.com/tensorrt-getting-started.
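Since the Blaze commands above do not say where they are executed, here is one possible end-to-end sequence. The unpack directory and the explicit source-directory argument are assumptions; use whatever Blaze release and location you prefer:

```
# Assuming a Blaze release has been downloaded and unpacked to ~/blaze:
cd ~/blaze
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/ .
sudo make install
export BLAZE_PATH=/usr/local/include/
```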
This repository depends on submodules. Clone it and initialize all submodules with

```
$ git clone git@gitlab.com:jweil/PommerLearn.git && \
  cd PommerLearn && \
  git submodule update --init
```
- The current version requires you to set the following environment variables:
  - `CONDA_ENV_PATH`: path of your conda environment (e.g. `~/conda/envs/pommer`)
  - `BLAZE_PATH`: Blaze installation path (e.g. `/usr/local/include`)
  - `CUDA_PATH`: CUDA installation path (e.g. `/usr/local/cuda`)
  - `TENSORRT_PATH` (when using the CrazyAra TensorRT backend, e.g. `/usr/src/tensorrt`)
  - [`Torch_DIR`] (when using the CrazyAra Torch backend, currently untested)
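  For example, the exports could look like this, using the example values from the list above; adjust them to your system:

  ```
  # Example values only; adjust to your local installation paths.
  export CONDA_ENV_PATH=~/conda/envs/pommer
  export BLAZE_PATH=/usr/local/include
  export CUDA_PATH=/usr/local/cuda
  export TENSORRT_PATH=/usr/src/tensorrt
  ```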
- Build the C++ environment with the provided `CMakeLists.txt`. To use TensorRT >= 8 (recommended), you have to specify `-DUSE_TENSORRT8=ON`.

  ```
  /PommerLearn/build $ cmake -DCMAKE_BUILD_TYPE=Release -DUSE_TENSORRT8=ON -DCMAKE_CXX_COMPILER="$(which g++)" ..
  /PommerLearn/build $ make VERBOSE=1 all -j8
  ```
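The commands above assume that the `build` directory already exists inside the repository. If it does not, create it first (a common out-of-source CMake setup, not something the repository enforces):

```
# Create an out-of-source build directory before running cmake.
/PommerLearn $ mkdir -p build && cd build
```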
Optional: You can install PyTorch 1.9.0 with GPU support via

```
conda install -y pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.1 -c conda-forge -c pytorch
```

The remaining Python runtime dependencies can be installed with

```
(pommer) $ pip install -r requirements.txt
```
Before starting the RL loop, you can check whether everything is set up correctly by creating a dummy model and loading it in the cpp executable:

```
(pommer) /PommerLearn/build $ python ../pommerlearn/debug/create_dummy_model.py
(pommer) /PommerLearn/build $ ./PommerLearn --mode=ffa_mcts --model=./model/onnx
```
You can then start training by running

```
(pommer) /PommerLearn/build $ python ../pommerlearn/training/rl_loop.py
```
Prerequisites and Building
- Make sure that you've pulled all submodules recursively.
- In older versions of TensorRT, you have to manually comment out `using namespace sample;` in `deps/CrazyAra/engine/src/nn/tensorrtapi.cpp`.
- We experienced issues with `std::filesystem` being undefined when using GCC 7.5.0. We recommend updating to a more recent version, e.g. GCC 11.2.0.
Running
- For runtime issues like `libstdc++.so.6: version 'GLIBCXX_3.4.30' not found`, try loading your system libraries with `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/`. On some systems, ctypes somehow uses a different libstdc++ from the conda environment instead of the correct lib path. As a last resort, you can back up the original library with `mv /conda-lib-path/libstdc++.so.6 /conda-lib-path/libstdc++.so.6.old` and then create a symbolic link with `ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /conda-lib-path/libstdc++.so.6`.
- If you encounter errors like `ModuleNotFoundError: No module named 'training'`, set your `PYTHONPATH` to the `pommerlearn` directory. For example, `export PYTHONPATH=/PommerLearn/pommerlearn`.
- When loading `tensorboard` runs, you can get errors like `Error: tonic::transport::Error(Transport, hyper::Error(Accept, Os { code: 24, kind: Other, message: "Too many open files" }))`. The argument `--load_fast=false` might help.
You can install the plotting utility for gprof: https://github.com/jrfonseca/gprof2dot

Activate the CMake option `USE_PROFILING` in `CMakeLists.txt` and rebuild.

Run the executable and generate the plot:

```
./PommerLearn --mode ffa_mcts --max_games 10
gprof PommerLearn | gprof2dot | dot -Tpng -o profile.png
```

If you find this repository helpful, please consider citing our paper:
```
@inproceedings{weil2023knowYourEnemy,
  author={Weil, Jannis and Czech, Johannes and Meuser, Tobias and Kersting, Kristian},
  title={{Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent Models in Pommerman}},
  booktitle={Proceedings of the Adaptive and Learning Agents Workshop (ALA) at AAMAS 2023},
  url={https://alaworkshop2023.github.io/},
  year={2023}
}
```