Setup instructions are below. Also, give train.py a read; it will make understanding what's going on a lot easier.
conda env create -f environment.yml
conda activate 182env
Tokenize OpenWebText, an open reproduction of OpenAI's (private) WebText:
python3 data/openwebtext/prepare.py

This downloads and tokenizes the OpenWebText dataset. It creates a train.bin and a val.bin, which each hold the token ids as one long sequence stored as raw uint8 bytes. Then we're ready to kick off training. To reproduce (take a look at train.sh for more detailed and relevant arguments), simply run
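Once prepare.py finishes, the resulting .bin files can be inspected with numpy's memmap. Below is a minimal sketch that writes and reads back a tiny dummy file in the same flat raw-uint8 layout described above; the filename demo_tokens.bin and the token values are purely illustrative, not part of the repo.

```python
import numpy as np

# Illustrative only: mimic the flat uint8 token layout of train.bin/val.bin
# with a tiny hand-made sequence, then memmap it back the way train.py could.
ids = np.array([72, 101, 108, 108, 111], dtype=np.uint8)
ids.tofile("demo_tokens.bin")

# memmap avoids loading the whole file into RAM, which matters for the
# real multi-GB train.bin.
data = np.memmap("demo_tokens.bin", dtype=np.uint8, mode="r")
print(len(data), data[:5].tolist())  # 5 [72, 101, 108, 108, 111]
```

The same two-line memmap pattern works on the real train.bin once prepare.py has produced it.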
bash run_train.sh config/mono_alphabetic_sub.yaml 1 single-gpu 4 logs/test.log
bash run_train.sh config/vigenere.yaml 1 single-gpu 4 logs/test.log

Both commands should be run from within ~/CS182-project. Note that the default config values correspond to the best model results reported in the paper.
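If you want to run both cipher experiments back to back, a small loop over the configs works; this is a dry-run sketch (it prints the commands rather than executing them, and the per-config log paths are an assumption) so you can check the arguments before dropping the leading echo:

```shell
# Dry-run sketch: prints one training command per cipher config.
# Remove the leading "echo" to actually launch them from ~/CS182-project.
for cfg in mono_alphabetic_sub vigenere; do
  echo bash run_train.sh "config/${cfg}.yaml" 1 single-gpu 4 "logs/${cfg}.log"
done
```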