Setup instructions are below. Also, give train.py a read; it will make understanding what's going on a lot easier.
conda env create -f environment.yml
conda activate 182env
Tokenize OpenWebText, an open reproduction of OpenAI's (private) WebText:
python3 data/openwebtext/prepare.py

This downloads and tokenizes the OpenWebText dataset. It creates a train.bin and a val.bin, which each hold the token ids as one long sequence stored as raw uint8 bytes. Then we're ready to kick off training. To reproduce (take a look at train.sh for more detailed and relevant arguments), simply run
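Once prepare.py finishes, the resulting .bin files can be inspected with numpy's memmap. Below is a minimal sketch that writes and reads back a tiny dummy file in the same flat raw-uint8 layout described above; the filename demo_tokens.bin and the token values are purely illustrative, not part of the repo.

```python
import numpy as np

# Illustrative only: mimic the flat uint8 token layout of train.bin/val.bin
# with a tiny hand-made sequence, then memmap it back the way train.py could.
ids = np.array([72, 101, 108, 108, 111], dtype=np.uint8)
ids.tofile("demo_tokens.bin")

# memmap avoids loading the whole file into RAM, which matters for the
# real multi-GB train.bin.
data = np.memmap("demo_tokens.bin", dtype=np.uint8, mode="r")
print(len(data), data[:5].tolist())  # 5 [72, 101, 108, 108, 111]
```

The same two-line memmap pattern works on the real train.bin once prepare.py has produced it.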
bash run_train.sh config/mono_alphabetic_sub.yaml 1 single-gpu 4 logs/test.log
bash run_train.sh config/vigenere.yaml 1 single-gpu 4 logs/test.log

Both commands should be run from within ~/CS182-project. Note that the default config values correspond to the best model results reported in the paper.
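If you want to run both cipher experiments back to back, a small loop over the configs works; this is a dry-run sketch (it prints the commands rather than executing them, and the per-config log paths are an assumption) so you can check the arguments before dropping the leading echo:

```shell
# Dry-run sketch: prints one training command per cipher config.
# Remove the leading "echo" to actually launch them from ~/CS182-project.
for cfg in mono_alphabetic_sub vigenere; do
  echo bash run_train.sh "config/${cfg}.yaml" 1 single-gpu 4 "logs/${cfg}.log"
done
```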