This repo is an experimental environment. Its goal is to create and train ~50M-parameter language models and to experiment with them under low-resource conditions (both compute and language data).
We will document everything we do in a diary-like format, so anyone can follow along using those notes together with the code.