This project is a from-scratch implementation of a GPT-style language model, built while following Andrej Karpathy's tutorials and reading *Attention Is All You Need*.
The goal was to understand, in depth, how Transformers and Attention mechanisms work, and how GPT models are trained and generate text.
- Fundamentals of Transformers
- The concept and math behind Attention (Self-Attention & Cross-Attention); see the code sketch after this list
- Positional Encoding
- Multi-Head Attention
- Layer Normalization and Residual Connections
- Tokenization and embeddings
- GPT architecture and forward pass logic
- Training loop implementation from scratch
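
To make these pieces concrete, here is a minimal PyTorch sketch of causal self-attention, a pre-norm Transformer block with layer normalization and residual connections, and a small GPT-style model that ties them together with token and position embeddings. The class names, hyperparameters, and the use of learned position embeddings (a simple stand-in for the paper's sinusoidal encoding) are illustrative assumptions, not necessarily what this repo does.

```python
# Illustrative sketch only; names and hyperparameters are assumptions, not the repo's code.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # project input to queries, keys, values
        self.proj = nn.Linear(n_embd, n_embd)      # output projection
        # causal mask: position t may only attend to positions <= t
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("mask", mask.view(1, 1, block_size, block_size))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # split channels into heads: (B, n_head, T, head_dim)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        y = att @ v                                        # weighted sum of values
        y = y.transpose(1, 2).contiguous().view(B, T, C)   # re-assemble the heads
        return self.proj(y)

class Block(nn.Module):
    """Pre-norm Transformer block: LayerNorm -> sublayer -> residual connection."""
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head, block_size)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))  # residual connection around attention
        x = x + self.mlp(self.ln2(x))   # residual connection around the MLP
        return x

class MiniGPT(nn.Module):
    """Token + position embeddings -> Transformer blocks -> final LayerNorm -> logits."""
    def __init__(self, vocab_size, n_embd=128, n_head=4, n_layer=4, block_size=64):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)   # learned positional encoding
        self.blocks = nn.Sequential(
            *[Block(n_embd, n_head, block_size) for _ in range(n_layer)]
        )
        self.ln_f = nn.LayerNorm(n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)   # (B, T, n_embd)
        x = self.blocks(x)
        x = self.ln_f(x)
        return self.lm_head(x)                      # (B, T, vocab_size) next-token logits
```

The project itself: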
- Dataset: a plain-text corpus of Shakespeare's writing
- Model trained from scratch to generate Shakespeare-style text
- Custom implementation of attention layers and Transformer blocks
- Fully implemented Transformer architecture in Python
- Self-attention and cross-attention mechanics coded manually
- Tokenizer and vocabulary building from scratch (see the sketch below)
- Ability to train on custom datasets
- Text generation from the trained model
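
A character-level tokenizer keeps the vocabulary tiny: every unique character in the training text becomes one token. A minimal sketch, assuming the corpus lives in a file named `input.txt` (an illustrative name, not necessarily the repo's layout):

```python
# Build a character-level vocabulary from the training text.
# "input.txt" and the variable names here are illustrative assumptions.
with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

chars = sorted(set(text))                 # vocabulary = every unique character
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

encode = lambda s: [stoi[c] for c in s]             # string -> list of token ids
decode = lambda ids: "".join(itos[i] for i in ids)  # token ids -> string

print(len(chars))                  # vocabulary size
print(decode(encode("ROMEO:")))    # round-trips back to the original string
```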
After training on the Shakespeare dataset, the model produces text in the style of Shakespeare. To train it yourself, clone the repository and run the training script:
```bash
git clone https://github.com/preethamak/gpt
cd gpt
python train.py
```
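
`train.py` drives the training loop listed above. As a rough, hedged sketch of what such a loop and a sampling function can look like (reusing `MiniGPT`, `encode`, `decode`, `chars`, and `text` from the earlier sketches; the script's actual helpers and hyperparameters may differ):

```python
# Next-token-prediction training plus autoregressive sampling.
# Builds on the MiniGPT and tokenizer sketches above; names are illustrative.
import torch
import torch.nn.functional as F

def get_batch(data, block_size, batch_size):
    # sample random windows; the target is the input shifted by one token
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y

data = torch.tensor(encode(text), dtype=torch.long)   # from the tokenizer sketch
model = MiniGPT(vocab_size=len(chars))                # from the architecture sketch
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(5000):
    xb, yb = get_batch(data, model.block_size, batch_size=32)
    logits = model(xb)                                 # (B, T, vocab_size)
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), yb.view(-1))
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()

@torch.no_grad()
def generate(model, idx, max_new_tokens):
    # autoregressive sampling: predict the next token, append it, repeat
    for _ in range(max_new_tokens):
        logits = model(idx[:, -model.block_size:])      # crop to the context window
        probs = F.softmax(logits[:, -1, :], dim=-1)     # distribution over the next token
        idx = torch.cat([idx, torch.multinomial(probs, 1)], dim=1)
    return idx

start = torch.zeros((1, 1), dtype=torch.long)           # begin generation from token id 0
print(decode(generate(model, start, 500)[0].tolist()))
```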