
Commit 02979fc

Merge pull request #14 from Exorust/new-llm
Merge to Main
2 parents 01a7651 + 86aa561 commit 02979fc

File tree

70 files changed: 18,460 additions & 375 deletions


CheatSheet.md

Lines changed: 2 additions & 0 deletions

@@ -0,0 +1,2 @@
+# Cheat Sheet for PyTorch
+

README.md

Lines changed: 102 additions & 40 deletions

@@ -1,8 +1,9 @@
 <div align="center">
-<img src="torch.png" alt="Robot Image">
+<img src="torchleet-llm.png" alt="Robot Image">
 <!-- <h1>TorchLeet</h1> -->
 <p align="center">
 🐦 <a href="https://twitter.com/charoori_ai">Follow me on Twitter</a> •
+➡️ <a href="https://github.com/Exorust/TorchLeet/tree/new-llm?tab=readme-ov-file#llm-set">Jump to LLMs!</a>
 📧 <a href="mailto:chandrahas.aroori@gmail.com?subject=LLM%20Cookbook">Feedback</a>
 </p>
 <p>
@@ -11,55 +12,118 @@
 </div>
 <br/>
 
-TorchLeet is a curated set of PyTorch practice problems, inspired by LeetCode-style challenges, designed to enhance your skills in deep learning and PyTorch.
+TorchLeet is broken into two sets of questions:
+1. **Question Set**: A collection of PyTorch practice problems, ranging from basic to hard, designed to enhance your skills in deep learning and PyTorch.
+2. **LLM Set**: A new set of questions focused on understanding and implementing Large Language Models (LLMs) from scratch, including attention mechanisms, embeddings, and more.
+
+> [!NOTE]
+> Avoid using GPT. Try to solve these problems on your own. The goal is to learn and understand PyTorch concepts deeply.
 
 ## Table of Contents
-- [TorchLeet](#torchleet)
-- [Table of Contents](#table-of-contents)
-- [Question Set](#question-set)
-- [🟢Easy](#easy)
-- [🟡Medium](#medium)
-- [🔴Hard](#hard)
-- [Getting Started](#getting-started)
-- [1. Install Dependencies](#1-install-dependencies)
-- [2. Structure](#2-structure)
-- [3. How to Use](#3-how-to-use)
+- [Question Set](#question-set)
+- [🔵Basic](#basic)
+- [🟢Easy](#easy)
+- [🟡Medium](#medium)
+- [🔴Hard](#hard)
+- [LLM Set](#llm-set)
+- [Getting Started](#getting-started)
+- [1. Install Dependencies](#1-install-dependencies)
+- [2. Structure](#2-structure)
+- [3. How to Use](#3-how-to-use)
 - [Contribution](#contribution)
-- [Authors:](#authors)
 
 
 ## Question Set
 
-### 🟢Easy
-1. [Implement linear regression](https://github.com/Exorust/TorchLeet/blob/main/e1/lin-regression.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/e1/lin-regression_SOLN.ipynb)
-2. [Write a custom Dataset and Dataloader to load from a CSV file](https://github.com/Exorust/TorchLeet/blob/main/e2/custom-dataset.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/e2/custom-dataset_SOLN.ipynb)
-3. [Write a custom activation function (Simple)](https://github.com/Exorust/TorchLeet/blob/main/e3/custom-activation.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/e3/custom-activation_SOLN.ipynb)
-4. [Implement Custom Loss Function (Huber Loss)](https://github.com/Exorust/TorchLeet/blob/main/e4/custom-loss.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/e4/custom-loss_SOLN.ipynb)
-5. [Implement a Deep Neural Network](https://github.com/Exorust/TorchLeet/blob/main/e5/custon-DNN.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/e5/custon-DNN_SOLN.ipynb)
-6. [Visualize Training Progress with TensorBoard in PyTorch](https://github.com/Exorust/TorchLeet/blob/main/e6/tensorboard.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/e6/tensorboard_SOLN.ipynb)
-7. [Save and Load Your PyTorch Model](https://github.com/Exorust/TorchLeet/blob/main/e7/save_model.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/e7/save_model_SOLN.ipynb)
+### 🔵Basic
+Mostly for beginners to get started with PyTorch.
+
+1. [Implement linear regression](torch/basic/lin-regression/lin-regression.ipynb) [(Solution)](torch/basic/lin-regression/lin-regression_SOLN.ipynb)
+2. [Write a custom Dataset and Dataloader to load from a CSV file](torch/basic/custom-dataset/custom-dataset.ipynb) [(Solution)](torch/basic/custom-dataset/custom-dataset_SOLN.ipynb)
+3. [Write a custom activation function (Simple)](torch/easy/e3/custom-activation.ipynb) [(Solution)](torch/easy/e3/custom-activation_SOLN.ipynb)
+4. [Implement Custom Loss Function (Huber Loss)](torch/easy/e4/custom-loss.ipynb) [(Solution)](torch/easy/e4/custom-loss_SOLN.ipynb)
+5. [Implement a Deep Neural Network](torch/easy/e5/custon-DNN.ipynb) [(Solution)](torch/easy/e5/custon-DNN_SOLN.ipynb)
+6. [Visualize Training Progress with TensorBoard in PyTorch](torch/easy/e6/tensorboard.ipynb) [(Solution)](torch/easy/e6/tensorboard_SOLN.ipynb)
+7. [Save and Load Your PyTorch Model](torch/easy/e7/save_model.ipynb) [(Solution)](torch/easy/e7/save_model_SOLN.ipynb)
+8. Implement Softmax function from scratch
 
+---
+
+### 🟢Easy
+Recommended for those who have a basic understanding of PyTorch and want to practice their skills.
+1. [Implement a CNN on CIFAR-10](torch/medium/m2/CNN.ipynb) [(Solution)](torch/medium/m2/CNN_SOLN.ipynb)
+2. [Implement an RNN from Scratch](torch/medium/m3/RNN.ipynb) [(Solution)](torch/medium/m3/RNN_SOLN.ipynb)
+3. [Use `torchvision.transforms` to apply data augmentation](torch/medium/m4/augmentation.ipynb) [(Solution)](torch/medium/m4/augmentation_SOLN.ipynb)
+4. [Add a benchmark to your PyTorch code](torch/medium/m5/bench.ipynb) [(Solution)](torch/medium/m5/bench_SOLN.ipynb)
+5. [Train an autoencoder for anomaly detection](torch/medium/m6/autoencoder.ipynb) [(Solution)](torch/medium/m6/autoencoder_SOLN.ipynb)
+6. [Quantize your language model](torch/hard/h6/quantize-language-model.ipynb) [(Solution)](torch/hard/h6/quantize-language-model_SOLN.ipynb)
+7. [Implement Mixed Precision Training using torch.cuda.amp](torch/hard/h9/cuda-amp.ipynb) [(Solution)](torch/hard/h9/cuda-amp_SOLN.ipynb)
+
+---
 
 ### 🟡Medium
-1. [Implement an LSTM](https://github.com/Exorust/TorchLeet/blob/main/m1/LSTM.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/m1/LSTM_SOLN.ipynb)
-2. [Implement a CNN on CIFAR-10](https://github.com/Exorust/TorchLeet/blob/main/m2/CNN.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/m2/CNN_SOLN.ipynb)
-3. [Implement parameter initialization for a CNN]() [(Solution)]()
-4. [Implement an RNN](https://github.com/Exorust/TorchLeet/blob/main/m3/RNN.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/m3/RNN_SOLN.ipynb)
-5. [Use `torchvision.transforms` to apply data augmentation](https://github.com/Exorust/TorchLeet/blob/main/m4/augmentation.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/m4/augmentation_SOLN.ipynb)
-6. [Add a benchmark to your PyTorch code](https://github.com/Exorust/TorchLeet/blob/main/m5/bench.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/m5/bench_SOLN.ipynb)
-7. [Train an autoencoder for anomaly detection](https://github.com/Exorust/TorchLeet/blob/main/m6/autoencoder.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/m6/autoencoder_SOLN.ipynb)
+These problems are designed to challenge your understanding of PyTorch and deep learning concepts. They require you to implement things from scratch or apply advanced techniques.
+1. [Implement parameter initialization for a CNN]() [(Solution)]()
+2. [Implement a CNN from Scratch]()
+3. [Implement an LSTM from Scratch](torch/medium/m1/LSTM.ipynb) [(Solution)](torch/medium/m1/LSTM_SOLN.ipynb)
+4. Implement AlexNet from scratch
+5. Build a Dense Retrieval System using PyTorch
+6. Implement KNN from scratch in PyTorch
+
+---
 
 ### 🔴Hard
-1. [Write a custom Autograd function for activation (SILU)](https://github.com/Exorust/TorchLeet/blob/main/h1/custom-autgrad-function.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/h1/custom-autgrad-function_SOLN.ipynb)
+These problems are for advanced users who want to push their PyTorch skills to the limit. They involve complex architectures, custom layers, and advanced techniques.
+1. [Write a custom Autograd function for activation (SiLU)](torch/hard/h1/custom-autgrad-function.ipynb) [(Solution)](torch/hard/h1/custom-autgrad-function_SOLN.ipynb)
 2. Write a Neural Style Transfer
-3. [Write a Transformer](https://github.com/Exorust/TorchLeet/blob/main/h3/transformer.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/h3/transformer_SOLN.ipynb)
-4. [Write a GAN](https://github.com/Exorust/TorchLeet/blob/main/h4/GAN.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/h4/GAN_SOLN.ipynb)
-5. [Write Sequence-to-Sequence with Attention](https://github.com/Exorust/TorchLeet/blob/main/h5/seq-to-seq-with-Attention.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/h5/seq-to-seq-with-Attention_SOLN.ipynb)
-6. [Quantize your language model](https://github.com/Exorust/TorchLeet/blob/main/h6/quantize-language-model.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/h6/quantize-language-model_SOLN.ipynb)
-7. [Enable distributed training in pytorch (DistributedDataParallel)]
-8. [Work with Sparse Tensors]
-9. [Implement Mixed Precision Training using torch.cuda.amp](https://github.com/Exorust/TorchLeet/blob/main/h9/cuda-amp.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/h9/cuda-amp_SOLN.ipynb)
-10. [Add GradCam/SHAP to explain the model.](https://github.com/Exorust/TorchLeet/blob/main/h10/xai.ipynb) [(Solution)](https://github.com/Exorust/TorchLeet/blob/main/h10/xai_SOLN.ipynb)
+3. Build a Graph Neural Network (GNN) from scratch
+4. Build a Graph Convolutional Network (GCN) from scratch
+5. [Write a Transformer](torch/hard/h3/transformer.ipynb) [(Solution)](torch/hard/h3/transformer_SOLN.ipynb)
+6. [Write a GAN](torch/hard/h4/GAN.ipynb) [(Solution)](torch/hard/h4/GAN_SOLN.ipynb)
+7. [Write Sequence-to-Sequence with Attention](torch/hard/h5/seq-to-seq-with-Attention.ipynb) [(Solution)](torch/hard/h5/seq-to-seq-with-Attention_SOLN.ipynb)
+8. Enable distributed training in PyTorch (DistributedDataParallel)
+9. Work with Sparse Tensors
+10. [Add GradCam/SHAP to explain the model.](torch/hard/h10/xai.ipynb) [(Solution)](torch/hard/h10/xai_SOLN.ipynb)
+11. Linear Probe on CLIP Features
+12. Add Cross-Modal Embedding Visualization to CLIP (t-SNE/UMAP)
+13. Implement a Vision Transformer
+14. Implement a Variational Autoencoder
+
+---
+
+## LLM Set
+
+**An all-new set of questions to help you understand and implement Large Language Models from scratch.**
+
+Each question is designed to take you one step closer to building your own LLM.
+
+1. Implement KL Divergence Loss
+2. Implement RMS Norm
+3. [Implement Byte Pair Encoding from Scratch](llm/Byte-Pair-Encoding/BPE-q3-Question.ipynb) [(Solution)](llm/Byte-Pair-Encoding/BPE-q3.ipynb)
+4. Create a RAG Search of Embeddings from a set of Reviews
+5. Implement Predictive Prefill with Speculative Decoding
+6. [Implement Attention from Scratch](llm/Implement-Attention-from-Scratch/attention-q4-Question.ipynb) [(Solution)](llm/Implement-Attention-from-Scratch/attention-q4.ipynb)
+7. [Implement Multi-Head Attention from Scratch](llm/Multi-Head-Attention/multi-head-attention-q5-Question.ipynb) [(Solution)](llm/Multi-Head-Attention/multi-head-attention-q5.ipynb)
+8. [Implement Grouped Query Attention from Scratch](llm/Grouped-Query-Attention/grouped-query-attention-Question.ipynb) [(Solution)](llm/Grouped-Query-Attention/grouped-query-attention.ipynb)
+9. Implement KV Cache in Multi-Head Attention from Scratch
+10. [Implement Sinusoidal Embeddings](llm/Sinusoidal-Positional-Embedding/sinusoidal-q7-Question.ipynb) [(Solution)](llm/Sinusoidal-Positional-Embedding/sinusoidal-q7.ipynb)
+11. [Implement RoPE Embeddings](llm/Rotary-Positional-Embedding/rope-q8-Question.ipynb) [(Solution)](llm/Rotary-Positional-Embedding/rope-q8.ipynb)
+12. [Implement SmolLM from Scratch](llm/SmolLM/smollm-q12-Question.ipynb) [(Solution)](llm/SmolLM/smollm-q12.ipynb)
+13. Implement Quantization of Models
+    1. GPTQ
+14. Implement Beam Search atop an LLM for decoding
+15. Implement Top-k Sampling atop an LLM for decoding
+16. Implement Top-p Sampling atop an LLM for decoding
+17. Implement Temperature Sampling atop an LLM for decoding
+18. Implement LoRA on a layer of an LLM
+    1. QLoRA
+19. Mix two models to create a Mixture of Experts
+20. Apply SFT on SmolLM
+21. Apply RLHF on SmolLM
+22. Implement DPO-based RLHF
+23. Add continuous batching to your LLM
+24. Chunk Textual Data for Dense Passage Retrieval
+25. Implement Large-Scale Training: 5D Parallelism
 
 **What's cool? 🚀**
 - **Diverse Questions**: Covers beginner to advanced PyTorch concepts (e.g., tensors, autograd, CNNs, GANs, and more).
@@ -84,10 +148,8 @@
 **Happy Learning! 🚀**
 
 
-
-
 # Contribution
-Feel free to contribute by adding new questions or improving existing ones. Ensure that new problems are well-documented and follow the project structure.
+Feel free to contribute by adding new questions or improving existing ones. Ensure that new problems are well-documented and follow the project structure. Submit a PR and tag the authors.
 
 # Authors
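
The Easy set above ends with a mixed-precision training question built on `torch.cuda.amp`. For orientation only, here is a minimal sketch of a single AMP training step; the model, optimizer, and batch are throwaway placeholders, not code from the repository:

```python
# Minimal mixed-precision step with torch.cuda.amp (illustrative sketch only).
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(10, 1).to(device)          # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
criterion = torch.nn.MSELoss()
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(32, 10, device=device)             # stand-in batch
y = torch.randn(32, 1, device=device)

optimizer.zero_grad()
with torch.cuda.amp.autocast(enabled=(device == "cuda")):
    loss = criterion(model(x), y)                  # forward runs in reduced precision where safe
scaler.scale(loss).backward()                      # scale the loss to avoid fp16 gradient underflow
scaler.step(optimizer)                             # unscales gradients, then calls optimizer.step()
scaler.update()
```

The same autocast/GradScaler pattern wraps a full training loop unchanged, and `enabled=False` lets the identical code run on CPU for debugging.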

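Likewise, the LLM set's RMS Norm question has a well-known formulation: normalize by the root mean square over the feature dimension, then apply a learned gain. A minimal sketch, with illustrative names rather than anything taken from the repository's solutions:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """RMS normalization: x / rms(x) * g, with no mean subtraction and no bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-feature gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * inv_rms * self.weight

# Quick shape check on a fake batch of hidden states
h = torch.randn(2, 5, 64)
print(RMSNorm(64)(h).shape)  # torch.Size([2, 5, 64])
```

Dropping the mean subtraction is what separates RMSNorm from LayerNorm and makes it slightly cheaper, which is why several recent LLM architectures use it.
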
Tricks.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# Tricks
2+
3+
List of tricks in pytorch
4+
5+
6+
## TorchScript
7+
@torch.jit.scriptorch.jit.script
8+
9+
## Unit Tests in PyTorch
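
The TorchScript entry above presumably refers to the `torch.jit.script` decorator. A minimal sketch of how it is typically applied; the function itself is a made-up example, not something from the repository:

```python
import torch

@torch.jit.script
def scaled_tanh(x: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    # TorchScript compiles this Python function into a serializable, optimizable graph.
    return alpha * torch.tanh(x)

print(scaled_tanh(torch.randn(4)))  # callable like a normal Python function
print(type(scaled_tanh))            # a torch.jit.ScriptFunction, not a plain function
```

Scripted functions and modules can also be saved with `torch.jit.save` and reloaded without the original Python source.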

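The "Unit Tests in PyTorch" entry has no body yet; a hedged sketch of what such a test commonly looks like, using `unittest` together with `torch.testing.assert_close` (the layer under test is a stand-in):

```python
import unittest
import torch

class TestLinearLayer(unittest.TestCase):
    def test_output_shape_and_values(self):
        torch.manual_seed(0)                       # make the test deterministic
        layer = torch.nn.Linear(4, 2, bias=False)
        x = torch.randn(3, 4)
        out = layer(x)
        self.assertEqual(out.shape, (3, 2))
        # Same result computed manually from the layer's weight matrix
        torch.testing.assert_close(out, x @ layer.weight.t())

if __name__ == "__main__":
    unittest.main()
```
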
h3/transformer_SOLN.ipynb

Lines changed: 0 additions & 176 deletions
This file was deleted.
