# NebTorch

NebTorch is a minimal Autograd engine built from scratch using NumPy, inspired by PyTorch’s automatic differentiation system.
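To make "Autograd engine" concrete, here is a toy reverse-mode example in plain NumPy. It illustrates the core technique (each operation records how to push gradients back to its inputs), not NebTorch's actual internals:

```python
import numpy as np

# Toy reverse-mode autodiff, illustrative only (not NebTorch's Tensor class).
# Each Value remembers its parents and a function that pushes gradients back.
class Value:
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
        self.grad = np.zeros_like(self.data)
        self._parents = ()
        self._backward_fn = lambda grad: None

    def __mul__(self, other):
        out = Value(self.data * other.data)
        out._parents = (self, other)
        def backward_fn(grad):
            self.grad += grad * other.data   # d(x*y)/dx = y
            other.grad += grad * self.data   # d(x*y)/dy = x
        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(node):
            if id(node) not in visited:
                visited.add(id(node))
                for parent in node._parents:
                    build(parent)
                topo.append(node)
        build(self)
        self.grad = np.ones_like(self.data)  # seed: d(out)/d(out) = 1
        for node in reversed(topo):
            node._backward_fn(node.grad)

x, y = Value(3.0), Value(4.0)
z = x * y
z.backward()
print(x.grad, y.grad)  # 4.0 3.0
```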

In 11-785: Introduction to Deep Learning, a graduate-level course at CMU taught by Prof. Bhiksha Raj Ramakrishnan, I completed a sequence of assignments spanning foundational concepts through advanced topics in Deep Learning, including neural networks, optimization, and more. The course provided both a theoretical and a practical understanding of neural networks, along with a brief introduction to Autograd.

After completing the course, I was inspired to dive deeper and build my own Autograd engine from scratch. Building NebTorch has been very rewarding: I've solidified my understanding of Deep Learning and Automatic Differentiation, and, most of all, I've gained a deeper appreciation for frameworks such as PyTorch and TensorFlow.

Most of the course content is openly available online: https://deeplearning.cs.cmu.edu/F24/index.html

## Quick Start Example

Here's a complete example demonstrating how to use NebTorch to train a simple Multi-Layer Perceptron (MLP) on the Iris dataset:

### Imports

```python
import numpy as np

import nebtorch
from nebtorch import Module, Tensor
from nebtorch.nn import Linear, ReLU, CrossEntropyLoss, Softmax
from nebtorch.optim import SGD
from sklearn import datasets
from sklearn.model_selection import train_test_split
```

### Define the Model

```python
class MLP(Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear_1 = Linear(in_features=in_features, out_features=256)
        self.act = ReLU()
        self.linear_2 = Linear(in_features=256, out_features=out_features)

    def forward(self, input: Tensor):
        out = self.linear_1(input)
        out = self.act(out)
        logits = self.linear_2(out)
        return logits
```

### Data Preparation

```python
# Load and prepare data
iris = datasets.load_iris()
X = iris.data
Y = iris.target

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

# Convert to NebTorch tensors
X_train = nebtorch.tensor(X_train)
Y_train = nebtorch.tensor(Y_train)
X_test = nebtorch.tensor(X_test)
Y_test = nebtorch.tensor(Y_test)
```

### Model Setup

```python
# Hyperparameters
INPUT_FEATURES = X_train.shape[1]
NUM_CLASSES = np.max(Y) + 1
EPOCHS = 100
BATCH_SIZE = 5

# Initialize model, loss, and optimizer
model = MLP(INPUT_FEATURES, NUM_CLASSES)
criterion = CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.01)
```

### Training Loop

```python
num_batches = X_train.shape[0] // BATCH_SIZE

for epoch in range(EPOCHS):
    model.train()
    for i in range(num_batches):
        optimizer.zero_grad()

        # Get batch
        start_idx = i * BATCH_SIZE
        end_idx = start_idx + BATCH_SIZE
        input = X_train[start_idx:end_idx]
        target = Y_train[start_idx:end_idx]

        # Forward pass
        out = model(input)
        loss = criterion(out, target)

        # Backward pass
        loss.backward()
        optimizer.step()

    # Print progress
    if epoch % 10 == 0:
        print(f"Epoch {epoch:3d} | Loss: {loss.data.item():.4f}")
```

### Evaluation

```python
# Evaluate on test set
model.eval()
out = model(X_test)
loss = criterion(out, Y_test)

# Calculate accuracy
softmax = Softmax(dim=1)
predictions = np.argmax(softmax(out).data, axis=1)
accuracy = np.sum(predictions == Y_test.data) / Y_test.shape[0] * 100
print(f"Test Accuracy: {accuracy:.2f}%")
```

## Implementation Overview

### Base Classes

| Component | Description |
| --- | --- |
| Module | Base class for all neural network modules |
| Tensor | Multi-dimensional data structure with automatic differentiation support |
| Parameter | Special tensor for trainable model parameters |
| Optimizer | Base class for all optimizers |
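A sketch of the division of labor here: Module gathers every trainable Parameter by recursing through its attributes, so that model.parameters() can be handed to an optimizer. This mirrors the PyTorch convention the Quick Start suggests and is an assumption, not NebTorch's exact code:

```python
# Sketch only: a Module that collects Parameters from its attributes,
# recursing into nested sub-modules (e.g. the Linear layers of an MLP).
class Parameter:
    def __init__(self, data):
        self.data = data
        self.grad = None

class Module:
    def parameters(self):
        params = []
        for value in vars(self).values():
            if isinstance(value, Parameter):
                params.append(value)
            elif isinstance(value, Module):
                params.extend(value.parameters())
        return params

    def __call__(self, *args, **kwargs):
        # Calling a module dispatches to its forward(), as in the Quick Start.
        return self.forward(*args, **kwargs)
```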

### Tensor Operations

| Component | Description |
| --- | --- |
| Add | Element-wise addition with broadcasting |
| Subtract | Element-wise subtraction with broadcasting |
| Negate | Element-wise negation |
| Multiply | Element-wise multiplication with broadcasting |
| Divide | Element-wise division with broadcasting |
| Matrix Multiplication | Matrix multiplication (`@` operator) |
| Transpose | Matrix transposition |
| Reshape | Tensor reshaping |
| Log | Natural logarithm |
| Exp | Exponential function |
| Power | Element-wise power operation |
| Mean | Mean reduction with axis support |
| Variance | Variance reduction with axis support |
| Sum | Sum reduction with axis support |
| Max | Maximum reduction with axis support |
| Slice | Tensor indexing and slicing |
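A subtlety behind the "with broadcasting" entries above: in the backward pass, the incoming gradient has the broadcast output's shape, so it must be summed back down to each input's shape. A plain-NumPy sketch of the standard fix (NebTorch's equivalent helper, if any, may be named differently):

```python
import numpy as np

def unbroadcast(grad, shape):
    # Collapse leading axes that broadcasting added.
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Collapse axes that were stretched from size 1.
    for axis, size in enumerate(shape):
        if size == 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

a = np.ones((3, 4))
b = np.ones((1, 4))            # broadcast along axis 0 in a + b
grad_out = np.ones((3, 4))     # gradient arriving at the Add node
print(unbroadcast(grad_out, a.shape).shape)  # (3, 4) -- unchanged
print(unbroadcast(grad_out, b.shape))        # [[3. 3. 3. 3.]]
```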

### Activation Functions

| Component | Description |
| --- | --- |
| Sigmoid | Sigmoid activation function |
| Tanh | Hyperbolic tangent activation |
| ReLU | Rectified Linear Unit |
| GELU | Gaussian Error Linear Unit |
| Softmax | Softmax with dimension support |
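One implementation detail behind Softmax: exponentiating large logits overflows, so the maximum along the chosen dimension is subtracted first, which leaves the result unchanged. A plain-NumPy sketch (illustrative, not NebTorch's exact code):

```python
import numpy as np

def softmax(x, dim=-1):
    # Shift by the max along `dim`: softmax is invariant to this,
    # and it keeps np.exp from overflowing on large logits.
    shifted = x - np.max(x, axis=dim, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=dim, keepdims=True)

logits = np.array([[1000.0, 1001.0, 1002.0]])
print(softmax(logits, dim=1))  # ~[[0.090 0.245 0.665]], no overflow
```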

### Neural Network Layers

| Component | Description |
| --- | --- |
| Linear | Fully connected layer |
| Conv1d_stride1 | 1D convolution with stride 1 |
| Conv2d_stride1 | 2D convolution with stride 1 |
| Conv2d | 2D convolution with configurable stride |
| MaxPool2d_stride1 | 2D max pooling with stride 1 |
| MeanPool2d_stride1 | 2D mean pooling with stride 1 |
| MaxPool2d | 2D max pooling with configurable stride |
| MeanPool2d | 2D mean pooling with configurable stride |
| BatchNorm1d | 1D batch normalization |
| LayerNorm | Layer normalization |
| Dropout | Dropout regularization |
| Embedding | Embedding layer for sparse inputs |
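For reference, the math behind Linear is compact enough to state in full. In an autograd engine the backward comes for free from the Matrix Multiplication and Add ops, but the closed form is worth seeing once; a sketch (the `(in, out)` weight layout is an assumption):

```python
import numpy as np

# y = x @ W + b with x: (N, in), W: (in, out), b: (out,)
def linear_forward(x, W, b):
    return x @ W + b

# Given dL/dy, the chain rule yields the three input gradients.
def linear_backward(x, W, grad_y):
    grad_x = grad_y @ W.T        # (N, in)
    grad_W = x.T @ grad_y        # (in, out)
    grad_b = grad_y.sum(axis=0)  # (out,), summed over the batch
    return grad_x, grad_W, grad_b
```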

### Recurrent Neural Networks

| Component | Description |
| --- | --- |
| RNNCell | Recurrent neural network cell |
| GRUCell | Gated Recurrent Unit cell |
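For orientation, one RNNCell step is a single tanh of two affine maps; a sketch in plain NumPy (the PyTorch-style parameter shapes are an assumption):

```python
import numpy as np

def rnn_cell_step(x, h, W_ih, W_hh, b_ih, b_hh):
    # x: (N, input_size), h: (N, hidden_size)
    # W_ih: (hidden_size, input_size), W_hh: (hidden_size, hidden_size)
    return np.tanh(x @ W_ih.T + b_ih + h @ W_hh.T + b_hh)
```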

### Upsampling/Downsampling

| Component | Description |
| --- | --- |
| Upsampling1d | 1D upsampling |
| Downsample1d | 1D downsampling |
| Upsample2d | 2D upsampling |
| Downsample2d | 2D downsampling |
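The 1D versions reduce to array slicing: downsampling by factor k keeps every k-th sample, and upsampling inserts k-1 zeros between samples. A sketch (the exact output-length convention is an assumption):

```python
import numpy as np

def downsample1d(x, k):
    return x[..., ::k]                 # keep every k-th sample

def upsample1d(x, k):
    out_len = (x.shape[-1] - 1) * k + 1
    out = np.zeros(x.shape[:-1] + (out_len,), dtype=x.dtype)
    out[..., ::k] = x                  # k - 1 zeros between samples
    return out

x = np.array([1.0, 2.0, 3.0])
print(upsample1d(x, 2))                   # [1. 0. 2. 0. 3.]
print(downsample1d(upsample1d(x, 2), 2))  # [1. 2. 3.]
```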

### Attention Mechanisms

| Component | Description |
| --- | --- |
| MultiheadAttention | Multi-head attention mechanism |
| Scaled Dot-Product Attention | Scaled dot-product attention |
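Scaled dot-product attention computes softmax(Q K^T / sqrt(d_k)) V; MultiheadAttention runs it on several learned projections in parallel. A single-head, unmasked sketch in plain NumPy (illustrative, not NebTorch's exact code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (N, seq, d_k), V: (N, seq, d_v); no masking in this sketch.
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)  # (N, seq, seq)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                # (N, seq, d_v)
```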

### Loss Functions

| Component | Description |
| --- | --- |
| Loss | Base class for all loss functions |
| MSELoss | Mean Squared Error loss |
| CrossEntropyLoss | Cross-entropy loss with softmax |
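Folding softmax into CrossEntropyLoss pays off in the backward pass: for integer targets, the gradient with respect to the logits is simply (softmax(logits) - one_hot(target)) / N. A sketch of the math (an autograd engine composes this from its ops, so the hand-written gradient here is for illustration only):

```python
import numpy as np

def cross_entropy(logits, targets):
    # logits: (N, C), targets: (N,) integer class indices
    N = logits.shape[0]
    shifted = logits - logits.max(axis=1, keepdims=True)  # stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(N), targets].mean()
    grad = np.exp(log_probs)                # softmax(logits)
    grad[np.arange(N), targets] -= 1.0      # subtract one-hot target
    return loss, grad / N                   # loss, dloss/dlogits
```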

### Optimizers

| Component | Description |
| --- | --- |
| SGD | Stochastic Gradient Descent |
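The update itself is one line per parameter. A minimal sketch that matches the zero_grad()/step() calls in the training loop above (the internals are an assumption, not NebTorch's exact class):

```python
class SGD:
    def __init__(self, parameters, lr=0.01):
        self.parameters = list(parameters)
        self.lr = lr

    def zero_grad(self):
        # Clear stale gradients before the next backward pass.
        for p in self.parameters:
            p.grad = None

    def step(self):
        # Vanilla gradient descent on each trainable parameter.
        for p in self.parameters:
            if p.grad is not None:
                p.data -= self.lr * p.grad
```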
