iax - A Hyper-Simple Neural Network Engine

A minimal, easy-to-use neural network library for Rust with optional GPU acceleration. Designed for simplicity and performance.

Features

Simple API: Builder pattern for intuitive network construction
Backend Agnostic: Generic over CPU/GPU backends - switch hardware without changing code
GPU Acceleration: Optional wgpu backend (Vulkan, Metal, DX12, WebGPU) for 10-100x speedups
Zero-Copy Training: Backend owns all memory, eliminating CPU-GPU copies in hot paths
Flexible I/O: Trait-based input/output for type-safe data handling
Custom Loss Functions: Transformer trait for domain-specific loss computation
Batched GPU Training: Automatic command batching for maximum GPU utilization

Quick Start

Basic Usage

use iax::prelude::*;

// 1. Define network architecture
let shape = NeuralShape::input(784)
    .layer(256, Activation::ReLU)
    .layer(128, Activation::ReLU)
    .output(10, Activation::Softmax);

// 2. Create network
let mut nn = NeuralNetwork::new(shape);

// 3. Prepare training data
let data = TrainData::from_pairs(&[
    (&[0.0, 0.0], &[0.0]),
    (&[0.0, 1.0], &[1.0]),
    (&[1.0, 0.0], &[1.0]),
    (&[1.0, 1.0], &[0.0]),
]);

// 4. Configure and train
let config = TrainConfig::new()
    .learning_rate(0.01)
    .epochs(100)
    .batch_size(32)
    .momentum(0.9)
    .verbose(10);

let history = nn.train(&data, config);

// 5. Make predictions
let output = nn.predict(&[0.0, 1.0]);

GPU Acceleration

Enable GPU support with the gpu feature:

[dependencies]
iax = { version = "0.1", features = ["gpu"] }

Then use the GPU backend:

use iax::prelude::*;
use iax::gpu::WgpuBackend;

// Initialize GPU backend
if let Some(backend) = WgpuBackend::new() {
    let mut nn = NeuralNetwork::with_backend(shape, backend);
    
    // Use train_batched for optimal GPU performance
    use iax::train::Loss;
    let history = nn.train_batched(&data, &config, &Loss::CrossEntropy);
}

GPU Performance: Expect 10-100x speedups on large networks. Small networks (<100 neurons) may be slower due to GPU overhead.

Installation

Add to your Cargo.toml:

[dependencies]
iax = "0.1"

# Optional: GPU acceleration
[features]
default = []
gpu = ["iax/gpu"]

GPU Requirements

Windows: DirectX 12 compatible GPU
macOS: Metal compatible GPU
Linux: Vulkan drivers (e.g., Mesa for Intel/AMD, NVIDIA proprietary drivers)
Web: WebGPU compatible browser

Network Architecture

Building Networks

Use the builder pattern to define your network:

let shape = NeuralShape::input(784)           // Input layer: 784 features
    .layer(256, Activation::ReLU)              // Hidden layer 1: 256 neurons, ReLU
    .layer(128, Activation::ReLU)              // Hidden layer 2: 128 neurons, ReLU
    .output(10, Activation::Softmax);         // Output layer: 10 classes, Softmax

Available Activations

Activation::Linear - Identity function
Activation::ReLU - Rectified Linear Unit
Activation::LeakyReLU(alpha) - Leaky ReLU with configurable alpha
Activation::Sigmoid - Sigmoid function
Activation::Tanh - Hyperbolic tangent
Activation::Softmax - Softmax (for classification output layers)

Training

Training Configuration

let config = TrainConfig::new()
    .learning_rate(0.01)        // Learning rate (default: 0.001)
    .epochs(100)                // Number of training epochs
    .batch_size(32)             // Mini-batch size (default: full batch)
    .full_batch()               // Use all samples per update
    .momentum(0.9)              // Momentum coefficient (0.0 = no momentum)
    .l2(0.0001)                 // L2 regularization strength
    .validation_split(0.1)      // Fraction of data for validation
    .verbose(10);               // Print progress every N epochs

Training Methods

CPU Training (standard):

let history = nn.train(&data, config);                    // MSE loss
let history = nn.train_classifier(&data, config);         // Cross-entropy loss
let history = nn.train_with_loss(&data, config, Loss::MSE); // Custom loss

GPU Batched Training (optimal for GPU):

use iax::train::Loss;
let history = nn.train_batched(&data, &config, &Loss::CrossEntropy);

With Custom Transformer:

let transformer = MyCustomTransformer;
let history = nn.train_with_transformer(&data, config, &transformer);

Loss Functions

Loss::MSE - Mean Squared Error (for regression)
Loss::CrossEntropy - Cross-Entropy (for classification with softmax)
Loss::BinaryCrossEntropy - Binary Cross-Entropy (for binary classification)

Data Preparation

From Pairs

let data = TrainData::from_pairs(&[
    (&[0.0, 0.0], &[0.0]),
    (&[0.0, 1.0], &[1.0]),
    (&[1.0, 0.0], &[1.0]),
    (&[1.0, 1.0], &[0.0]),
]);

From Flattened Arrays

let inputs = vec![0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0];  // Flattened
let targets = vec![0.0, 1.0, 1.0, 0.0];                     // Flattened
let data = TrainData::new(inputs, targets, 2, 1);           // (inputs, targets, input_size, output_size)

Loading from Files

See examples/mnist_csv.rs for CSV loading examples.

GPU-Specific Features

Batch Size Estimation

GPU backends can estimate maximum batch size based on available memory:

if let Some(backend) = WgpuBackend::new() {
    let max_batch = backend.estimate_max_batch_size(&shape);
    println!("Maximum batch size: {}", max_batch);
    
    let limits = backend.limits();
    println!("GPU buffer limit: {}MB", limits.max_buffer_size / 1024 / 1024);
}

Command Batching

GPU operations are automatically batched for optimal performance. The backend accumulates multiple operations into a single command buffer, reducing CPU-GPU synchronization overhead.

Examples

Basic XOR Example

cargo run --example xor

Demonstrates training a simple network to learn the XOR function.

MNIST Classification

Synthetic Data:

cargo run --example mnist --features gpu

Real CSV Data:

# First, download MNIST CSV files to examples/data/
cargo run --example mnist_csv --features gpu --release

Embedded Quick Test:

cargo run --example mnist_embedded --features gpu

GPU Benchmarking

cargo run --example benchmark --features gpu --release

Compares CPU vs GPU performance on various network sizes.

Typed I/O

cargo run --example typed_io

Demonstrates type-safe input/output using the InputSpec/OutputSpec traits.

Advanced Usage

Custom Loss Functions

Implement the Transformer trait for custom loss computation:

use iax::io::Transformer;

struct MyTransformer;

impl Transformer for MyTransformer {
    fn loss(&self, predicted: &[f32], target: &[f32]) -> f32 {
        // Custom loss computation
        predicted.iter()
            .zip(target.iter())
            .map(|(p, t)| (p - t).abs())
            .sum()
    }
    
    fn gradient(&self, predicted: &[f32], target: &[f32]) -> Vec<f32> {
        // Custom gradient computation
        predicted.iter()
            .zip(target.iter())
            .map(|(p, t)| if p > *t { 1.0 } else { -1.0 })
            .collect()
    }
}

Type-Safe I/O

Use InputSpec and OutputSpec for compile-time type checking:

use iax::io::{InputSpec, OutputSpec, InputData, OutputData};

#[derive(Clone)]
struct MyInput {
    features: Vec<f32>,
}

impl InputData for MyInput {
    fn to_f32(&self) -> Vec<f32> {
        self.features.clone()
    }
}

impl InputSpec for MyInput {
    type Data = MyInput;
}

// Use with network
let input = MyInput { features: vec![0.0, 1.0] };
let output: Vec<f32> = nn.predict_typed(&input);

Accessing Weights

// Get weights for a specific layer (downloads to CPU)
if let Some(weights) = nn.get_layer_weights(1) {
    println!("Layer 1 weights: {:?}", weights);
}

// Set a specific weight
nn.set_weight(1, 0, 0, 0.5)?;  // layer, from_neuron, to_neuron, value

// Save/load weights
let bytes = nn.save_weights();
nn.load_weights(&bytes)?;

Performance Tips

Use GPU for Large Networks: Networks with >1000 neurons benefit significantly from GPU acceleration
Batch Size: Larger batches improve GPU utilization but require more memory
Full Batch Training: On GPU, train_batched() with full batch is often faster than mini-batches
Learning Rate: Start with 0.01-0.1 for full-batch training, 0.001-0.01 for mini-batch
Momentum: Use 0.9 for most cases, helps with convergence

Architecture

Backend Trait: Abstract computation operations (Backend)
NeuralNetwork: Generic over backend type

Zero-Copy Design: Backend owns all buffers, no CPU-GPU copies during training

Command Batching: GPU operations are batched for optimal throughput

License

MIT

Contributing

Contributions welcome! Please open an issue or pull request.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
examples		examples
src		src
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

License

sobeit-the-one/iax

Folders and files

Latest commit

History

Repository files navigation

iax - A Hyper-Simple Neural Network Engine

Features

Quick Start

Basic Usage

GPU Acceleration

Installation

GPU Requirements

Network Architecture

Building Networks

Available Activations

Training

Training Configuration

Training Methods

Loss Functions

Data Preparation

From Pairs

From Flattened Arrays

Loading from Files

GPU-Specific Features

Batch Size Estimation

Command Batching

Examples

Basic XOR Example

MNIST Classification

GPU Benchmarking

Typed I/O

Advanced Usage

Custom Loss Functions

Type-Safe I/O

Accessing Weights

Performance Tips

Architecture

License

Contributing

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Languages