- Overview
- What is OpenCV
- Goal
- Quick Start
- Project Structure
- Demo
- Model
- Training
- Preprocessing
- Resources
Hey hackers!
Welcome to the OpenCV starter pack. This repository and the quick guide associated with it are not meant to be a comprehensive guide to OpenCV or machine learning, but they should provide the modules you need to get started. This guide covers facial detection, the basics of a convolutional neural network, the training process, and real-time emotion detection.
Have you ever wondered how Face ID on your phone works? Or how self-driving cars process large amounts of noisy stimuli and still manage to (mostly) follow the rules of the road? Much of this is made possible by an open-source library called OpenCV, which contains over 2,500 algorithms for real-time image processing and serves as the backbone of many computer vision applications.
This starter pack should teach you how to use OpenCV along with a machine learning model to create a real-time emotion detector. Hopefully by the end of this guide you will have a general idea of how to use OpenCV with your own projects and have a template to work from. Here are some possible hackathon ideas to give you some inspiration!
- Mood-based music player: Change music based on detected emotion
- Emotion-aware chatbot: Adjust responses based on user's mood
- Accessibility tool: Help people with autism recognize emotions
- Gaming integration: Control game mechanics with facial expressions
- Mental health tracker: Monitor emotional patterns over time
We recommend the use of Anaconda to create and manage your virtual environment. For information on how to download Anaconda and set up a Conda environment, look here. You can stop before the Channels section.
git clone https://github.com/hackatbrown/opencv-starter.git
pip install -r requirements.txt
If you have issues installing the dependencies on macOS, try the following commands to set up a virtual environment and downgrade your Python installation to a version supported by TensorFlow:
brew install python@3.11
python3.11 -m venv ~/.venvs/opencv-starter
source ~/.venvs/opencv-starter/bin/activate
pip install -r ./requirements.txt
python demo.py
If you would like to use the model in this repository to detect your own custom emotions, upload your data as images under the data/ folder, following the file structure below. That structure splits your data into a training set and a testing set for the model; it is common to use an 80/20 split, with the larger share going into training (see the split sketch just after the directory tree). The pretrained model included in this codebase was trained on the FER dataset, a public collection of facial expression images organized by expression. If you want to use it yourself, you can find it here.
opencv-starter/
├── requirements.txt # Python dependencies
├── demo.py # Main real-time demo with pre-trained model
├── train.py # Script to train an emotion detection model
├── preprocess.py # Image preprocessing utilities
├── model.py # Model architecture definition
├── utils.py # General utility functions
├── data/ # Data directory (training & test images)
│ ├── train/ # Training images (FER & custom)
│ │ ├── happy/
│ │ ├── sad/
│ │ ├── angry/
│ │ ├── excited/ # Example custom emotion
│ │ ├── confused/ # Example custom emotion
│ │ └── ...
│ └── test/ # Testing images (FER test set)
│ ├── happy/
│ ├── sad/
│ └── ...
└── models/ # Saved models and metadata
├── best_cnn_model.keras
└── ...
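If your custom images are not already divided into train/ and test/ folders, a small script along these lines can perform the 80/20 split described above. This is only a sketch: the split_dataset helper and the my_raw_images source layout (one subfolder per emotion) are assumptions for illustration, not part of this repository.

```python
import os
import random
import shutil

def split_dataset(source_dir, data_dir, train_ratio=0.8, seed=42):
    """Copy images from source_dir/<emotion>/ into data/train/ and data/test/.

    Hypothetical helper: assumes source_dir contains one folder per emotion.
    """
    random.seed(seed)
    for emotion in os.listdir(source_dir):
        emotion_dir = os.path.join(source_dir, emotion)
        if not os.path.isdir(emotion_dir):
            continue
        images = [f for f in os.listdir(emotion_dir)
                  if f.lower().endswith(('.png', '.jpg', '.jpeg'))]
        random.shuffle(images)
        cutoff = int(len(images) * train_ratio)  # 80% to train, 20% to test
        for split, subset in (('train', images[:cutoff]), ('test', images[cutoff:])):
            dest = os.path.join(data_dir, split, emotion)
            os.makedirs(dest, exist_ok=True)
            for name in subset:
                shutil.copy(os.path.join(emotion_dir, name), os.path.join(dest, name))

split_dataset('my_raw_images', 'data')
```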
The demo module is where everything comes together: it is the real-time emotion detection application that uses both OpenCV and a pretrained model (or your own model, if you choose to train one).
Video Capture:
cap = cv2.VideoCapture(0)
- cv2.VideoCapture(0) opens a connection to your webcam
- Think of this as opening a "stream" of images from your camera
- Each frame is a single image captured at that moment
Face Detection with Haar Cascades:
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5, minSize=(40, 40))
What are Haar Cascades?
- Haar Cascades are a machine learning-based approach for object detection
- They use "features" (patterns of light/dark regions) to identify faces
- Think of it like a smart filter that looks for patterns: two eyes, a nose, and a mouth in a certain arrangement
- The cascade part means it's a series of filters that get progressively more strict
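detectMultiScale returns one (x, y, w, h) bounding box per detected face. Here is a minimal sketch of running the cascade on a single image and cropping out each face region; the image path is a placeholder you would replace with your own file:

```python
import cv2

# Load an image and convert it to grayscale (Haar cascades expect grayscale input)
img = cv2.imread("example.jpg")  # placeholder path
if img is None:
    raise SystemExit("Update the image path before running this sketch")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5, minSize=(40, 40))

# Each detection is a bounding box; slice it out of the grayscale image
for (x, y, w, h) in faces:
    face_roi = gray[y:y + h, x:x + w]  # the region the emotion model will see
    print("Found a face of size", face_roi.shape)
```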
Image Processing:
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
- cv2.cvtColor(): Converts from BGR (Blue-Green-Red, OpenCV's default) to grayscale
- Grayscale is used because Haar Cascades work on grayscale images (faster processing)
Drawing on Images:
cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
cv2.putText(frame, f"{label} ({conf*100:.1f}%)", (x, y - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
- cv2.rectangle(): Draws a rectangle around the detected face
- cv2.putText(): Overlays text (the emotion label and confidence) on the frame
- These are OpenCV's drawing functions that modify the image data
Display Loop:
cv2.imshow("Emotion Detection", frame)
cv2.waitKey(1)
- cv2.imshow(): Displays the processed frame in a window
- The loop runs continuously, processing roughly 30 frames per second
- Capture → Get frame from webcam
- Convert → BGR to grayscale
- Detect → Find faces using Haar Cascades
- Preprocess → Prepare face region for model
- Predict → Get emotion from CNN model
- Visualize → Draw boxes and labels
- Display → Show result and repeat
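Putting that pipeline together, a stripped-down version of the loop in demo.py might look like the sketch below. The EMOTIONS list, the 48x48 input size, and the model path are assumptions based on this guide; check demo.py and the models/ metadata for the exact details.

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

EMOTIONS = ["angry", "happy", "sad"]  # assumed label order; see models/ metadata
model = load_model("models/best_cnn_model.keras")
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                                # Capture: open the webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)       # Convert: BGR to grayscale
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2,
                                          minNeighbors=5, minSize=(40, 40))  # Detect
    for (x, y, w, h) in faces:
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))              # Preprocess
        face = face.astype("float32") / 255.0
        probs = model.predict(face.reshape(1, 48, 48, 1), verbose=0)[0]  # Predict
        label, conf = EMOTIONS[int(np.argmax(probs))], float(np.max(probs))
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)     # Visualize
        cv2.putText(frame, f"{label} ({conf*100:.1f}%)", (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
    cv2.imshow("Emotion Detection", frame)               # Display
    if cv2.waitKey(1) & 0xFF == ord("q"):                # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```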
This module defines the Convolutional Neural Network (CNN) architecture that learns to recognize emotions from facial images. This is nowhere near a complete guide on deep learning, but for more information you might find this tutorial useful.
A CNN is a type of neural network designed for image recognition. It is composed of multiple layers that each look for different patterns:
- Early layers: Detect simple features (edges, corners)
- Middle layers: Detect complex patterns (eyes, mouth shapes)
- Later layers: Recognize full facial expressions (smile, frown)
Block 1 - Basic Feature Detection:
layers.Conv2D(32, (3, 3), ...) # 32 filters, each 3x3 pixels
- Conv2D (Convolutional Layer): Slides a small "filter" (3x3 window) across the image
- Creates 32 different "feature maps" - each looking for different patterns
- Like having 32 different "lenses" that each detect something specific
Batch Normalization:
- Normalizes the outputs to have consistent ranges
- Helps training be more stable and faster
- Standardizing data so the network learns better
MaxPooling:
layers.MaxPooling2D((2, 2))
- Takes the maximum value from each 2x2 region
- Reduces image size by half (48x48 → 24x24)
- Makes the network more efficient and helps it recognize patterns regardless of exact position
Dropout:
layers.Dropout(0.25)
- Randomly "turns off" 25% of neurons during training
- Prevents overfitting (memorizing training data instead of learning patterns)
- Makes sure the model learns "conceptually" rather than memorizing the training data
Block 2 & 3:
- Same pattern but with more filters (64, then 128)
- Each block detects more complex patterns
- Image gets progressively smaller but more "abstract"
Dense Layers (Final Classification):
layers.Flatten() # Converts 2D feature maps to 1D
layers.Dense(512, activation='relu') # Fully connected layer
layers.Dense(num_classes, activation='softmax') # Final prediction
- Flatten: Converts the 2D feature maps into a 1D list
- Dense layers: Fully connected neurons that make the final decision
- Softmax: Converts raw scores into probabilities (all emotions sum to 100%)
- Convolutional layers: Detect spatial patterns (features that appear together)
- Pooling: Makes the network robust to small shifts/rotations
- Dropout: Prevents memorization
- Multiple blocks: Learns hierarchical features (simple → complex)
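To see how those pieces stack up, here is a hedged sketch of a three-block CNN in Keras along the lines described above. The exact filter counts, dropout rates, and layer order in model.py may differ.

```python
from tensorflow.keras import layers, models

def build_model(num_classes, input_shape=(48, 48, 1)):
    model = models.Sequential([
        layers.Input(shape=input_shape),          # 48x48 grayscale face crops
        # Block 1: basic feature detection (edges, corners)
        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        # Block 2: more filters, more complex patterns
        layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        # Block 3: highest-level, most abstract features
        layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        # Classification head
        layers.Flatten(),
        layers.Dense(512, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
```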
Now that we have our model, it's time to train! The training module orchestrates the entire machine learning pipeline, from loading data to saving a trained model.
1. Data Discovery:
self.emotions, self.emotion_to_idx, self.idx_to_emotion = discover_emotions(train_dir, test_dir)
- Automatically finds all emotion categories in your data folders
- Creates mappings: emotion name ↔ number (required for the model)
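Conceptually, this step just reads the folder names under data/train/ and data/test/. The real discover_emotions in this repo may differ, but a sketch of what such a function might do:

```python
import os

def discover_emotions(train_dir, test_dir):
    """Find emotion categories from folder names and build label mappings.

    Illustrative sketch; the actual implementation in this repo may differ.
    """
    emotions = sorted(
        d for d in os.listdir(train_dir)
        if os.path.isdir(os.path.join(train_dir, d))
        and os.path.isdir(os.path.join(test_dir, d))
    )
    emotion_to_idx = {name: i for i, name in enumerate(emotions)}
    idx_to_emotion = {i: name for i, name in enumerate(emotions)}
    return emotions, emotion_to_idx, idx_to_emotion

# e.g. (['angry', 'happy', 'sad'], {'angry': 0, ...}, {0: 'angry', ...})
print(discover_emotions('data/train', 'data/test'))
```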
2. Data Loading:
X_train, y_train, X_test, y_test = load_datasets(...)
- Loads images and labels from data/train/ and data/test/
- Each image is preprocessed (resized, normalized) using OpenCV functions
- Returns numpy arrays ready for training
3. Data Augmentation:
datagen = get_data_augmentation()
- Creates variations of training images: rotated, shifted, flipped
- Helps the model learn to recognize emotions in different conditions
- Provides a greater variety of training data
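A typical Keras augmentation setup looks something like the sketch below; the specific ranges are illustrative guesses, not necessarily what get_data_augmentation uses in this repo.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def get_data_augmentation():
    # Each parameter controls how far a training image may be randomly perturbed
    return ImageDataGenerator(
        rotation_range=15,        # rotate up to +/- 15 degrees
        width_shift_range=0.1,    # shift horizontally by up to 10%
        height_shift_range=0.1,   # shift vertically by up to 10%
        zoom_range=0.1,           # zoom in/out by up to 10%
        horizontal_flip=True,     # mirror the face left/right
    )
```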
4. Class Weights:
class_weights = compute_class_weight('balanced', ...)
- Handles imbalanced datasets (e.g., more "happy" than "disgust" images)
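Here is roughly how those weights can be computed and turned into the dictionary Keras expects. The toy labels below are made up purely to show the effect:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels: class 0 ("happy") is far more common than class 1 ("disgust")
y_train = np.array([0, 0, 0, 0, 0, 0, 1, 1])

weights = compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)
class_weight_dict = dict(enumerate(weights))
print(class_weight_dict)   # rare classes get larger weights, e.g. {0: 0.67, 1: 2.0}
```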
5. Model Training:
history = model.fit(
datagen.flow(X_train, y_train_cat, batch_size=batch_size),
epochs=epochs,
validation_data=(X_test, y_test_cat),
callbacks=callbacks,
class_weight=class_weight_dict
)
Key Concepts:
- Batch: Processes multiple images at once (64 at a time) for efficiency
- Epoch: One complete pass through all training data
- Validation: Tests on unseen data to check if model is learning correctly
- Callbacks: Automatically saves best model, stops early if no improvement, adjusts learning rate
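The callbacks mentioned above usually come from tf.keras.callbacks. A hedged example of what a callback list like the one passed to model.fit might contain (the filename, monitored metrics, and patience values here are assumptions, not necessarily what train.py uses):

```python
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau

callbacks = [
    # Keep only the weights that score best on the validation set
    ModelCheckpoint("models/best_cnn_model.keras", monitor="val_accuracy", save_best_only=True),
    # Stop training if validation loss hasn't improved for 10 epochs
    EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True),
    # Halve the learning rate when validation loss plateaus
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6),
]
```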
6. Model Saving:
- Saves the trained model in .keras format
- Saves emotion mappings so you know which number = which emotion
- Saves metadata (accuracy, emotions, etc.)
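Continuing from the training snippet above (it reuses model, history, emotions, and idx_to_emotion from the earlier steps), saving the model plus its mappings can be as simple as the sketch below. The metadata filename and fields are examples, not necessarily what train.py produces:

```python
import json

# Save the trained network itself in the .keras format
model.save("models/best_cnn_model.keras")

# Save the label mappings and some metadata next to it (example filename)
metadata = {
    "emotions": emotions,                # e.g. ["angry", "happy", "sad"]
    "idx_to_emotion": idx_to_emotion,    # e.g. {0: "angry", 1: "happy", 2: "sad"}
    "val_accuracy": float(history.history["val_accuracy"][-1]),
}
with open("models/model_metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```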
Image preprocessing is used for two reasons:
- To ensure that the model receives data in the same format each time, no matter which image is used.
- To make processes that use image data, such as training and live detection, more efficient.
Image Preprocessing:
def preprocess_image(img_path, img_size=(48, 48), ...):
img = cv2.imread(img_path) # Read image
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # Convert to grayscale
img = cv2.resize(img, img_size) # Resize to 48x48
img = img.astype('float32') / 255.0 # Normalize to 0-1
Why Each Step?
Reading Images:
- cv2.imread(): Loads an image from a file into a numpy array
- Images are stored as arrays of pixel values
Grayscale Conversion:
- Color doesn't help emotion detection (facial expressions are about shape, not color)
- Grayscale reduces data size (3 channels → 1 channel)
Resizing:
- All images must be the same size for the CNN (48x48 pixels)
- cv2.resize() uses interpolation to resize without losing too much detail
- Smaller images = faster training
Normalization:
- Converts pixel values from 0-255 to 0.0-1.0
- Neural networks train better with normalized data
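At prediction time, the same steps run on the face crop from the webcam rather than on a file. A small sketch of getting one grayscale face region into the (1, 48, 48, 1) shape the CNN expects; face_roi here is a placeholder standing in for the crop produced by the Haar cascade step:

```python
import cv2
import numpy as np

# face_roi: a grayscale face crop from the Haar cascade step (placeholder array here)
face_roi = np.zeros((120, 110), dtype=np.uint8)

face = cv2.resize(face_roi, (48, 48))       # match the training image size
face = face.astype("float32") / 255.0       # normalize to 0.0-1.0
face = face.reshape(1, 48, 48, 1)           # batch of 1, one grayscale channel
# model.predict(face) would now return one probability per emotion
print(face.shape)
```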
Tips for collecting your own training data:
- Collect diverse data: Different lighting, angles, expressions
- Balance your dataset: Equal number of images per emotion
- Quality over quantity: 50 good images > 200 poor images
- Test regularly: Validate your model during training
- Article about emotion recognition
- OpenCV article about facial emotion recognition
- Kaggle notebook for building a CNN
MIT License - feel free to use this in your projects!