A comprehensive deep learning project that implements an advanced face detection system combining transfer learning with human feedback for continuous improvement. While the model achieves perfect metrics in controlled environments, real-world applications present diverse challenges that require adaptive learning. This project implements RLHF (Reinforcement Learning from Human Feedback) to bridge this gap, creating a system that learns from real-world usage patterns.
The implementation leverages MobileNetV2's architecture as its backbone and performs dual tasks: face detection with confidence scoring and precise bounding box prediction. Through a custom-built GUI application, users can detect faces in real-time and provide feedback on the model's performance. This feedback is systematically collected and analyzed through a two-phase training approach that prioritizes challenging cases, ensuring continuous improvement in real-world scenarios such as varying lighting conditions, different face angles, and occlusions.
What sets this project apart is its end-to-end implementation of the RLHF concept in computer vision, specifically designed to enhance model generalization. While traditional face detection models remain static after training, this system creates a continuous improvement loop where human feedback directly influences model behavior. The implementation includes comprehensive metrics tracking, automated parameter adjustment based on feedback patterns, and a structured approach to model enhancement through grid search optimization and RLHF-based fine-tuning.
The results demonstrate significant improvements in model generalization, with the RLHF-improved model showing enhanced performance particularly in bounding box precision (57% improvement in MSE) and overall loss reduction (64% improvement), while maintaining perfect classification metrics. This approach effectively bridges the gap between laboratory performance and real-world application, creating a more robust and adaptable face detection system.
- Overview
- Project Structure
- Dataset
- Model Architecture
- Training with Grid Search
- RLHF Implementation
- Results
- GUI Application
- Installation
- Usage
- License
This project implements a sophisticated face detection system that combines transfer learning with human feedback for continuous improvement. Built on MobileNetV2's architecture, the system performs dual tasks: face detection with confidence scoring and precise bounding box prediction, achieving robust performance through a carefully designed training pipeline.
The implementation features three key components:
- Transfer Learning Model: Leverages MobileNetV2's pre-trained weights, adapted for face detection through a custom dual-head architecture for classification and bounding box regression.
- RLHF Pipeline: Implements a systematic approach to collect and utilize human feedback, enabling continuous model improvement through a two-phase training strategy.
- Interactive GUI: Provides a user-friendly interface for real-time face detection and feedback collection, creating a seamless loop between model predictions and user interactions.
The project is trained on a balanced dataset of 11,985 images, with a comprehensive evaluation system that tracks both traditional metrics and user feedback. Through the RLHF implementation, the model adapts to challenging cases and improves its performance based on real-world usage.
```
face_detection/
├── Data/
│   ├── Test/
│   │   ├── Images/                  # Test set images
│   │   │   ├── x_y.jpg
│   │   │   └── ...
│   │   └── Labels/                  # Test set annotations
│   │       ├── x_y.json
│   │       └── ...
│   ├── Train/
│   │   ├── Images/                  # Training set images
│   │   │   ├── x_y.jpg
│   │   │   └── ...
│   │   └── Labels/                  # Training set annotations
│   │       ├── x_y.json
│   │       └── ...
│   ├── Validation/
│   │   ├── Images/                  # Validation set images
│   │   │   ├── x_y.jpg
│   │   │   └── ...
│   │   └── Labels/                  # Validation set annotations
│   │       ├── x_y.json
│   │       └── ...
│   └── Data.csv                     # Dataset metadata and specifications
├── feedback/                        # Feedback system
│   ├── criteria.txt                 # Feedback evaluation criteria
│   ├── feedback_data.json           # Collected feedback data
│   ├── feedback_metrics.json        # Feedback analysis metrics
│   └── verify_feedback.py           # Feedback verification tools
├── grid_search_results/             # Hyperparameter optimization
│   ├── combination_1.json           # Individual trial results
│   ├── grid_search_results.csv      # Results summary
│   └── grid_search.log              # Training logs
├── models/                          # Trained models
│   └── face_detection_XXXXXX/       # Model versions
│       ├── best_weights.weights.h5  # Best model weights
│       ├── evaluation_results.png   # Performance visualizations
│       ├── parameters.json          # Model parameters
│       └── training_history.json    # Training metrics
├── results/                         # Evaluation results
│   ├── best_model_improved_results/ # RLHF-improved model results
│   │   ├── orignal_dataset/         # Results on original data
│   │   ├── real_world_dataset/      # Results on real-world tests
│   │   └── rlhf_dataset/            # Results on RLHF data
│   └── rlhf/                        # RLHF analysis
│       └── analysis_feedback.png    # Feedback visualizations
├── rlhf/                            # RLHF implementation
│   ├── data/                        # RLHF training data
│   ├── augmentation.py              # Data augmentation
│   ├── dataset_creator.py           # Dataset management
│   ├── model_improver.py            # Model improvement
│   └── utils.py                     # Utility functions
├── scripts/                         # Training scripts
│   ├── train_gridSearch.py          # Grid search implementation
│   └── train.py                     # Base training script
├── src/                             # Core implementation
│   ├── feedback/                    # Feedback collection
│   ├── gui/                         # GUI implementation
│   ├── model/                       # Model architecture
│   └── utils/                       # Utility functions
└── requirements.txt                 # Project dependencies
```
- **Data Organization**
  - Structured dataset splits with images and labels
  - Comprehensive metadata tracking
  - Standardized annotation format
- **Model Development**
  - Grid search optimization
  - Multiple model versions
  - Training and evaluation scripts
  - Performance tracking
- **RLHF System**
  - Feedback collection and analysis
  - Model improvement pipeline
  - Results visualization
  - Data augmentation
- **User Interface**
  - Interactive GUI application
  - Real-time detection
  - Feedback submission
  - Result visualization
This structure ensures modular development, easy maintenance, and systematic tracking of experiments and improvements.
The project utilizes a carefully curated dataset combining images from two renowned sources: Labeled Faces in the Wild (LFW) and Jack Dataset, creating a balanced collection of 11,985 images for face detection training.
| Set | Total Images | % of Dataset | Face Images | % Faces | No Face Images | % No Faces |
|---|---|---|---|---|---|---|
| Train | 9588 | 80.00% | 4794 | 50.00% | 4794 | 50.00% |
| Test | 1197 | 9.99% | 598 | 49.96% | 599 | 50.04% |
| Validation | 1200 | 10.01% | 600 | 50.00% | 600 | 50.00% |
| Total | 11985 | 100.00% | 5992 | 50.00% | 5993 | 50.00% |
- Image Format: JPG
- Dimensions: 250x250 pixels (standardized)
- Color Space: RGB
- Class Balance: Near-perfect (50% faces, 50% non-faces)
- Split Ratio: 80% train, 10% validation, 10% test
- Annotations: Bounding box coordinates in JSON format
The dataset's balanced nature and diverse composition provide a solid foundation for training a robust face detection model, while its standardized format ensures consistent processing throughout the pipeline.
The face detection model is built using transfer learning with MobileNetV2 as the backbone, implementing a dual-head architecture for simultaneous face classification and bounding box regression. The model is designed to be efficient while maintaining high accuracy in both tasks.
- Backbone: MobileNetV2 (pre-trained on ImageNet)
- Input Shape: 224×224×3 (RGB images)
- Feature Extraction: Global Max Pooling on backbone output
- Trainable Base: False (frozen weights for transfer learning)
- **Classification Branch**:

```
# First Dense Block
Dense(1024) → BatchNorm → ReLU → Dropout
# Second Dense Block
Dense(512) → BatchNorm → ReLU → Dropout
# Output
Dense(1, sigmoid)   # Face / No-Face classification
```
- **Regression Branch**:

```
# First Dense Block
Dense(1024) → BatchNorm → ReLU → Dropout
# Second Dense Block
Dense(512) → BatchNorm → ReLU → Dropout
# Output
Dense(4, sigmoid)   # Bounding box coordinates [x1, y1, x2, y2]
```
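The two branches above can be assembled into a runnable Keras model. The following is a sketch based on the stated specs (frozen MobileNetV2 backbone, global max pooling, 1024/512 dense blocks, L2 of 0.02); the function and argument names are my own, not the project's actual code:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_face_detector(input_shape=(224, 224, 3), dropout_rate=0.5, weights="imagenet"):
    """Dual-head face detector on a frozen MobileNetV2 backbone (illustrative sketch)."""
    backbone = tf.keras.applications.MobileNetV2(
        include_top=False, weights=weights, input_shape=input_shape)
    backbone.trainable = False  # transfer learning: keep pre-trained weights frozen

    inputs = layers.Input(shape=input_shape)
    x = backbone(inputs, training=False)
    x = layers.GlobalMaxPooling2D()(x)

    def dense_block(t, units):
        # Dense → BatchNorm → ReLU → Dropout, as described above
        t = layers.Dense(units, kernel_regularizer=tf.keras.regularizers.l2(0.02))(t)
        t = layers.BatchNormalization()(t)
        t = layers.ReLU()(t)
        return layers.Dropout(dropout_rate)(t)

    # Classification head: face / no-face probability
    c = dense_block(x, 1024)
    c = dense_block(c, 512)
    class_out = layers.Dense(1, activation="sigmoid", name="class")(c)

    # Regression head: normalized [x1, y1, x2, y2]
    r = dense_block(x, 1024)
    r = dense_block(r, 512)
    bbox_out = layers.Dense(4, activation="sigmoid", name="bbox")(r)

    return Model(inputs, [class_out, bbox_out])
```

Both heads branch from the same pooled feature vector, so a single forward pass yields the classification score and the box in one call.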
- **Classification Loss**:
  - Binary cross-entropy with label smoothing (0.1)
  - Helps prevent overconfident predictions
- **Regression Loss**:

```python
import tensorflow as tf

def regression_loss(y_true, y_pred):
    # Squared error on the top-left corner coordinates
    delta_coord = tf.reduce_sum(tf.square(y_true[:, :2] - y_pred[:, :2]))
    # Squared error on box width and height
    h_true = y_true[:, 3] - y_true[:, 1]
    w_true = y_true[:, 2] - y_true[:, 0]
    h_pred = y_pred[:, 3] - y_pred[:, 1]
    w_pred = y_pred[:, 2] - y_pred[:, 0]
    delta_size = tf.reduce_sum(tf.square(w_true - w_pred) + tf.square(h_true - h_pred))
    return delta_coord + delta_size
```
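As a quick sanity check, the same loss can be re-implemented in NumPy and evaluated on a single box: shifting only `x1` by 0.1 contributes 0.01 of corner error and 0.01 of width error.

```python
import numpy as np

def regression_loss_np(y_true, y_pred):
    # NumPy mirror of the TF loss: corner error plus size error
    delta_coord = np.sum((y_true[:, :2] - y_pred[:, :2]) ** 2)
    h_true = y_true[:, 3] - y_true[:, 1]
    w_true = y_true[:, 2] - y_true[:, 0]
    h_pred = y_pred[:, 3] - y_pred[:, 1]
    w_pred = y_pred[:, 2] - y_pred[:, 0]
    delta_size = np.sum((w_true - w_pred) ** 2 + (h_true - h_pred) ** 2)
    return delta_coord + delta_size

y_true = np.array([[0.1, 0.1, 0.5, 0.5]])
y_pred = np.array([[0.2, 0.1, 0.5, 0.5]])  # x1 off by 0.1
print(regression_loss_np(y_true, y_pred))  # ≈ 0.02 (0.01 corner + 0.01 width)
```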
```
# Parameters
learning_rate
batch_size
epochs
class_weight
reg_weight
dropout_rate
```

- **Optimizer**: Adam with exponential learning rate decay

```python
lr_schedule = learning_rate * (decay_rate ** epoch)
```
- **Regularization**:
  - L2 regularization (0.02) on dense layers
  - Dropout (0.5) after each dense layer
  - Batch normalization for stable training
- **Early Stopping**:
  - Monitor: validation total loss
  - Patience: 5 epochs
  - Restore best weights
- Early Stopping: Prevents overfitting
- Model Checkpoint: Saves best weights
- Learning Rate Scheduler: Implements decay
- CSV Logger: Tracks training metrics
- Training Time Tracker: Monitors duration
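Under assumed names (this is a sketch, not the project's actual code), the callback stack above might be wired together like this in Keras:

```python
import tensorflow as tf

def make_callbacks(learning_rate=1e-4, decay_rate=0.9, log_path="training_log.csv"):
    """Early stopping, checkpointing, LR decay, and CSV logging in one list."""
    # Exponential decay: lr = learning_rate * decay_rate ** epoch
    scheduler = tf.keras.callbacks.LearningRateScheduler(
        lambda epoch: learning_rate * decay_rate ** epoch)
    return [
        tf.keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=5, restore_best_weights=True),
        tf.keras.callbacks.ModelCheckpoint(
            "best_weights.weights.h5", save_weights_only=True, save_best_only=True),
        scheduler,
        tf.keras.callbacks.CSVLogger(log_path),
    ]
```

The list would then be passed as `model.fit(..., callbacks=make_callbacks())`.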
```python
def predict(image_path, threshold=0.5):
    # Image preprocessing
    img = load_and_preprocess(image_path)
    # Model prediction
    class_prob, bbox_coords = model(img)
    # Threshold-based detection
    has_face = class_prob >= threshold
    return {
        'has_face': has_face,
        'confidence': class_prob,
        'bbox': bbox_coords if has_face else None
    }
```

The architecture is designed to balance accuracy and efficiency, making it suitable for real-time face detection while maintaining robust performance in both classification and localization tasks.
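`load_and_preprocess` is left undefined in the snippet above; a plausible implementation (assumed: resize to the 224×224 model input, scale pixels to [0, 1], add a batch dimension) would be:

```python
import numpy as np
from PIL import Image

def load_and_preprocess(image_path, target_size=(224, 224)):
    """Load an image, resize to the model input, and scale to [0, 1] (assumed convention)."""
    img = Image.open(image_path).convert("RGB").resize(target_size)
    arr = np.asarray(img, dtype="float32") / 255.0
    return arr[np.newaxis, ...]  # add batch dimension -> shape (1, 224, 224, 3)
```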
To improve the model's performance, we implemented a comprehensive grid search over key hyperparameters. The search explored 24 different combinations (2×3×1×2×2×1×1×1×1 = 24) of parameters to find the most effective configuration.
```python
param_grid = {
    'class_weight': [0.1, 0.2],       # Balance between classification and regression
    'reg_weight': [1.7, 1.8, 1.9],    # Importance of bounding box accuracy
    'learning_rate': [0.0001],        # Initial learning rate
    'batch_size': [96, 128],          # Training batch size
    'dropout_rate': [0.6, 0.7],       # Regularization strength
    'epochs': [25],                   # Maximum training epochs
    'early_stopping_patience': [7],   # Epochs before early stopping
    'reduce_lr_patience': [4],        # Epochs before LR reduction
    'lr_decay_rate': [0.9]            # Learning rate decay factor
}
```

The grid search identified the optimal configuration (Combination ID: 13):
```python
best_params = {
    'class_weight': 0.2,
    'reg_weight': 1.7,
    'learning_rate': 0.0001,
    'batch_size': 96,
    'dropout_rate': 0.6,
    'epochs': 25,
    'early_stopping_patience': 7,
    'reduce_lr_patience': 4,
    'lr_decay_rate': 0.9
}
```

The best model achieved exceptional results:
- **Classification Performance**:
  - Accuracy: 1.0000
  - Loss: 0.2028
  - Precision: 1.0000
  - Recall: 1.0000
  - F1 Score: 1.0000
- **Regression Performance**:
  - MAE: 0.1476
  - MSE: 0.0653
  - RMSE: 0.2556
  - Total Loss: 1.4716
- **Training Characteristics**:
  - Training Time: 7.38 minutes
  - Early Stopping: Yes (at epoch 24)
  - Final Validation Loss: 1.4733
- **Parameter Sensitivity**:
  - Higher class_weight (0.2) improved detection stability
  - Moderate reg_weight (1.7) provided the best bbox accuracy
  - Lower batch size (96) offered better generalization
- **Model Behavior**:
  - Perfect classification accuracy on the validation set
  - Strong bounding box prediction (MAE: 0.1476)
  - Efficient training convergence (early stopping at epoch 24/25)
- **Test Set Performance**:
  - Maintained perfect classification (Accuracy: 1.0)
  - Strong regression metrics (MAE: 0.1500)
  - Robust F1 Score (1.0)
These results demonstrate the effectiveness of the grid search in finding a balanced configuration that excels in both face detection and bounding box regression tasks.
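For reference, the 24 combinations enumerated by the grid search can be reproduced mechanically with `itertools.product` (a generic sketch, not the project's own loop):

```python
from itertools import product

param_grid = {
    'class_weight': [0.1, 0.2],
    'reg_weight': [1.7, 1.8, 1.9],
    'learning_rate': [0.0001],
    'batch_size': [96, 128],
    'dropout_rate': [0.6, 0.7],
    'epochs': [25],
    'early_stopping_patience': [7],
    'reduce_lr_patience': [4],
    'lr_decay_rate': [0.9],
}

# Cartesian product of all value lists -> one dict per trial
keys = list(param_grid)
combinations = [dict(zip(keys, vals)) for vals in product(*param_grid.values())]
print(len(combinations))  # 24 = 2 * 3 * 1 * 2 * 2 * 1 * 1 * 1 * 1
```

Each resulting dict can then be passed directly to a training function as keyword arguments.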
Despite achieving excellent performance in controlled environments through transfer learning and grid search optimization (accuracy: 1.000, precision: 1.000), the model faced challenges in real-world scenarios. The implementation of Reinforcement Learning from Human Feedback (RLHF) aims to bridge this gap, creating a continuous improvement loop that adapts to real-world conditions such as varying lighting, different face angles, diverse image qualities, and occlusions.
The RLHF process follows three stages:
1. **Feedback Collection**: Through a GUI interface where users evaluate model predictions, provide correct bounding boxes, rate performance, and add comments.
2. **Feedback Analysis**: Systematic evaluation of performance patterns, failure modes, user ratings, detection confidence, and bounding box accuracy.
3. **Model Improvement**: Targeted enhancement through automatic strategy determination and priority-based training.
- **GUI Interface**:
  - Custom interface built with CustomTkinter
  - Model and image selection capabilities
  - Real-time face detection visualization
  - Interactive bounding box correction tool
  - Rating system (0-5 scale)
  - Comments section for additional feedback
- **Feedback Collection Process**:
  - Used the best model from the grid search (Combination ID: 13)
  - Selected 100 external images for evaluation
  - For each image:
    - The model makes a prediction
    - The user draws the correct bounding box
    - The user provides a rating and comments
  - Feedback saved in JSON format
- **Feedback Structure**:

```python
feedback_data = {
    'image_path': str,
    'model_name': str,
    'model_prediction': {
        'has_face': bool,
        'confidence': float,
        'bbox': List[float]
    },
    'human_correction': List[float],
    'rating': float,
    'comments': str,
    'timestamp': str,
    'image_size': Tuple[int, int]
}
```

After collecting feedback on 100 images through the GUI interface, the analysis revealed significant insights about the model's performance:
- **Overall Performance Metrics**:

```python
metrics = {
    'total_feedback': 100,
    'average_rating': 2.0,        # Below-average performance
    'average_confidence': 0.622,  # Moderate confidence
    'average_iou': 0.317          # Low IoU score
}
```

- **IoU Distribution by Quality**:

```python
iou_ranges = {
    'excellent': 15,  # IoU >= 0.4
    'good': 24,       # 0.3 <= IoU < 0.4
    'fair': 20,       # 0.2 <= IoU < 0.3
    'poor': 10        # IoU < 0.2
}
```

- **Classification Performance**:

```python
classification_metrics = {
    'true_positives': 39,
    'false_positives': 30,
    'false_negatives': 31,
    'precision': 0.565,
    'recall': 0.557,
    'f1_score': 0.561
}
```

The feedback was collected using a standardized 0-5 rating scale:
| Rating | Criteria |
|---|---|
| 5 | Perfect detection and bbox (100% of face) |
| 4 | Good detection, minor bbox issues (>66% of face) |
| 3 | Correct detection, noticeable bbox issues (>33% of face) |
| 2 | Correct detection, poor bbox (<33% of face) |
| 1 | Poor detection and bbox (<33% face, <50% detection) |
| 0 | Completely wrong (<25% detection, wrong bbox) |
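The IoU figures reported above follow the standard intersection-over-union computation; a minimal sketch (the project's own utility may differ in details such as clipping) is:

```python
def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    # Intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    # Union = sum of areas minus intersection
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A prediction shifted diagonally from the human correction
print(iou([0.2, 0.2, 0.6, 0.6], [0.3, 0.3, 0.7, 0.7]))  # ≈ 0.391, "good" on the scale above
```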
- **Detection Issues**:
  - High false negative rate (31%)
  - Significant false positives (30%)
  - Balanced but low precision-recall trade-off
- **Bounding Box Quality**:
  - Only 15% achieved excellent IoU
  - 44% good or excellent performance
  - 30% fair or poor performance
- **Temporal Trends**:
  - Initial average rating: 2.7
  - Final average rating: 2.2
  - Declining performance on challenging cases
  - IoU fluctuation between 0.25 and 0.45
These findings led to specific strategy adjustments in the improvement phase, particularly focusing on:
- Reducing false negatives
- Improving bounding box precision
- Enhancing confidence calibration
The RLHF implementation employs a systematic approach to improve model performance through feedback analysis and targeted training. The process consists of three main components: automatic strategy determination, phased training implementation, and performance evaluation.
- **Automatic Strategy Determination**
The system analyzes failure patterns in the feedback data to automatically determine the optimal training strategy. It considers four potential scenarios:
- Poor bounding box performance (>40% of failures): Emphasizes regression with higher reg_weight
- Low confidence issues (>40% of failures): Focuses on classification with higher class_weight
- High confidence errors (>40% of failures): Addresses false positives with adjusted learning rate
- Balanced issues: Uses moderate parameters across all aspects
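The four-way decision rule above can be sketched as a small function (names and return labels are assumed for illustration; the actual implementation lives in `rlhf/model_improver.py`):

```python
def choose_strategy(failure_patterns, threshold=0.4):
    """Pick a training focus from feedback failure counts (illustrative sketch)."""
    total = sum(failure_patterns.values())
    share = {k: v / total for k, v in failure_patterns.items()}
    if share.get('poor_bbox', 0) > threshold:
        return 'emphasize_regression'        # raise reg_weight
    if share.get('low_confidence', 0) > threshold:
        return 'emphasize_classification'    # raise class_weight
    if share.get('high_confidence_wrong', 0) > threshold:
        return 'adjust_learning_rate'        # target false positives
    return 'balanced'                        # moderate parameters everywhere

patterns = {'low_confidence': 31, 'high_confidence_wrong': 9,
            'poor_bbox': 25, 'false_positives': 1, 'false_negatives': 31}
print(choose_strategy(patterns))  # 'balanced' — no single pattern exceeds 40%
```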
In this case, the analysis revealed a distributed pattern of issues:

```python
failure_patterns = {
    'low_confidence': 31,
    'high_confidence_wrong': 9,
    'poor_bbox': 25,
    'false_positives': 1,
    'false_negatives': 31
}
```

Since no single failure pattern exceeded the 40% threshold, the system selected a balanced approach with the following parameters:
```python
strategy = {
    'epochs': 40,
    'batch_size': 48,
    'early_stopping_patience': 12,
    'reduce_lr_patience': 5,
    'lr_decay_rate': 0.98,
    'class_weight': 0.5,
    'reg_weight': 2.0,
    'learning_rate': 1e-4,
    'dropout_rate': 0.6
}
```

- **Two-Phase Training Implementation**
The improvement process implements a two-phase training approach to maximize the impact of feedback:
Phase 1 (Priority Training):
- Focuses on samples with ratings ≤ 2 (41 original samples)
- Applies aggressive augmentation (7 variations per sample)
- Emphasizes learning from problematic cases
- Uses 18 validation samples for performance monitoring
Phase 2 (Comprehensive Training):
- Includes all feedback samples (82 original samples)
- Applies standard augmentation (5 variations per sample)
- Ensures balanced learning from all feedback types
- Maintains consistent validation set for comparison
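The two-phase split above can be sketched as a simple filter over feedback samples (field names assumed to match the feedback JSON structure):

```python
def split_phases(feedback, priority_max_rating=2):
    """Phase 1: only low-rated (problematic) samples; Phase 2: all feedback samples."""
    phase1 = [s for s in feedback if s['rating'] <= priority_max_rating]
    phase2 = list(feedback)
    return phase1, phase2

# Toy feedback set: four of seven samples are rated <= 2
feedback = [{'rating': r} for r in [1, 2, 3, 4, 2, 5, 0]]
p1, p2 = split_phases(feedback)
print(len(p1), len(p2))  # 4 7
```

With the actual feedback this selects the 41 priority samples for Phase 1 while Phase 2 trains on all 82; augmentation (7× vs 5× per sample) is applied downstream.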
- **Performance Results**

The implementation achieved significant improvements across all metrics:

```python
final_metrics = {
    # Classification metrics
    'class_accuracy': 1.000,
    'class_precision': 1.000,
    'class_recall': 1.000,
    'f1_score': 1.000,
    # Regression metrics
    'reg_mae': 0.114,
    'reg_mse': 0.028,
    'reg_rmse': 0.167,
    # Overall performance
    'total_loss': 3.110,
    'class_loss': 0.240,
    'reg_loss': 1.495
}
```

These results demonstrate the effectiveness of our RLHF implementation in:
- Achieving perfect classification performance
- Significantly improving bounding box precision
- Maintaining balanced overall performance
- Successfully addressing identified failure patterns
The balanced strategy, automatically determined through feedback analysis, proved highly effective in improving both the classification accuracy and bounding box precision of the model.
The evaluation demonstrates the model's evolution through grid search optimization and RLHF improvement.
Grid search optimization (Combination ID: 13) achieved optimal performance with parameters:
```python
best_params = {
    'class_weight': 0.2,
    'reg_weight': 1.7,
    'learning_rate': 0.0001,
    'batch_size': 96,
    'dropout_rate': 0.6,
    'epochs': 25,
    'early_stopping_patience': 7,
    'reduce_lr_patience': 4,
    'lr_decay_rate': 0.9
}
```

Performance metrics:
```python
grid_search_metrics = {
    # Classification performance
    'test_class_accuracy': 1.000,
    'test_class_precision': 1.000,
    'test_class_recall': 1.000,
    'test_f1_score': 1.000,
    # Regression performance
    'test_reg_mae': 0.150,
    'test_reg_mse': 0.065,
    'test_reg_rmse': 0.255,
    # Overall performance
    'test_total_loss': 8.684,
    'test_class_loss': 0.207,
    'test_reg_loss': 5.084
}
```

After RLHF implementation with the balanced strategy:
```python
rlhf_strategy = {
    'epochs': 40,
    'batch_size': 48,
    'early_stopping_patience': 12,
    'reduce_lr_patience': 5,
    'lr_decay_rate': 0.98,
    'class_weight': 0.5,
    'reg_weight': 2.0,
    'learning_rate': 1e-4,
    'dropout_rate': 0.6
}
```

Final performance:
```python
rlhf_metrics = {
    # Classification performance
    'test_class_accuracy': 1.000,
    'test_class_precision': 1.000,
    'test_class_recall': 1.000,
    'test_f1_score': 1.000,
    # Regression performance
    'test_reg_mae': 0.114,    # 24% improvement
    'test_reg_mse': 0.028,    # 57% improvement
    'test_reg_rmse': 0.167,   # 34% improvement
    # Overall performance
    'test_total_loss': 3.110, # 64% improvement
    'test_class_loss': 0.240,
    'test_reg_loss': 1.495    # 71% improvement
}
```

The RLHF implementation significantly improved the model's performance, particularly in bounding box precision and overall loss reduction, while maintaining perfect classification metrics.
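The improvement percentages quoted above follow directly from the two metric sets, as a percentage reduction relative to the grid-search baseline:

```python
def improvement(before, after):
    """Percentage reduction from the grid-search model to the RLHF model."""
    return round(100 * (before - after) / before)

print(improvement(0.150, 0.114))  # 24  (MAE)
print(improvement(0.065, 0.028))  # 57  (MSE)
print(improvement(8.684, 3.110))  # 64  (total loss)
print(improvement(5.084, 1.495))  # 71  (regression loss)
```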
The GUI application serves as the interface for both face detection and feedback collection, built using CustomTkinter for a modern, user-friendly experience.
- **Model Selection and Configuration**
  - Model loading functionality
  - Detection threshold adjustment (0-1)
  - Real-time parameter updates
- **Image Processing**
  - Image loading and display
  - Real-time face detection
  - Bounding box visualization
  - Detection confidence display
- **Feedback Collection**
  - Toggle feedback mode
  - Interactive bounding box drawing
  - Rating system (0-5 scale)
  - Comments section
  - Automatic feedback storage

Typical workflow:

- **Model Loading**:
  - Click "Select Model"
  - Choose the model directory containing the weights
  - Model name and status are displayed
- **Image Processing**:
  - Click "Select Image"
  - Adjust the detection threshold if needed
  - View detection results:
    - Face detection status
    - Confidence score
    - Bounding box coordinates
- **Feedback Submission**:
  - Enable feedback mode
  - Draw a correction box
  - Rate model performance (0-5)
  - Add optional comments
  - Submit feedback
The interface provides a seamless workflow for both model evaluation and continuous improvement through user feedback.
- Python 3.8+ (3.10.15 recommended for this project)
- CUDA-capable GPU
- Git
- Clone the repository:

```shell
git clone https://github.com/AlvaroVasquezAI/Face_Detection.git
cd Face_Detection
```

- Create and activate a virtual environment:

```shell
# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python3 -m venv venv
source venv/bin/activate
```

- Install dependencies:

```shell
pip install -r requirements.txt
```

- Test GPU support:

```shell
python -c "import tensorflow as tf; print('GPU Available:', tf.config.list_physical_devices('GPU'))"
```

For any installation issues, please refer to:
- **Grid Search Training**:

```shell
python -m scripts.train_gridSearch
```

This will:
- Load and preprocess the dataset
- Perform hyperparameter optimization
- Save results in `grid_search_results_xxxx/`
- **Collect Feedback Data** (required first step):

```shell
python -m src.gui.app
```

Using the GUI:
- Load the best model from the grid search
- Process multiple images (recommended: 100+)
- For each image:
  - Draw correction boxes
  - Rate model performance (0-5)
  - Provide feedback
- Feedback is saved in `feedback/feedback_data.json`
- **Verify Collected Feedback**:

```shell
python -m feedback.verify_feedback
```

This will:
- Display collected feedback visualizations
- Show bounding box comparisons
- Present rating distributions
- **RLHF Training**:

```shell
python -m rlhf.analysis_and_retrain
```

This will:
- Analyze the collected feedback data
- Determine the optimal strategy
- Retrain the model with feedback
- **Launch GUI**:

```shell
python -m src.gui.app
```

- **Load Model**:
  - Click "Select Model"
  - Navigate to the `models/` directory
  - Select the model folder containing `best_weights.weights.h5`
- **Process Images**:
  - Click "Select Image"
  - Adjust the detection threshold if needed (default: 0.5)
  - View results in real-time
- **Provide Feedback**:
  - Enable feedback mode
  - Draw a correction box if needed
  - Rate performance (0-5)
  - Add comments (optional)
  - Submit feedback
MIT License
Copyright (c) 2024 Alvaro Vasquez
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software.
The Software is provided "AS IS", without warranty of any kind. For the full license text, please see the LICENSE file in the repository.
The Labeled Faces in the Wild dataset used in this project is subject to its own licensing terms. Please refer to the LFW dataset website.