A high-performance C++ application for generating 3D point clouds from stereo camera images using GPU acceleration (CUDA for NVIDIA or HIP for AMD GPUs).
```bash
# Quick setup and run
git clone https://github.com/username/stereo-vision-app.git
cd stereo-vision-app
./setup_dev_environment.sh
./run.sh
```

- One-command startup: `./run.sh up` launches the stack with sensible defaults.
- Browser-based GUI (noVNC) option works cross-platform; native X11 GUI also available on Linux.
- Consistent dev/prod environment via Docker and Compose profiles.
- GPU-ready: toggle NVIDIA/AMD with ENABLE_CUDA/ENABLE_HIP at build time.
- Persistent data and logs: host directories ./data and ./logs are bind-mounted to /app/data and /app/logs and survive rebuilds/restarts.
- Reproducible builds with BuildKit caching and clean isolation from host toolchains.
- 📋 Manual Calibration Wizard: Interactive step-by-step calibration guide (✅ Now Available!)
- 🤖 AI Auto-Calibration: Intelligent automatic calibration with quality assessment (✅ Fully Functional)
- 🧠 Enhanced Neural Matcher: Real AI stereo matching with ONNX Runtime integration (✅ Just Added!)
- ⚡ Multi-Model Support: HITNet, RAFT-Stereo, CREStereo with adaptive selection
- 🚀 TensorRT Optimization: GPU-accelerated neural inference for maximum performance
- Real-time Stereo Vision: GPU-accelerated stereo matching algorithms
- ⚡ Live Processing: Real-time disparity mapping and 3D reconstruction
- Webcam Capture Integration: Direct capture from USB/built-in cameras with device selection
- Live Camera Preview: Real-time preview from left and right cameras
- Synchronized Capture: Capture perfectly synchronized stereo image pairs
- 🎯 Single Camera Mode: Manual stereo capture workflow for single camera setups
- Cross-Platform GPU Support: NVIDIA CUDA and AMD HIP backends
- 3D Point Cloud Generation: Convert disparity maps to dense point clouds
- 📊 Performance Monitoring: Real-time FPS and processing quality metrics
- Interactive GUI: User-friendly interface for parameter tuning and visualization
- Multiple Export Formats: Support for PLY, PCD, and other point cloud formats
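Point cloud generation rests on the standard stereo reprojection equations: depth Z = f·B/d, with X and Y recovered from the pixel coordinates and the principal point. A minimal sketch in plain C++ follows; the pinhole model is standard, but the function and type names are illustrative, not the application's API:

```cpp
#include <cmath>

struct Point3 { double x, y, z; };

// Reproject one pixel with disparity d into camera space:
// Z = f*B/d, X = (u-cx)*Z/f, Y = (v-cy)*Z/f.
// focal_px: focal length in pixels; baseline_m: camera separation in meters.
// Returns false when disparity is non-positive (no valid depth).
bool reprojectPixel(double u, double v, double disparity,
                    double focal_px, double baseline_m,
                    double cx, double cy, Point3& out) {
    if (disparity <= 0.0) return false;          // invalid match
    out.z = focal_px * baseline_m / disparity;   // depth along optical axis
    out.x = (u - cx) * out.z / focal_px;
    out.y = (v - cy) * out.z / focal_px;
    return true;
}
```

In the real pipeline this runs per pixel over the disparity map, typically via a reprojection matrix on the GPU; the scalar form above shows the underlying geometry.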
Comprehensive interactive calibration wizard with professional-grade features:
- Step-by-step guided workflow with clear instructions at each stage
- Live pattern detection with visual feedback and quality assessment
- Multiple calibration patterns (chessboard, circles, asymmetric circles)
- Real-time quality metrics for optimal frame selection
- Interactive frame review with thumbnail gallery and detailed analysis
- Professional results display with comprehensive calibration data
- Start Camera: Camera → Start Left Camera
- Launch Wizard: Process → Calibrate Cameras (Ctrl+C)
- Configure Pattern: Set your calibration pattern type and dimensions
- Capture Frames: Follow guided frame capture with live feedback
- Review Quality: Examine captured frames and remove poor quality ones
- Generate Results: Automatic calibration computation with error analysis
Advanced AI-powered calibration system that automatically detects and captures optimal calibration frames:
- Automatic Chessboard Detection: Real-time detection with quality assessment
- Intelligent Frame Selection: AI selects frames with optimal pose diversity
- Quality Metrics: Multi-factor quality scoring (sharpness, coverage, uniformity)
- Progress Monitoring: Real-time feedback on calibration progress
- Single & Stereo Support: Works with both single camera and stereo camera setups
- Configurable Parameters: Adjustable quality thresholds and capture settings
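One of the quality factors listed above, sharpness, is commonly scored as the variance of a Laplacian response: blurry frames give a low variance, crisp chessboard frames a high one. A self-contained sketch of that proxy metric (a common convention, not necessarily the exact metric this app uses):

```cpp
#include <vector>
#include <cstddef>

// Variance of a 3x3 Laplacian response over a grayscale image (row-major,
// w*h values). Higher variance indicates a sharper frame.
double laplacianVariance(const std::vector<double>& img, int w, int h) {
    std::vector<double> resp;
    resp.reserve(static_cast<std::size_t>(w) * h);
    for (int y = 1; y + 1 < h; ++y)
        for (int x = 1; x + 1 < w; ++x) {
            double c = img[y * w + x];
            // 4-neighbor discrete Laplacian
            double lap = img[(y - 1) * w + x] + img[(y + 1) * w + x]
                       + img[y * w + x - 1] + img[y * w + x + 1] - 4.0 * c;
            resp.push_back(lap);
        }
    if (resp.empty()) return 0.0;                // image too small
    double mean = 0.0;
    for (double r : resp) mean += r;
    mean /= resp.size();
    double var = 0.0;
    for (double r : resp) var += (r - mean) * (r - mean);
    return var / resp.size();
}
```

A calibration selector can threshold this score to reject motion-blurred frames before they pollute the optimization.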
- Start Capture: Begin webcam capture from configured cameras
- Launch AI Calibration: Process → AI Auto-Calibration (Ctrl+Alt+C)
- Position Chessboard: Move 9x6 chessboard through various positions and orientations
- Automatic Collection: AI automatically captures 20+ optimal frames
- Calibration Complete: Parameters automatically calculated and ready for use
| Feature | 📋 Manual Wizard | 🤖 AI Auto-Calibration |
|---|---|---|
| User Control | ✅ Full control over each frame | ⚡ Automated frame selection |
| Pattern Support | ✅ Multiple pattern types | 🔧 Chessboard only |
| Learning Curve | 📚 Educational, step-by-step | 🚀 Instant results |
| Quality Control | 🎯 Manual frame review | 🤖 AI quality assessment |
| Time Required | ⏱️ 5-10 minutes | ⚡ 2-3 minutes |
| Best For | 📖 Learning, precision control | 🏃 Quick setup, beginners |
Recommendation: Use the Manual Wizard for learning calibration concepts and precise control, or AI Auto-Calibration for quick, reliable results.
Revolutionary AI-powered stereo matching with real neural network inference capabilities:
- Real Neural Network Inference: Genuine ONNX Runtime integration (no more placeholders!)
- Multiple Model Architecture Support: HITNet (speed), RAFT-Stereo (accuracy), CREStereo (balanced)
- Adaptive Backend Selection: TensorRT optimization with intelligent CPU/GPU fallback
- Professional Model Management: Automatic model loading, validation, and error handling
- Production-Ready Implementation: Comprehensive logging, error handling, and performance monitoring
- HITNet: High-speed inference optimized for real-time applications
- RAFT-Stereo: Maximum accuracy for precision-critical scenarios
- CREStereo: Balanced performance for general-purpose stereo matching
- Custom Models: Extensible architecture for additional ONNX-compatible models
- Model Selection: Choose neural model based on speed/accuracy requirements
- Automatic Setup: Model manager handles loading and optimization
- Real-time Inference: Process stereo pairs with genuine AI acceleration
- Quality Monitoring: Live performance metrics and quality assessment
- Fallback Support: Seamless fallback to traditional methods if needed
- ONNX Runtime 1.15+: Industry-standard neural inference engine
- TensorRT Integration: NVIDIA GPU optimization for maximum performance
- Smart Memory Management: Efficient model caching and memory optimization
- Error Recovery: Robust error handling with graceful degradation
- Cross-Platform Support: Windows, Linux, macOS with unified API
Recommendation: Use HITNet for real-time applications, RAFT-Stereo for maximum accuracy, or CREStereo for balanced performance.
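The adaptive backend selection with fallback described above amounts to probing each backend in preference order and keeping the first one that initializes. A hedged sketch of that pattern; all names here are illustrative, not the application's actual API:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical backend descriptor: a name plus an initializer that
// reports whether the backend is usable on this machine.
struct Backend {
    std::string name;
    std::function<bool()> init;
};

// Probe backends in preference order (e.g. TensorRT -> CUDA -> CPU) and
// return the first usable one; "none" tells the caller to fall back to
// traditional (non-neural) stereo matching.
std::string selectBackend(const std::vector<Backend>& preferred) {
    for (const auto& b : preferred)
        if (b.init()) return b.name;   // first usable backend wins
    return "none";
}
```

The same shape handles model-load failures: if the chosen backend cannot load a model, re-run the probe without it.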
Real-time stereo vision processing with live disparity mapping and 3D point cloud generation:
- Real-time Disparity Maps: Live computation during webcam capture
- 3D Point Cloud Generation: Instant 3D reconstruction with color mapping
- Performance Monitoring: Live FPS tracking and queue management
- GPU Acceleration: Automatic CUDA/HIP acceleration with CPU fallback
- Interactive Parameters: Real-time adjustment of processing parameters
- Quality Indicators: Live feedback on processing quality and performance
- Complete Calibration: Ensure cameras are calibrated (manual or AI)
- Start Live Processing: Process → Toggle Live Processing (Ctrl+Shift+P)
- View Live Results: Switch to "Live Processing" tab for real-time view
- Monitor Performance: Watch FPS and quality metrics in status bar
- Adjust Parameters: Use parameter panel for real-time fine-tuning
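The FPS figure shown in the status bar can be computed with a sliding window over frame timestamps. A minimal sketch of that idea (the app's actual implementation is not documented here); the caller supplies timestamps in seconds, so the same logic plugs into a `std::chrono`-based capture loop:

```cpp
#include <deque>

// Sliding-window FPS estimate from frame timestamps (seconds).
class FpsCounter {
public:
    explicit FpsCounter(double window_s = 1.0) : window_s_(window_s) {}

    // Record one frame and return the current FPS estimate.
    double tick(double t) {
        stamps_.push_back(t);
        // Drop frames that fell outside the averaging window.
        while (!stamps_.empty() && t - stamps_.front() > window_s_)
            stamps_.pop_front();
        if (stamps_.size() < 2) return 0.0;
        double span = stamps_.back() - stamps_.front();
        return span > 0.0 ? (stamps_.size() - 1) / span : 0.0;
    }

private:
    double window_s_;
    std::deque<double> stamps_;
};
```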
The application now supports direct webcam capture for real-time stereo vision processing:
- Camera Device Detection: Automatically detect available USB and built-in cameras
- Dual Camera Setup: Configure separate left and right camera devices
- Single Camera Mode: Use one camera for manual stereo capture (move camera between shots)
- Live Preview: Real-time preview from both cameras simultaneously
- Synchronized Capture: Capture perfectly timed stereo image pairs
- Device Testing: Test camera connections before starting capture
- Flexible Configuration: Support for different camera resolutions and frame rates
- Robust Error Handling: Clear feedback on connection issues and permissions
- Select Cameras: Use File → Select Cameras... to configure camera devices
  - Choose different cameras for left and right channels for true stereo
  - OR choose the same camera for both to enable single camera manual stereo mode
  - Test camera connections with live preview
  - Configure camera parameters if needed
- Start Live Capture: Use File → Start Webcam Capture (Ctrl+Shift+S)
  - Live preview appears in the image display tabs
  - Dual camera mode: Both cameras stream at ~30 FPS
  - Single camera mode: Same feed shows in both panels for manual positioning
  - Real-time feedback on capture status
- Capture Images: Multiple capture options available
  - Capture Left Image (L key): Save current camera frame as left image
  - Capture Right Image (R key): Save current camera frame as right image
  - Capture Stereo Pair (Space key): Save synchronized stereo pair
  - Single Camera: Move camera between left/right captures for stereo pairs
- Stop Capture: Use File → Stop Webcam Capture (Ctrl+Shift+T)
- Ctrl+Shift+C: Open camera selector dialog
- Ctrl+Shift+S: Start webcam capture
- Ctrl+Shift+T: Stop webcam capture
- L: Capture left image (during capture)
- R: Capture right image (during capture)
- Space: Capture synchronized stereo pair
- Supported Formats: PNG, JPEG, BMP, TIFF for captured images
- Frame Rate: Up to 30 FPS live preview (hardware dependent)
- Resolution: Automatic detection of optimal camera resolution
- Synchronization: Frame-level synchronization for stereo pairs
- File Naming: Automatic timestamped file naming for captured images
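Timestamped file naming of the kind described above is straightforward with `std::chrono` and `std::put_time`. A sketch follows; the exact pattern (`prefix_YYYYMMDD_HHMMSS_mmm.ext`) is illustrative, not necessarily what the app writes:

```cpp
#include <chrono>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Builds names like "left_20250101_120000_042.png" from a prefix and a
// UTC time point, with millisecond precision for stereo-pair ordering.
std::string timestampedName(const std::string& prefix,
                            std::chrono::system_clock::time_point tp,
                            const std::string& ext = "png") {
    using namespace std::chrono;
    std::time_t tt = system_clock::to_time_t(tp);
    auto ms = duration_cast<milliseconds>(tp.time_since_epoch()) % 1000;
    std::tm tm{};
#if defined(_WIN32)
    gmtime_s(&tm, &tt);
#else
    gmtime_r(&tt, &tm);
#endif
    std::ostringstream os;
    os << prefix << '_' << std::put_time(&tm, "%Y%m%d_%H%M%S")
       << '_' << std::setfill('0') << std::setw(3) << ms.count()
       << '.' << ext;
    return os.str();
}
```

Sorting such names lexicographically also sorts them chronologically, which keeps captured stereo pairs adjacent on disk.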
- Left Mouse + Drag: Rotate view around the point cloud
- Right Mouse + Drag: Pan the camera view
- Mouse Wheel: Zoom in/out with smooth scaling
- Double Click: Reset view to default position
- R: Reset view to default position
- 1: Front view
- 2: Side view
- 3: Top view
- A: Toggle auto-rotation animation
- G: Toggle grid display
- X: Toggle coordinate axes
- Statistical Outlier Removal: Removes noisy points based on statistical analysis
- Voxel Grid Filtering: Downsamples point cloud to reduce noise and improve performance
- Radius Outlier Removal: Removes isolated points based on neighborhood density
- Real-time Preview: See filtering effects immediately
- Adjustable Parameters: Fine-tune filtering strength
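Voxel grid filtering buckets points into cubic cells of a chosen leaf size and replaces each occupied cell by the centroid of its points. A standard-library sketch of the idea (PCL's `VoxelGrid` is the production version; the hashing scheme here is illustrative):

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Pt { double x, y, z; };

// Downsample a cloud by averaging all points that share a voxel cell.
std::vector<Pt> voxelDownsample(const std::vector<Pt>& in, double leaf) {
    struct Acc { double x = 0, y = 0, z = 0; int n = 0; };
    auto key = [leaf](const Pt& p) {
        auto cell = [leaf](double v) {
            return static_cast<std::int64_t>(std::floor(v / leaf));
        };
        // Pack three 21-bit cell indices into one 64-bit map key.
        return (cell(p.x) & 0x1FFFFF) | ((cell(p.y) & 0x1FFFFF) << 21)
             | ((cell(p.z) & 0x1FFFFF) << 42);
    };
    std::unordered_map<std::int64_t, Acc> cells;
    for (const auto& p : in) {
        Acc& a = cells[key(p)];
        a.x += p.x; a.y += p.y; a.z += p.z; ++a.n;
    }
    std::vector<Pt> out;
    out.reserve(cells.size());
    for (const auto& kv : cells) {
        const Acc& a = kv.second;
        out.push_back({a.x / a.n, a.y / a.n, a.z / a.n});
    }
    return out;
}
```

A larger leaf size downsamples more aggressively, which is the "filtering strength" knob mentioned above.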
- RGB Color Mode: Display original colors from stereo cameras
- Depth Color Mode: Color-code points by distance (blue=near, red=far)
- Height Color Mode: Color-code points by Y-coordinate
- Intensity Mode: Grayscale visualization based on brightness
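The depth color mode maps distance onto a blue-to-red ramp (near = blue, far = red, per the convention above). A minimal sketch of one such linear mapping; the app's exact ramp may differ:

```cpp
#include <algorithm>
#include <array>

// Map depth z in [z_near, z_far] to an {r, g, b} triple in [0, 255]:
// near points come out blue, far points red, with linear blending between.
std::array<int, 3> depthToColor(double z, double z_near, double z_far) {
    double t = (z - z_near) / (z_far - z_near);
    t = std::clamp(t, 0.0, 1.0);                        // saturate outside range
    int r = static_cast<int>(255.0 * t + 0.5);          // far -> red
    int b = static_cast<int>(255.0 * (1.0 - t) + 0.5);  // near -> blue
    return {r, 0, b};
}
```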
- Quality Levels: Fast/Medium/High rendering quality
- Smooth Shading: Enhanced visual quality with lighting
- Adaptive Point Size: Automatically adjust point size based on distance
- Level-of-Detail: Optimize rendering for large point clouds
- Point Count: Total number of points in cloud
- Depth Range: Minimum and maximum depth values
- Noise Level: Percentage of potentially noisy points
- Bounding Box: 3D dimensions of the point cloud
- Memory Usage: Real-time memory consumption
- PLY Format: Binary and ASCII variants
- PCD Format: Point Cloud Data format
- XYZ Format: Simple coordinate format
- Image Export: Save current view as image
- Video Recording: Capture rotating animations
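Of the formats listed above, ASCII PLY has a particularly simple layout: a small header declaring the vertex count and properties, then one line per point. A minimal writer sketch (binary PLY and color attributes omitted for brevity; names are illustrative):

```cpp
#include <sstream>
#include <string>
#include <vector>

struct P3 { double x, y, z; };

// Serialize an XYZ point cloud as ASCII PLY.
std::string toAsciiPly(const std::vector<P3>& pts) {
    std::ostringstream os;
    os << "ply\nformat ascii 1.0\n"
       << "element vertex " << pts.size() << "\n"
       << "property float x\nproperty float y\nproperty float z\n"
       << "end_header\n";
    for (const auto& p : pts)
        os << p.x << ' ' << p.y << ' ' << p.z << '\n';   // one vertex per line
    return os.str();
}
```

The same skeleton extends to colored clouds by declaring `property uchar red/green/blue` and appending the RGB values per vertex.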
- 3D Reconstruction: Build detailed 3D models from stereo images
- Robotics: Navigation and obstacle detection
- AR/VR: Content creation for immersive experiences
- Research: Academic and industrial computer vision projects
- Quality Control: Dimensional analysis and inspection
This project supports both NVIDIA and AMD GPUs:
- NVIDIA GPUs: Uses CUDA for acceleration
- AMD GPUs: Uses ROCm/HIP for acceleration
- CPU Fallback: Automatic fallback to CPU-only mode if no GPU is detected
- OpenCV (>= 4.5): Computer vision and image processing with CUDA/OpenCL support
- PCL (Point Cloud Library >= 1.12): Point cloud processing and visualization
- Qt6 (>= 6.0): GUI framework with modern Windows 11 styling
- VTK (>= 9.0): Visualization toolkit (dependency of PCL)
- CMake (>= 3.18): Build system with AI/ML integration support
- ONNX Runtime (>= 1.15): Neural network inference engine for stereo matching
- TensorRT (>= 8.5, Optional): NVIDIA GPU acceleration for neural models
- OpenCV DNN (>= 4.5): Deep neural network support for enhanced stereo vision
- NVIDIA: CUDA Toolkit (>= 11.0) with TensorRT for optimal neural model performance
- AMD: ROCm (>= 5.0) with HIP support for GPU acceleration
📋 Complete setup instructions for Ubuntu, Windows, and macOS are available in docs/SETUP_REQUIREMENTS.md
```bash
# Run the main setup script (auto-detects NVIDIA)
./setup_dev_environment.sh
```

```bash
# First run basic setup
./setup_dev_environment.sh

# Then run AMD-specific setup
./setup_amd_gpu.sh
```

```bash
# Install OpenCV
sudo apt update
sudo apt install libopencv-dev

# Install PCL and VTK
sudo apt install libpcl-dev libvtk9-dev

# Install Qt6
sudo apt install qt6-base-dev qt6-opengl-dev qt6-opengl-widgets-dev

# Install additional dependencies
sudo apt install libboost-all-dev libeigen3-dev libglew-dev

# For NVIDIA: Install CUDA (follow NVIDIA's official guide)
# For AMD: Install ROCm (see setup_amd_gpu.sh)
```

```bash
# Auto-detects GPU and builds accordingly
./build.sh
```

- `./run.sh` - Build and run with GUI (default, at project root)
- `./build_scripts/build.sh` - Build only
- `./build_scripts/build_amd.sh` - AMD/HIP specific build
- `./build_scripts/build_debug.sh` - Debug build with symbols
```bash
# NVIDIA (CUDA + TensorRT)
mkdir build && cd build
cmake .. -DUSE_CUDA=ON -DUSE_HIP=OFF -DWITH_ONNX=ON -DWITH_TENSORRT=ON
make -j$(nproc)
```

```bash
# AMD (HIP)
mkdir build && cd build
cmake .. -DUSE_CUDA=OFF -DUSE_HIP=ON -DWITH_ONNX=ON -DWITH_TENSORRT=OFF
make -j$(nproc)
```

```bash
# CPU-only
mkdir build && cd build
cmake .. -DUSE_CUDA=OFF -DUSE_HIP=OFF -DWITH_ONNX=ON -DWITH_TENSORRT=OFF
make -j$(nproc)
```

```bash
./stereo_vision_app
```

- Print a checkerboard pattern (9x6 recommended)
- Capture calibration images with both cameras
- Use the calibration tool to compute camera parameters
- Load calibrated camera parameters
- Capture or load stereo image pairs
- Adjust stereo matching parameters
- Generate and export point cloud
```
computer-vision/              # 🎯 Clean, modern project structure with AI/ML integration
├── 📁 src/                   # Source code
│   ├── core/                 # Core algorithms (stereo, calibration)
│   ├── ai/                   # Neural network implementations (Enhanced Neural Matcher)
│   ├── gui/                  # Qt GUI components with modern Windows 11 styling
│   ├── gpu/                  # GPU acceleration (CUDA/HIP)
│   ├── multicam/             # Multi-camera system
│   └── utils/                # Utility functions
├── 📁 include/               # Header files (mirrors src/)
│   ├── ai/                   # Neural stereo matching (enhanced_neural_matcher.hpp)
│   ├── gui/                  # GUI component headers
│   ├── multicam/             # Multi-camera headers
│   └── benchmark/            # Performance benchmarking
├── 📁 tests/                 # Unit and integration tests
├── 📁 test_programs/         # 🧪 Standalone test utilities
│   └── README.md             # Test program guide
├── 📁 documentation/         # 📖 Organized project documentation
│   ├── features/             # Feature implementation docs
│   ├── build/                # Build system documentation
│   └── setup/                # Environment setup guides
├── 📁 build_scripts/         # ⚙️ Build and utility scripts
│   ├── build*.sh             # Various build configurations
│   ├── setup*.sh             # Environment setup scripts
│   └── README.md             # Script documentation
├── 📁 reports/               # 📊 Generated reports and benchmarks
│   └── benchmarks/           # Performance benchmark results
├── 📁 archive/               # Historical documentation and temp files
│   ├── milestone_docs/       # Completed milestone documentation
│   └── temp_tests/           # Completed Priority 2 test implementations
├── 📁 data/                  # Sample data and calibration files
├── 📁 docs/                  # Technical documentation
├── 📁 logs/                  # 📋 Build and runtime logs
├── 📁 scripts/               # Utility scripts
├── 📁 cmake/                 # CMake modules
├── 📄 CMakeLists.txt         # Build configuration
├── 📄 README.md              # This file (modern, comprehensive)
├── 📄 PROJECT_MODERNIZATION_STRATEGY.md  # Modernization roadmap
└── 🚀 run.sh                 # Main build and run script
```
- Start Here: README.md → run.sh
- Documentation: documentation/
- Test Hardware: test_programs/
- Build Issues: build_scripts/ → logs/
- Development: src/ → include/
- Performance Reports: reports/benchmarks/
- Project History: archive/milestone_docs/
- 🧠 Enhanced Neural Matcher - Advanced AI stereo matching with multiple model support
- 🚀 ONNX Runtime Integration - Real neural network inference replacing placeholder implementations
- 🎯 Multiple Model Support - HITNet, RAFT-Stereo, CREStereo with adaptive selection
- ⚡ TensorRT Optimization - Optional GPU acceleration for maximum performance
- 🔧 Smart Model Management - Automatic model loading, validation, and fallback handling
- Enhanced Neural Matcher: Real ONNX Runtime integration with production-ready inference
- Model Architecture Support: HITNet (high-speed), RAFT-Stereo (accuracy), CREStereo (balanced)
- Adaptive Backend: TensorRT optimization with CPU/GPU fallback handling
- Professional API: Clean C++ interface with comprehensive error handling and logging
- Neural Network Stereo Matching - TensorRT/ONNX backends with adaptive optimization
- Multi-Camera Support - Synchronized capture and real-time processing
- Professional Installers - Cross-platform packaging framework
- Enhanced Performance Benchmarking - Comprehensive testing with HTML/CSV reports
See archive/milestone_docs/PRIORITY2_COMPLETE.md for full details.
- Enhanced Neural Models: Real-time inference with ONNX Runtime optimization
- Neural Networks: 274 FPS (StereoNet), 268 FPS (PSMNet)
- Multi-Camera: 473 FPS (2 cameras), 236 FPS (4 cameras)
- Latest Reports: Available in reports/benchmarks/