Skip to content

A high-performance C++ application for generating 3D point clouds from stereo camera images using GPU acceleration (CUDA for NVIDIA or HIP for AMD GPUs).

License

Notifications You must be signed in to change notification settings

hkevin01/computer-vision

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

72 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Stereo Vision 3D Point Cloud Generator

Build Status C++ CMake OpenCV Qt PCL

CUDA HIP Cross Platform GPU Accelerated

License: MIT Code Style Documentation Tests

Development Status AI Features Contributions Welcome Issues Performance

A high-performance C++ application for generating 3D point clouds from stereo camera images using GPU acceleration (CUDA for NVIDIA or HIP for AMD GPUs).

🚀 Quick Start

Get Started Download Documentation

# Quick setup and run
git clone https://github.com/username/stereo-vision-app.git
cd stereo-vision-app
./setup_dev_environment.sh
./run.sh

✨ Benefits

  • One-command startup: ./run.sh up launches the stack with sensible defaults.
  • Browser-based GUI (noVNC) option works cross-platform; native X11 GUI also available on Linux.
  • Consistent dev/prod environment via Docker and Compose profiles.
  • GPU-ready: toggle NVIDIA/AMD with ENABLE_CUDA/ENABLE_HIP at build time.
  • Persistent data and logs: host directories ./data and ./logs are bind-mounted to /app/data and /app/logs and survive rebuilds/restarts.
  • Reproducible builds with BuildKit caching and clean isolation from host toolchains.

📊 Project Stats

Lines of Code Languages Architecture Test Coverage

Features

Stereo Vision Point Cloud Noise Reduction Interactive 3D Mouse Control Real-time GPU Accelerated Multi-Format 3D Point Cloud Real Time GUI Interface

  • 📋 Manual Calibration Wizard: Interactive step-by-step calibration guide (✅ Now Available!)
  • 🤖 AI Auto-Calibration: Intelligent automatic calibration with quality assessment (✅ Fully Functional)
  • 🧠 Enhanced Neural Matcher: Real AI stereo matching with ONNX Runtime integration (✅ Just Added!)
  • ⚡ Multi-Model Support: HITNet, RAFT-Stereo, CREStereo with adaptive selection
  • 🚀 TensorRT Optimization: GPU-accelerated neural inference for maximum performance
  • Real-time Stereo Vision: GPU-accelerated stereo matching algorithms
  • ⚡ Live Processing: Real-time disparity mapping and 3D reconstruction
  • Webcam Capture Integration: Direct capture from USB/built-in cameras with device selection
  • Live Camera Preview: Real-time preview from left and right cameras
  • Synchronized Capture: Capture perfectly synchronized stereo image pairs
  • 🎯 Single Camera Mode: Manual stereo capture workflow for single camera setups
  • Cross-Platform GPU Support: NVIDIA CUDA and AMD HIP backends
  • 3D Point Cloud Generation: Convert disparity maps to dense point clouds
  • 📊 Performance Monitoring: Real-time FPS and processing quality metrics
  • Interactive GUI: User-friendly interface for parameter tuning and visualization
  • Multiple Export Formats: Support for PLY, PCD, and other point cloud formats

📋 Manual Calibration Wizard ✅ Now Available!

Interactive Step by Step

Comprehensive interactive calibration wizard with professional-grade features:

🎯 Key Features

  • Step-by-step guided workflow with clear instructions at each stage
  • Live pattern detection with visual feedback and quality assessment
  • Multiple calibration patterns (chessboard, circles, asymmetric circles)
  • Real-time quality metrics for optimal frame selection
  • Interactive frame review with thumbnail gallery and detailed analysis
  • Professional results display with comprehensive calibration data

🚀 Quick Start Guide

  1. Start Camera: Camera → Start Left Camera
  2. Launch Wizard: Process → Calibrate Cameras (Ctrl+C)
  3. Configure Pattern: Set your calibration pattern type and dimensions
  4. Capture Frames: Follow guided frame capture with live feedback
  5. Review Quality: Examine captured frames and remove poor quality ones
  6. Generate Results: Automatic calibration computation with error analysis

🤖 AI Auto-Calibration ✅ Fully Functional

AI Powered Auto Detection Quality Assessment Real-time Feedback Production Ready

Advanced AI-powered calibration system that automatically detects and captures optimal calibration frames:

🎯 Features

  • Automatic Chessboard Detection: Real-time detection with quality assessment
  • Intelligent Frame Selection: AI selects frames with optimal pose diversity
  • Quality Metrics: Multi-factor quality scoring (sharpness, coverage, uniformity)
  • Progress Monitoring: Real-time feedback on calibration progress
  • Single & Stereo Support: Works with both single camera and stereo camera setups
  • Configurable Parameters: Adjustable quality thresholds and capture settings

🎮 Usage

  1. Start Capture: Begin webcam capture from configured cameras
  2. Launch AI Calibration: Process → AI Auto-Calibration (Ctrl+Alt+C)
  3. Position Chessboard: Move 9x6 chessboard through various positions and orientations
  4. Automatic Collection: AI automatically captures 20+ optimal frames
  5. Calibration Complete: Parameters automatically calculated and ready for use

� Calibration Methods Comparison

Feature �📋 Manual Wizard 🤖 AI Auto-Calibration
User Control ✅ Full control over each frame ⚡ Automated frame selection
Pattern Support ✅ Multiple pattern types 🔧 Chessboard only
Learning Curve 📚 Educational, step-by-step 🚀 Instant results
Quality Control 🎯 Manual frame review 🤖 AI quality assessment
Time Required ⏱️ 5-10 minutes ⚡ 2-3 minutes
Best For 📖 Learning, precision control 🏃 Quick setup, beginners

Recommendation: Use the Manual Wizard for learning calibration concepts and precise control, or AI Auto-Calibration for quick, reliable results.

The interactive manual calibration wizard is under development and will provide:

  • 📋 Step-by-Step Guidance: Intuitive wizard interface with clear instructions
  • 🎯 Interactive Detection: Live corner detection with manual refinement tools
  • 📊 Quality Visualization: Real-time quality metrics and coverage analysis
  • 🔄 Multiple Patterns: Support for various calibration patterns
  • 💾 Advanced Management: Comprehensive parameter saving and validation

Current Recommendation: Use the AI Auto-Calibration feature for immediate calibration needs.

🧠 Enhanced Neural Matcher ✅ Just Added!

Neural AI ONNX Runtime TensorRT Multi Model Production Ready

Revolutionary AI-powered stereo matching with real neural network inference capabilities:

🎯 Key Features

  • Real Neural Network Inference: Genuine ONNX Runtime integration (no more placeholders!)
  • Multiple Model Architecture Support: HITNet (speed), RAFT-Stereo (accuracy), CREStereo (balanced)
  • Adaptive Backend Selection: TensorRT optimization with intelligent CPU/GPU fallback
  • Professional Model Management: Automatic model loading, validation, and error handling
  • Production-Ready Implementation: Comprehensive logging, error handling, and performance monitoring

🚀 Supported Neural Models

  • HITNet: High-speed inference optimized for real-time applications
  • RAFT-Stereo: Maximum accuracy for precision-critical scenarios
  • CREStereo: Balanced performance for general-purpose stereo matching
  • Custom Models: Extensible architecture for additional ONNX-compatible models

🎮 Usage

  1. Model Selection: Choose neural model based on speed/accuracy requirements
  2. Automatic Setup: Model manager handles loading and optimization
  3. Real-time Inference: Process stereo pairs with genuine AI acceleration
  4. Quality Monitoring: Live performance metrics and quality assessment
  5. Fallback Support: Seamless fallback to traditional methods if needed

🔧 Technical Implementation

  • ONNX Runtime 1.15+: Industry-standard neural inference engine
  • TensorRT Integration: NVIDIA GPU optimization for maximum performance
  • Smart Memory Management: Efficient model caching and memory optimization
  • Error Recovery: Robust error handling with graceful degradation
  • Cross-Platform Support: Windows, Linux, macOS with unified API

Recommendation: Use HITNet for real-time applications, RAFT-Stereo for maximum accuracy, or CREStereo for balanced performance.

⚡ Live Stereo Processing

Real-time Live 3D Performance GPU Accelerated

Real-time stereo vision processing with live disparity mapping and 3D point cloud generation:

🎯 Features

  • Real-time Disparity Maps: Live computation during webcam capture
  • 3D Point Cloud Generation: Instant 3D reconstruction with color mapping
  • Performance Monitoring: Live FPS tracking and queue management
  • GPU Acceleration: Automatic CUDA/HIP acceleration with CPU fallback
  • Interactive Parameters: Real-time adjustment of processing parameters
  • Quality Indicators: Live feedback on processing quality and performance

🎮 Usage

  1. Complete Calibration: Ensure cameras are calibrated (manual or AI)
  2. Start Live Processing: Process → Toggle Live Processing (Ctrl+Shift+P)
  3. View Live Results: Switch to "Live Processing" tab for real-time view
  4. Monitor Performance: Watch FPS and quality metrics in status bar
  5. Adjust Parameters: Use parameter panel for real-time fine-tuning

📷 Webcam Capture Integration

Live Capture Device Selection Stereo Sync Real-time Preview

The application now supports direct webcam capture for real-time stereo vision processing:

🎯 Features

  • Camera Device Detection: Automatically detect available USB and built-in cameras
  • Dual Camera Setup: Configure separate left and right camera devices
  • Single Camera Mode: Use one camera for manual stereo capture (move camera between shots)
  • Live Preview: Real-time preview from both cameras simultaneously
  • Synchronized Capture: Capture perfectly timed stereo image pairs
  • Device Testing: Test camera connections before starting capture
  • Flexible Configuration: Support for different camera resolutions and frame rates
  • Robust Error Handling: Clear feedback on connection issues and permissions

🎮 Usage

  1. Select Cameras: Use File → Select Cameras... to configure camera devices

    • Choose different cameras for left and right channels for true stereo
    • OR choose the same camera for both to enable single camera manual stereo mode
    • Test camera connections with live preview
    • Configure camera parameters if needed
  2. Start Live Capture: Use File → Start Webcam Capture (Ctrl+Shift+S)

    • Live preview appears in the image display tabs
    • Dual camera mode: Both cameras stream at ~30 FPS
    • Single camera mode: Same feed shows in both panels for manual positioning
    • Real-time feedback on capture status
  3. Capture Images: Multiple capture options available

    • Capture Left Image (L key): Save current camera frame as left image
    • Capture Right Image (R key): Save current camera frame as right image
    • Capture Stereo Pair (Space key): Save synchronized stereo pair
    • Single Camera: Move camera between left/right captures for stereo pairs
  4. Stop Capture: Use File → Stop Webcam Capture (Ctrl+Shift+T)

⌨️ Keyboard Shortcuts

  • Ctrl+Shift+C: Open camera selector dialog
  • Ctrl+Shift+S: Start webcam capture
  • Ctrl+Shift+T: Stop webcam capture
  • L: Capture left image (during capture)
  • R: Capture right image (during capture)
  • Space: Capture synchronized stereo pair

🔧 Technical Details

  • Supported Formats: PNG, JPEG, BMP, TIFF for captured images
  • Frame Rate: Up to 30 FPS live preview (hardware dependent)
  • Resolution: Automatic detection of optimal camera resolution
  • Synchronization: Frame-level synchronization for stereo pairs
  • File Naming: Automatic timestamped file naming for captured images

🖱️ Mouse Controls

  • Left Mouse + Drag: Rotate view around the point cloud
  • Right Mouse + Drag: Pan the camera view
  • Mouse Wheel: Zoom in/out with smooth scaling
  • Double Click: Reset view to default position

⌨️ Keyboard Shortcuts

  • R: Reset view to default position
  • 1: Front view
  • 2: Side view
  • 3: Top view
  • A: Toggle auto-rotation animation
  • G: Toggle grid display
  • X: Toggle coordinate axes

🔧 Advanced Features

🔇 Noise Suppression

  • Statistical Outlier Removal: Removes noisy points based on statistical analysis
  • Voxel Grid Filtering: Downsamples point cloud to reduce noise and improve performance
  • Radius Outlier Removal: Removes isolated points based on neighborhood density
  • Real-time Preview: See filtering effects immediately
  • Adjustable Parameters: Fine-tune filtering strength

🎨 Visualization Modes

  • RGB Color Mode: Display original colors from stereo cameras
  • Depth Color Mode: Color-code points by distance (blue=near, red=far)
  • Height Color Mode: Color-code points by Y-coordinate
  • Intensity Mode: Grayscale visualization based on brightness

🚀 Performance Options

  • Quality Levels: Fast/Medium/High rendering quality
  • Smooth Shading: Enhanced visual quality with lighting
  • Adaptive Point Size: Automatically adjust point size based on distance
  • Level-of-Detail: Optimize rendering for large point clouds

📊 Real-time Statistics

  • Point Count: Total number of points in cloud
  • Depth Range: Minimum and maximum depth values
  • Noise Level: Percentage of potentially noisy points
  • Bounding Box: 3D dimensions of the point cloud
  • Memory Usage: Real-time memory consumption

💾 Export Options

  • PLY Format: Binary and ASCII variants
  • PCD Format: Point Cloud Data format
  • XYZ Format: Simple coordinate format
  • Image Export: Save current view as image
  • Video Recording: Capture rotating animations

🎯 Use Cases

  • 3D Reconstruction: Build detailed 3D models from stereo images
  • Robotics: Navigation and obstacle detection
  • AR/VR: Content creation for immersive experiences
  • Research: Academic and industrial computer vision projects
  • Quality Control: Dimensional analysis and inspection

GPU Support

NVIDIA AMD CPU

This project supports both NVIDIA and AMD GPUs:

  • NVIDIA GPUs: Uses CUDA for acceleration
  • AMD GPUs: Uses ROCm/HIP for acceleration
  • CPU Fallback: Automatic fallback to CPU-only mode if no GPU is detected

Technology Stack

C++17 CMake OpenCV Qt6 PCL VTK ONNX Runtime TensorRT

Boost Eigen OpenGL GLFW spdlog Modern C++

Performance Characteristics

FPS Latency Memory GPU Speedup Point Cloud Accuracy

Dependencies

Ubuntu Windows macOS

Required Libraries

  • OpenCV (>= 4.5): Computer vision and image processing with CUDA/OpenCL support
  • PCL (Point Cloud Library >= 1.12): Point cloud processing and visualization
  • Qt6 (>= 6.0): GUI framework with modern Windows 11 styling
  • VTK (>= 9.0): Visualization toolkit (dependency of PCL)
  • CMake (>= 3.18): Build system with AI/ML integration support

AI/ML Runtime Libraries

  • ONNX Runtime (>= 1.15): Neural network inference engine for stereo matching
  • TensorRT (>= 8.5, Optional): NVIDIA GPU acceleration for neural models
  • OpenCV DNN (>= 4.5): Deep neural network support for enhanced stereo vision

GPU Runtime (Optional)

  • NVIDIA: CUDA Toolkit (>= 11.0) with TensorRT for optimal neural model performance
  • AMD: ROCm (>= 5.0) with HIP support for GPU acceleration

Installation

📋 Complete setup instructions for Ubuntu, Windows, and macOS are available in docs/SETUP_REQUIREMENTS.md

Quick Setup (Ubuntu/Debian)

For NVIDIA GPUs:

# Run the main setup script (auto-detects NVIDIA)
./setup_dev_environment.sh

For AMD GPUs:

# First run basic setup
./setup_dev_environment.sh

# Then run AMD-specific setup
./setup_amd_gpu.sh

Manual Installation (Ubuntu):

# Install OpenCV
sudo apt update
sudo apt install libopencv-dev

# Install PCL and VTK
sudo apt install libpcl-dev libvtk9-dev

# Install Qt6
sudo apt install qt6-base-dev qt6-opengl-dev qt6-opengl-widgets-dev

# Install additional dependencies
sudo apt install libboost-all-dev libeigen3-dev libglew-dev

# For NVIDIA: Install CUDA (follow NVIDIA's official guide)
# For AMD: Install ROCm (see setup_amd_gpu.sh)

Building

Build System Build Status Cross Compile

Quick Build (Auto-detection)

# Auto-detects GPU and builds accordingly
./build.sh

Build Scripts Available

  • ./run.sh - Build and run with GUI (default, at project root)
  • ./build_scripts/build.sh - Build only
  • ./build_scripts/build_amd.sh - AMD/HIP specific build
  • ./build_scripts/build_debug.sh - Debug build with symbols

Manual Build with GPU Backend Selection

For NVIDIA GPUs with AI/ML:

mkdir build && cd build
cmake .. -DUSE_CUDA=ON -DUSE_HIP=OFF -DWITH_ONNX=ON -DWITH_TENSORRT=ON
make -j$(nproc)

For AMD GPUs with AI/ML:

mkdir build && cd build
cmake .. -DUSE_CUDA=OFF -DUSE_HIP=ON -DWITH_ONNX=ON -DWITH_TENSORRT=OFF
make -j$(nproc)

CPU-only with Neural Networks:

mkdir build && cd build
cmake .. -DUSE_CUDA=OFF -DUSE_HIP=OFF -DWITH_ONNX=ON -DWITH_TENSORRT=OFF
make -j$(nproc)

Usage

Running the Application

./stereo_vision_app

Camera Calibration

  1. Print a checkerboard pattern (9x6 recommended)
  2. Capture calibration images with both cameras
  3. Use the calibration tool to compute camera parameters

Point Cloud Generation

  1. Load calibrated camera parameters
  2. Capture or load stereo image pairs
  3. Adjust stereo matching parameters
  4. Generate and export point cloud

📁 Project Structure

Organization Documentation Tests Scripts

computer-vision/                # 🎯 Clean, modern project structure with AI/ML integration
├── 📁 src/                     # Source code
│   ├── core/                   # Core algorithms (stereo, calibration)
│   ├── ai/                     # Neural network implementations (Enhanced Neural Matcher)
│   ├── gui/                    # Qt GUI components with modern Windows 11 styling
│   ├── gpu/                    # GPU acceleration (CUDA/HIP)
│   ├── multicam/               # Multi-camera system
│   └── utils/                  # Utility functions
├── 📁 include/                 # Header files (mirrors src/)
│   ├── ai/                     # Neural stereo matching (enhanced_neural_matcher.hpp)
│   ├── gui/                    # GUI component headers
│   ├── multicam/               # Multi-camera headers
│   └── benchmark/              # Performance benchmarking
├── 📁 tests/                   # Unit and integration tests
├── 📁 test_programs/           # 🧪 Standalone test utilities
│   └── README.md               # Test program guide
├── 📁 documentation/           # 📖 Organized project documentation
│   ├── features/               # Feature implementation docs
│   ├── build/                  # Build system documentation
│   └── setup/                  # Environment setup guides
├── 📁 build_scripts/           # ⚙️ Build and utility scripts
│   ├── build*.sh               # Various build configurations
│   ├── setup*.sh               # Environment setup scripts
│   └── README.md               # Script documentation
├── 📁 reports/                 # 📊 Generated reports and benchmarks
│   └── benchmarks/             # Performance benchmark results
├── 📁 archive/                 # � Historical documentation and temp files
│   ├── milestone_docs/         # Completed milestone documentation
│   └── temp_tests/             # Completed Priority 2 test implementations
├── 📁 data/                    # Sample data and calibration files
├── 📁 docs/                    # Technical documentation
├── 📁 logs/                    # 📋 Build and runtime logs
├── 📁 scripts/                 # Utility scripts
├── 📁 cmake/                   # CMake modules
├── 📄 CMakeLists.txt           # Build configuration
├── 📄 README.md                # This file (modern, comprehensive)
├── 📄 PROJECT_MODERNIZATION_STRATEGY.md # Modernization roadmap
└── 🚀 run.sh                   # Main build and run script

📂 Quick Navigation

🏆 Latest Achievements

✅ Latest AI/ML Enhancements (Just Completed)

  • 🧠 Enhanced Neural Matcher - Advanced AI stereo matching with multiple model support
  • 🚀 ONNX Runtime Integration - Real neural network inference replacing placeholder implementations
  • 🎯 Multiple Model Support - HITNet, RAFT-Stereo, CREStereo with adaptive selection
  • ⚡ TensorRT Optimization - Optional GPU acceleration for maximum performance
  • 🔧 Smart Model Management - Automatic model loading, validation, and fallback handling

🎯 AI/ML Technical Achievements

  • Enhanced Neural Matcher: Real ONNX Runtime integration with production-ready inference
  • Model Architecture Support: HITNet (high-speed), RAFT-Stereo (accuracy), CREStereo (balanced)
  • Adaptive Backend: TensorRT optimization with CPU/GPU fallback handling
  • Professional API: Clean C++ interface with comprehensive error handling and logging

✅ Priority 2 Features Complete (Previously Completed)

  • Neural Network Stereo Matching - TensorRT/ONNX backends with adaptive optimization
  • Multi-Camera Support - Synchronized capture and real-time processing
  • Professional Installers - Cross-platform packaging framework
  • Enhanced Performance Benchmarking - Comprehensive testing with HTML/CSV reports

See archive/milestone_docs/PRIORITY2_COMPLETE.md for full details.

📊 Performance Highlights

  • Enhanced Neural Models: Real-time inference with ONNX Runtime optimization
  • Neural Networks: 274 FPS (StereoNet), 268 FPS (PSMNet)
  • Multi-Camera: 473 FPS (2 cameras), 236 FPS (4 cameras)
  • Latest Reports: Available in reports/benchmarks/

About

A high-performance C++ application for generating 3D point clouds from stereo camera images using GPU acceleration (CUDA for NVIDIA or HIP for AMD GPUs).

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published