Parallelization of a stereo-matching filter using OpenMP

Sachatms/PPEM-project

PPEM Stereo Matching - x86 & C6678 DSP Parallelization

Parallel implementation of a 9-stage stereo matching algorithm using OpenMP on x86 and TI C6678 DSP platforms.

Course: PPEM (Parallel Programming for Embedded Multicore)
Institution: INSA Rennes
Team Size: 3 members
Duration: 3 weeks - 10 hours
Target Platforms: x86 (Windows/Linux), TI C6678 8-core DSP

Project Overview

This project parallelizes a depth-from-stereo algorithm that processes stereo camera images to generate depth maps. The algorithm executes 9 sequential stages per frame, with parallelization opportunities in the computationally intensive stages.

Algorithm Pipeline

YUV Read → YUV2RGB → RGB2Gray → Census → Cost Construction (60 disparities)
→ Aggregate Costs → Disparity Select → Median Filter → MD5 Validation → Display
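As an illustration of why cost construction parallelizes well, here is a minimal sketch, not the project's actual code. It assumes 8-bit census signatures, a Hamming-distance matching cost, and a `[disparity][row][col]` cost layout; the names `hamming8` and `build_costs` are hypothetical. Only the disparity count of 60 comes from the pipeline above.

```c
#include <stddef.h>
#include <stdint.h>

#define N_DISPARITIES 60 /* disparity count from the pipeline description */

/* Count the bits that differ between two 8-bit census signatures. */
static unsigned hamming8(uint8_t a, uint8_t b)
{
    uint8_t x = (uint8_t)(a ^ b);
    unsigned count = 0;
    while (x) {
        count += x & 1u;
        x >>= 1;
    }
    return count;
}

/*
 * Hypothetical cost construction: for each disparity d, compare the left
 * signature at (row, col) with the right signature at (row, col - d).
 * Pixels with col < d have no match candidate and get the maximum cost (8).
 */
void build_costs(const uint8_t *census_left, const uint8_t *census_right,
                 uint8_t *cost, int width, int height)
{
    int d;
    /* Each iteration writes only its own disparity plane, so threads
       never write to the same memory. */
    #pragma omp parallel for schedule(static)
    for (d = 0; d < N_DISPARITIES; d++) {
        uint8_t *plane = cost + (size_t)d * width * height;
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                size_t i = (size_t)row * width + col;
                plane[i] = (col >= d)
                    ? (uint8_t)hamming8(census_left[i], census_right[i - d])
                    : 8;
            }
        }
    }
}
```

Because the disparity planes are independent, the loop needs no locks or reductions. Without OpenMP enabled, the pragma is ignored and the same code runs sequentially.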

Objectives

  1. Parallelize on x86 using OpenMP (≥2x speedup)
  2. Port to C6678 DSP and parallelize (≥2x speedup)
  3. Cross-platform validation (identical MD5 hashes)
  4. Professional report (≤5 pages, English)
  5. Live demonstration on both platforms

Quick Start

Prerequisites

For x86 development:

  • Windows: Visual Studio 2022, CMake ≥3.15
  • Linux: GCC/Clang with OpenMP, CMake, SDL2 development libraries

For C6678 development:

  • Code Composer Studio (CCS) v12.x
  • TI C6000 compiler tools
  • C6678 hardware access (EVM or lab setup)

Build Instructions

x64 (Windows)

# Generate Visual Studio solution
cmake -G "Visual Studio 17 2022" -B bin

# Build
cmake --build bin --config Release

# Run
cd bin/Release
./stereo.exe

x86 (Linux)

# Generate Makefiles
cmake -B bin

# Build
cmake --build bin

# Run
cd bin
./stereo

C6678 DSP

See docs/issues/epic_04_c6678_sequential/ for detailed CCS project setup instructions.

Project Structure

PPEM-project/
├── src/                    # Algorithm implementation (9 stages)
│   ├── main.c             # Main loop, orchestrates pipeline
│   ├── costConstruction.c # Hotspot #1 for parallelization
│   ├── aggregateCost.c    # Hotspot #2 for parallelization
│   └── ...                # Other algorithm stages
├── include/               # Header files
│   ├── params.h          # Image dimensions, disparity range
│   └── ...               # Function declarations
├── dat/                   # Test stereo image datasets
├── lib/                   # SDL2 libraries (x86 only)
├── bin/                   # Build outputs, generated by CMake
├── CMakeLists.txt        # x86 build configuration
└── README.md             # This file

Development Workflow

1. Pick an Issue

Browse issues in docs/issues/ organized by epic:

  • Epic 1: Foundation (setup, validation)
  • Epic 2: x86 Baseline (profiling, documentation)
  • Epic 3: x86 Parallel (OpenMP implementation)
  • Epic 4: C6678 Sequential (DSP porting)
  • Epic 5: C6678 Parallel (DSP parallelization)
  • Epic 6: Validation & Report (cross-platform, demo, report)

2. Create a Branch

git checkout main
git pull origin main
git checkout -b epic-XX/issue-YY-description

3. Implement & Test

  • Follow subtasks in the issue file (docs/issues/epic_XX_name/issue_YY.md)
  • Test locally: build, run, validate MD5
  • Commit frequently with clear messages

4. Submit Pull Request

  • Use PR template (.github/PULL_REQUEST_TEMPLATE.md)
  • Request code review from ≥1 team member
  • Address feedback, then merge

5. Close Issue & Update Milestone

Key Technologies

  • OpenMP: Parallel programming API (thread-level parallelism)
  • CMake: Cross-platform build system generator
  • SDL2: Simple DirectMedia Layer (visualization on x86)
  • TI C6678: 8-core DSP (1.25 GHz C66x cores, 512 KB local L2 per core, 4 MB in total)
  • Code Composer Studio: TI's IDE for embedded development

Performance Targets

Platform          Sequential Time   Parallel Time   Speedup   Target
x86 (8 cores)     Baseline          TBD             ≥2.0x     Required
C6678 (8 cores)   Baseline          TBD             ≥2.0x     Required
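
Speedup here means sequential time divided by parallel time. The sketch below shows one way to measure and check it against the 2.0x target; the helper names `now_seconds`, `speedup`, and `meets_target` are hypothetical and not part of the project. It assumes `omp_get_wtime()` is available when OpenMP is enabled and falls back to `clock()` otherwise.

```c
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
#endif

/* Time in seconds. Without OpenMP this falls back to clock(), which
   reports CPU time rather than wall-clock time. */
double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Speedup = sequential time / parallel time; returns 0 for invalid input. */
double speedup(double t_sequential, double t_parallel)
{
    if (t_sequential <= 0.0 || t_parallel <= 0.0)
        return 0.0;
    return t_sequential / t_parallel;
}

/* Returns 1 if the measured speedup reaches the 2.0x target, else 0. */
int meets_target(double t_sequential, double t_parallel)
{
    return speedup(t_sequential, t_parallel) >= 2.0;
}
```

A typical use would be to call `now_seconds()` before and after processing a fixed number of frames in each build, then pass the two elapsed times to `meets_target()`. Timing several frames rather than one reduces the influence of start-up costs.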

Critical validation: MD5 hashes must match across:

  • Sequential vs. parallel (same platform)
  • x86 vs. C6678 (cross-platform)
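
Both checks come down to comparing two MD5 digests. A minimal sketch of that comparison follows. It assumes the digests are available as 32-character hexadecimal strings, possibly followed by a newline; the function name `md5_hex_equal` is hypothetical, and the way the project actually stores or prints its hashes may differ.

```c
#include <ctype.h>
#include <stddef.h>

#define MD5_HEX_LEN 32 /* an MD5 digest is 16 bytes, i.e. 32 hex digits */

/*
 * Returns 1 if both strings begin with the same 32 hex digits
 * (case-insensitive), 0 otherwise. Strings that are too short or contain
 * non-hex characters within the first 32 positions are rejected.
 */
int md5_hex_equal(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return 0;
    for (size_t i = 0; i < MD5_HEX_LEN; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];
        if (!isxdigit(ca) || !isxdigit(cb))
            return 0; /* too short, or not a hex digest */
        if (tolower(ca) != tolower(cb))
            return 0;
    }
    return 1;
}
```

Ignoring case and trailing characters makes the check tolerant of small formatting differences between platforms, for example a hash printed in upper case by one toolchain and lower case by another.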

Milestones

  • M1 - Foundation Complete (Week 1, Dec 1)

    • Development environments working
    • Sequential code validated with baseline MD5
  • M2 - x86 Parallelization Complete (Week 2-3, Dec 12)

    • x86 parallelized with ≥2x speedup
    • MD5 validation passing
  • M3 - C6678 Sequential Port Complete (Week 2-3, Dec 12)

    • C6678 compiles and runs sequentially
    • MD5 matches x86 baseline
  • M4 - C6678 Parallelization Complete (Week 3-4, Dec 19)

    • C6678 parallelized with ≥2x speedup
    • MD5 validation passing
  • M5 - Project Delivery (Week 4-5, Dec 27)

    • Report complete (≤5 pages)
    • Demo rehearsed and successful
    • Code packaged for submission

See docs/project_milestones.md for detailed timeline.

Testing & Validation

MD5 Checksum Validation

The algorithm computes an MD5 hash of the output depth map for correctness validation.

Baseline MD5: Stored in bin/md5.txt after first sequential run (Issue #4)

Validation rules:

  • Parallel code must produce the same MD5 as the sequential code
  • C6678 must produce the same MD5 as x86
  • Use schedule(static) in OpenMP so that, for a given thread count, iterations are assigned to threads the same way on every run
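
The sketch below illustrates these rules on a simple stage. It is an assumption-laden example, not the project's code: the integer luma weights (77, 150, 29) and all function names are hypothetical. Each iteration writes only its own output pixel and uses integer arithmetic, so the parallel result is byte-identical to the sequential one; `schedule(static)` additionally keeps the iteration-to-thread mapping fixed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical integer luma weights (they sum to 256); not from the project. */
static uint8_t to_gray(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((77u * r + 150u * g + 29u * b) >> 8);
}

/* Sequential reference: interleaved RGB in, one gray byte per pixel out. */
void rgb_to_gray_seq(const uint8_t *rgb, uint8_t *gray, int n_pixels)
{
    for (int i = 0; i < n_pixels; i++)
        gray[i] = to_gray(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
}

/* Parallel version: same arithmetic, iterations split statically. */
void rgb_to_gray_par(const uint8_t *rgb, uint8_t *gray, int n_pixels)
{
    int i;
    /* No shared accumulators: iteration i touches only gray[i]. */
    #pragma omp parallel for schedule(static)
    for (i = 0; i < n_pixels; i++)
        gray[i] = to_gray(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
}

/* Returns 1 if the two buffers are byte-identical, 0 otherwise. */
int outputs_identical(const uint8_t *a, const uint8_t *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```

Comparing buffers with `outputs_identical()` after each stage can help locate the first stage where a parallel build diverges, before the final MD5 is computed.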
