Online appendix to the scientific article "Roofline-aware DVFS for GPUs"

tue-es/rooflineDVFS

Roofline-aware DVFS for GPUs

Date: 16-Oct-2013

Author: Cedric Nugteren (http://www.cedricnugteren.nl)

Description: This repository is an online appendix to the scientific article "Roofline-aware DVFS for GPUs".

Benchmarks

Three types of CUDA benchmarks are tested:

  • Benchmarks from PolyBench/GPU
  • Benchmarks from Parboil (requires Parboil datasets to be installed in ~/software/parboil-2.5/datasets/)
  • Two artificial micro-benchmarks

Experimental setup

GPGPU-Sim version 3.2.1 + GPUWattch

(commit 72aaaf6b11b38121d946469f26d85315ff794f29)

Configuration for GPGPU-Sim

  • Clock frequencies:

    -gpgpu_clock_domains XXX:YYY:XXX:ZZZ
    

    XXX is the halved core frequency (600, 500, 400, or 300), YYY is the full core frequency (1200, 1000, 800, or 600), and ZZZ is the memory frequency (900, 750, 600, or 450), all in MHz.

  • DRAM latencies:

    -dram_latency XXX
    

    XXX is the DRAM latency in core clock cycles. It is reduced when the core frequency is scaled down, so that the latency in seconds stays constant (100-83-76-50).
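The two GPGPU-Sim options above can be derived from a single operating point. The following is a minimal Python sketch, not part of the repository: the helper names `clock_domains` and `scaled_latency` are hypothetical, and the sketch assumes that the latency of 100 cycles belongs to the highest core frequency and scales proportionally with the core clock.

```python
# Operating points taken from the README (MHz).
CORE_FULL_MHZ = [1200, 1000, 800, 600]
MEM_MHZ = [900, 750, 600, 450]

# Assumption: 100 cycles is the DRAM latency at the highest core frequency.
BASE_LATENCY_CYCLES = 100


def clock_domains(core_full_mhz, mem_mhz):
    """Build the option string in the XXX:YYY:XXX:ZZZ pattern."""
    half = core_full_mhz // 2  # XXX is the halved core frequency
    return f"-gpgpu_clock_domains {half}:{core_full_mhz}:{half}:{mem_mhz}"


def scaled_latency(core_full_mhz):
    """Scale the latency in cycles so that the latency in seconds is constant."""
    cycles = BASE_LATENCY_CYCLES * core_full_mhz / CORE_FULL_MHZ[0]
    return round(cycles)


# Example: every core frequency combined with the highest memory frequency.
for core in CORE_FULL_MHZ:
    print(clock_domains(core, MEM_MHZ[0]), f"-dram_latency {scaled_latency(core)}")
```

Strictly proportional scaling yields 100, 83, 67, and 50 cycles. The list above gives 76 for the third point, so the files in the configurations folder remain the reference for the values actually simulated.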

Configuration for GPUWattch

  • Memory configuration:

    <param name="mc_clock" value="XXX"/>
    <param name="peak_transfer_rate" value="YYY"/>
    

    XXX is the doubled memory clock or the halved effective clock (1800-1500-1200-900). YYY is the bandwidth per memory controller (28800-24000-19200-14400).

  • Clock frequencies:

    <param name="target_core_clockrate" value="XXX"/>
    <param name="clockrate" value="XXX"/>
    <param name="NOC_A" value="XXX" />
    

    XXX is the halved or the full core clock frequency, depending on the parameter. These parameters appear in several places in the configuration settings.

  • Memory power parameters:

    <param name="MEM_RD" value="XXX" />
    <param name="MEM_WR" value="YYY" />
    <param name="MEM_PRE" value="ZZZ" />
    

    XXX, YYY, and ZZZ are scaled with the core clock rate to obtain correct memory power characteristics. This has been acknowledged to be a bug in the simulator and will be repaired in the next version.
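The memory configuration values listed above follow a simple pattern, sketched below in Python. The helper name `memory_params` is hypothetical. The factor of 16 between `mc_clock` and `peak_transfer_rate` is only the ratio observed in the listed numbers (28800 / 1800), not a documented GPUWattch constant. The memory power parameters are left out because their baseline values are not given here.

```python
# Memory frequencies from the README (MHz).
MEM_MHZ = [900, 750, 600, 450]

# Assumption: ratio inferred from the listed values, e.g. 28800 / 1800.
TRANSFER_RATE_PER_MC_CLOCK = 16


def memory_params(mem_mhz):
    """Return the two GPUWattch memory <param> lines for one memory frequency."""
    mc_clock = 2 * mem_mhz  # doubled memory clock
    peak = TRANSFER_RATE_PER_MC_CLOCK * mc_clock  # bandwidth per memory controller
    return (
        f'<param name="mc_clock" value="{mc_clock}"/>',
        f'<param name="peak_transfer_rate" value="{peak}"/>',
    )


for mem in MEM_MHZ:
    for line in memory_params(mem):
        print(line)
```

For the four memory frequencies this reproduces the listed pairs 1800/28800, 1500/24000, 1200/19200, and 900/14400.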

Contents of the repository

  • benchmark_code

    Folder containing CUDA source code made suitable for the GPGPU-Sim simulator.

  • configurations

    All the GPGPU-Sim and GPUWattch configuration files.

  • results

    Folder containing the graphs as they appear in the article plus more detailed graphs. It also contains a processed database extracted from simulation data.

  • simulation_data

    The raw simulation output from GPGPU-Sim and GPUWattch.

  • process.r

    An R-script to process the raw simulation data and output a database in CSV format (in results folder).

  • graph.r

    An R-script to generate plots based on the database generated by the process.r script.

  • README

    This file.

###################################################
