Overview

The code in src/main.cu is a contrived data processing pipeline built to showcase NVIDIA's profiling tools and a couple of common pitfalls for newcomers to CUDA programming. It contains four implementations of the same pipeline. All four are functionally identical and differ only in runtime performance. The implementations are:

Baseline
Launch pipeline with more than one thread block
Process pipeline with CUDA streams to prevent unnecessary blocking
Use coalesced memory access (relevant blog post)
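
The three optimizations listed above can be illustrated with a minimal, self-contained sketch. It is not taken from src/main.cu: the kernel `scale`, the helper `scale_on_gpu_matches_cpu`, and the split into two chunks are hypothetical names and choices made only for illustration. The kernel is launched with many thread blocks, adjacent threads read adjacent elements so that accesses can coalesce, and each chunk is copied and processed in its own CUDA stream so that work on one chunk need not wait for the other.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical pipeline stage: multiply every element by k.
// Thread i starts at element i, so neighbouring threads touch neighbouring
// addresses (coalesced). The grid-stride loop lets any grid size cover n.
__global__ void scale(const float* in, float* out, int n, float k) {
    const int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        out[i] = k * in[i];
    }
}

// Processes n floats in two chunks, one stream per chunk, and returns true
// if the GPU result matches a CPU reference. Illustrative helper only.
bool scale_on_gpu_matches_cpu(int n) {
    const float k = 2.0f;
    const int threads = 256;
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *h_in = nullptr, *h_out = nullptr, *d_in = nullptr, *d_out = nullptr;
    cudaStream_t streams[2];
    int created = 0;  // number of streams successfully created

    // Pinned host buffers allow cudaMemcpyAsync to run asynchronously.
    bool ok = cudaMallocHost(&h_in, bytes) == cudaSuccess &&
              cudaMallocHost(&h_out, bytes) == cudaSuccess &&
              cudaMalloc(&d_in, bytes) == cudaSuccess &&
              cudaMalloc(&d_out, bytes) == cudaSuccess;
    while (ok && created < 2) {
        ok = cudaStreamCreate(&streams[created]) == cudaSuccess;
        if (ok) ++created;
    }

    if (ok) {
        for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);
        const int half = n / 2;
        for (int s = 0; s < 2; ++s) {
            const int offset = s * half;
            const int count = (s == 0) ? half : n - half;
            if (count == 0) continue;
            // More than one block: enough blocks to give each element a thread.
            const int blocks = (count + threads - 1) / threads;
            cudaMemcpyAsync(d_in + offset, h_in + offset, count * sizeof(float),
                            cudaMemcpyHostToDevice, streams[s]);
            scale<<<blocks, threads, 0, streams[s]>>>(d_in + offset,
                                                      d_out + offset, count, k);
            cudaMemcpyAsync(h_out + offset, d_out + offset, count * sizeof(float),
                            cudaMemcpyDeviceToHost, streams[s]);
        }
        // Surface launch errors, then wait for both streams to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaDeviceSynchronize() == cudaSuccess;
        for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == k * h_in[i]);
    }

    for (int s = 0; s < created; ++s) cudaStreamDestroy(streams[s]);
    if (d_out) cudaFree(d_out);
    if (d_in) cudaFree(d_in);
    if (h_out) cudaFreeHost(h_out);
    if (h_in) cudaFreeHost(h_in);
    return ok;
}

int main() {
    const bool ok = scale_on_gpu_matches_cpu(1 << 20);
    std::printf("GPU result %s the CPU reference\n", ok ? "matches" : "does not match");
    return ok ? 0 : 1;
}
```

Assuming the sketch is saved as a standalone .cu file, it can be compiled with nvcc and requires a CUDA-capable GPU to run. Launching the same kernel with `<<<1, threads>>>` on the default stream would still produce correct results because of the grid-stride loop, which makes it a convenient way to compare a single-block launch against a multi-block one in a profiler.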

A graphical representation of the contrived pipeline:

Building and running

Creating and running using Docker

A Dockerfile and bash scripts for building and running the Docker container are located in the deploy directory. To build and run the container, use:

./deploy/build-docker.sh
./deploy/run-docker.sh

Building the executable

The application is built with CMake:

mkdir build && cd build
cmake ..
make

Profiling tools

Run the application with ./nsight-demo.

To generate an Nsight Systems report (.nsys-rep) and Nsight Compute reports (.ncu-rep), run ./deploy/profile.sh. The reports are saved in the ./nsys-reports directory.
