Benchmarking

Add benchmarking to the generated code

Description

App Configuration Pane: GPU Code

Configuration Objects: coder.gpuConfig

The Benchmarking parameter controls whether benchmarking code is added to the generated CUDA® code.

After execution, the generated benchmarking code creates the gpuTimingData comma-separated values (CSV) file in the current working folder. The CSV file contains timing data for kernel, memory, and other events. This table describes the format of the CSV file.

Event Type | Format

CUDA kernels

<name_N>,<block dimension>,<grid dimension>,<execution time in ms>,<name of parent>

N is the nth execution of the kernel. <block dimension> represents the total block dimension. For example, if the block dimension is dim3(32,32,32), then the <block dimension> value is 32768.

CUDA memory copy

<name_N>,<memory copy size>,<execution time in ms>,<IO flag>,<name of parent>

N is the nth execution of the memory copy.

Miscellaneous

<name_N>,<execution time in ms>,<name of parent>

N is the nth execution of the operation.
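
The row formats above can be illustrated with a short Python sketch. It is not part of the generated code. It shows how the total block dimension is derived and how one CUDA kernel row might be split into named fields. The sample row, the kernel name, and the parent name are hypothetical. The sketch also assumes that <grid dimension> is written as a single integer like <block dimension> and that fields contain no embedded commas; neither assumption is stated in the table.

```python
from math import prod


def total_block_dimension(block_dim):
    """Return the total block dimension: the product of the x, y, z sizes."""
    return prod(block_dim)


def parse_kernel_row(row):
    """Split one CUDA kernel row into named fields.

    Follows the documented field order:
    <name_N>,<block dimension>,<grid dimension>,<execution time in ms>,<name of parent>
    Assumes the grid dimension is a single integer (an assumption, see lead-in).
    """
    name, block_dim, grid_dim, time_ms, parent = row.strip().split(",")
    return {
        "name": name,
        "block_dimension": int(block_dim),
        "grid_dimension": int(grid_dim),
        "time_ms": float(time_ms),
        "parent": parent,
    }


# Hypothetical row for illustration only; real names come from the generated code.
sample_row = "myKernel_1,32768,64,0.125,myEntryPoint"
record = parse_kernel_row(sample_row)

# dim3(32,32,32) corresponds to a <block dimension> value of 32 * 32 * 32.
block_total = total_block_dimension((32, 32, 32))
```

Memory copy and miscellaneous rows can be handled the same way by unpacking the fields in the order given for each event type.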

Settings

off (default) | on

Off

Does not generate CUDA code with benchmarking functionality.

On

Generates CUDA code with benchmarking functionality. This option uses CUDA APIs such as cudaEvent to time kernel, memcpy, and other events.

Programmatic Use

Property: Benchmarking
Values: true | false
Default: false

Version History

Introduced in R2018a