Matlab CUDA feval interfacing problem

Hello,
I am having trouble interfacing my CUDA kernel with the calling Matlab process.
I get the following error message when I launch too many threads:
Error using parallel.gpu.CUDADevice/wait
An unexpected error occurred during CUDA execution. The CUDA error was:
an illegal memory access was encountered
Error in diffuseBlackAndScholesDividendNull_CUDA (line 36)
wait(gpuDevice)
The Matlab and CUDA code runs fine with n_row = 1:1000, but fails with n_row = 1:2000.
The only explanation I can think of is that the graphics card is running out of memory, but I am not sure. The error message is not very clear to me.
This is my CUDA graphics card configuration:
>> gpuDevice
ans =
CUDADevice with properties:
Name: 'GeForce GTX 680'
Index: 1
ComputeCapability: '3.0'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 8
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 2.1475e+09
AvailableMemory: 1.3943e+09
MultiprocessorCount: 8
ClockRateKHz: 1124000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
This is my CUDA kernel code, 'diffusionFunctions.cu':
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#define __CUDA_INTERNAL_COMPILATION__
#include "math_functions.h"
#undef __CUDA_INTERNAL_COMPILATION__

#include <stdio.h>
#include <iostream>

__global__ void diffuseBlackAndScholesDividendNull(double * result, const double* S_0, const double* ExpIncAndPerturbationMatrix, const int NTimes)
{
    unsigned long i = blockDim.x * blockIdx.x + threadIdx.x;
    long offset = NTimes * i;
    result[offset] = S_0[i];
    for (int i_t = 0; i_t < NTimes-1; i_t++)
    {
        result[offset + i_t + 1] = result[offset + i_t] * ExpIncAndPerturbationMatrix[offset + i_t + 1];
    }
}
And this is my Matlab test code:
n_t = 3653;
min_div = 0.05;
SpotRef = 2.9285e+03;
n_row = 1:2000;
n_row_max = length(n_row);
item_number = n_row_max * n_t;
ExpIncAndPerturbationMatrix = rand(n_row_max,n_t);
ExpIncAndPerturbationMatrix = double(ExpIncAndPerturbationMatrix(n_row,1:n_t));
S_t = rand(n_row_max,n_t);
S0 = S_t(:, 1);
S_t = S_t(n_row,1:n_t);

S_t_cpu_result_not_flatten = S_t;
tic;
for i_t =2:n_t
    % -- S_t+1
    S_t_cpu_result_not_flatten(:,i_t) = S_t_cpu_result_not_flatten(:,i_t-1) .* ExpIncAndPerturbationMatrix(:,i_t);
end
toc;

k = parallel.gpu.CUDAKernel('diffusionFunctions.ptx', 'diffusionFunctions.cu','diffuseBlackAndScholesDividendNull');
thread_number = n_row_max;
if thread_number > k.MaxThreadsPerBlock, thread_number = k.MaxThreadsPerBlock; end
k.ThreadBlockSize = [thread_number, 1, 1];
k.GridSize = [ceil(n_row_max / k.ThreadBlockSize(1)), 1];

tic;
ExpIncAndPerturbationMatrix_flatten = transpose(ExpIncAndPerturbationMatrix);
ExpIncAndPerturbationMatrix_flatten = ExpIncAndPerturbationMatrix_flatten(1:item_number);
S0 = transpose(S0);
ExpIncAndPerturbationMatrix_flatten_gpu = gpuArray(ExpIncAndPerturbationMatrix_flatten);
S0_gpu = gpuArray(S0);
result = gpuArray.zeros(1,item_number);
result = feval(k, result, S0, ExpIncAndPerturbationMatrix_flatten, n_t);
wait(gpuDevice)
S_t_gpu_result = gather(result);
toc;
S_t_cpu_result_not_flatten = transpose(S_t_cpu_result_not_flatten);
diff = S_t_gpu_result - S_t_cpu_result_not_flatten(1:item_number);
isequal(diff , zeros(size(diff)))

1 Comment

Hi, can you format your code correctly using the Code formatting button so it's possible to see what's going on? Thanks.


Answers (1)

Joss Knight on 15 Jul 2017
Edited: Joss Knight on 15 Jul 2017
You are simply writing past the end of your arrays. Your launch logic says that when you have 2000 rows, you launch two blocks of 1024 threads each, which is 2048 threads in total. So what happens in your kernel when you are in the second block, at thread index 976 or above? The global index i is then 2000 or more, which is beyond the last valid row.
You need to pass n_row_max into your kernel and guard against this illegal access. This is a very common requirement for kernels like yours.
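As a rough illustration of that guard, here is a minimal sketch of how the kernel from the question might look with a bounds check. The extra parameter nRows is an assumption of this sketch (it is not part of the original signature) and is meant to receive n_row_max from the Matlab side. The rest of the kernel body is unchanged.

```cuda
// Sketch only: nRows is a hypothetical extra argument, intended to carry
// n_row_max from the MATLAB caller. It is not in the original kernel.
__global__ void diffuseBlackAndScholesDividendNull(
    double* result,
    const double* S_0,
    const double* ExpIncAndPerturbationMatrix,
    const int NTimes,
    const int nRows)
{
    // Global thread index: one thread handles one row.
    const unsigned long i = blockDim.x * blockIdx.x + threadIdx.x;

    // The grid size is rounded up to a whole number of blocks, so the last
    // block may contain threads with i >= nRows. Those threads must exit
    // before touching memory.
    if (i >= (unsigned long)nRows) {
        return;
    }

    // Each row occupies NTimes consecutive entries in the flattened arrays.
    const unsigned long offset = (unsigned long)NTimes * i;
    result[offset] = S_0[i];
    for (int i_t = 0; i_t < NTimes - 1; i_t++) {
        result[offset + i_t + 1] =
            result[offset + i_t] * ExpIncAndPerturbationMatrix[offset + i_t + 1];
    }
}
```

With this signature, the .ptx file would need to be regenerated, and the Matlab call would presumably gain one trailing argument, along the lines of feval(k, result, S0, ExpIncAndPerturbationMatrix_flatten, n_t, n_row_max). With 2000 rows and blocks of 1024 threads, the 48 surplus threads in the second block then return immediately instead of reading and writing out of bounds.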

