Getting started with CUDA in MATLAB and running into errors

Simple code example:
%% Test bench: adding two complex vectors
npts = 1e6;
a = complex(rand(npts,1,'single'), rand(npts,1,'single'));
b = complex(rand(npts,1,'single'), rand(npts,1,'single'));
c = addComplexVectorsCUDA(a, b);
function c = addComplexVectorsCUDA(a, b)
% addComplexVectorsCUDA Add two complex vectors on the GPU using a custom CUDA kernel.
%
% c = addComplexVectorsCUDA(a, b) returns the element-wise sum of complex vectors a and b.
% The vectors a and b must be of the same size.
%
% This function splits the complex vectors into their real and imaginary parts,
% transfers them to the GPU, launches the CUDA kernel, and then gathers the result.
%
% Note: This example assumes the CUDA kernel has been compiled to
% 'addComplexVectors.ptx' and the source is 'addComplexVectors.cu'.
% Check input sizes
if numel(a) ~= numel(b)
    error('Vectors a and b must be the same size.');
end
% Convert inputs to single precision (required for the kernel)
a = single(a);
b = single(b);
N = int32(numel(a)); % number of elements
% Split the complex vectors into real and imaginary parts
aReal = real(a);
aImag = imag(a);
bReal = real(b);
bImag = imag(b);
% Transfer data to the GPU
aRealGPU = gpuArray(aReal);
aImagGPU = gpuArray(aImag);
bRealGPU = gpuArray(bReal);
bImagGPU = gpuArray(bImag);
% Prepare output arrays on the GPU
cRealGPU = gpuArray.zeros(size(aReal), 'single');
cImagGPU = gpuArray.zeros(size(aImag), 'single');
% Load the CUDA kernel. Ensure the PTX and CU file names are correct.
kernel = parallel.gpu.CUDAKernel('addComplexVectors.ptx', 'addComplexVectors.cu', 'addComplexVectors');
% Set thread block size and grid size
threadsPerBlock = 256;
gridSize = ceil(double(N) / threadsPerBlock);
kernel.ThreadBlockSize = threadsPerBlock;
kernel.GridSize = gridSize;
% Launch the kernel. Use multiple outputs to receive both real and imaginary parts.
[cRealGPU, cImagGPU] = feval(kernel, aRealGPU, aImagGPU, bRealGPU, bImagGPU, cRealGPU, cImagGPU, N);
% Retrieve the results from the GPU
cReal = gather(cRealGPU);
cImag = gather(cImagGPU);
% Combine the real and imaginary parts to form the complex result
c = complex(cReal, cImag);
end
Then the .cu code:
// addComplexVectors.cu
extern "C" __global__ void addComplexVectors(const float* aReal, const float* aImag,
                                             const float* bReal, const float* bImag,
                                             float* cReal, float* cImag, int N)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < N) {
        // Add the real parts and the imaginary parts separately
        cReal[tid] = aReal[tid] + bReal[tid];
        cImag[tid] = aImag[tid] + bImag[tid];
    }
}
and the error:
Error using parallel.internal.gpu.handleKernelArgs
The PTX entry point name has not parsed to an equivalent C name.
Error in parallel.internal.gpu.handleKernelArgs
Error in addComplexVectorsCUDA (line 41)
kernel = parallel.gpu.CUDAKernel('addComplexVectors.ptx', 'addComplexVectors.cu', 'addComplexVectors');
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in GPUCUDATestBenchVectorAdd (line 7)
c = addComplexVectorsCUDA(a, b);
^^^^^^^^^^^^^^^^^^^^^^^^^^^
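
Since the error concerns the entry point name stored in the PTX file, one way to narrow it down is to list the names the file actually contains. A minimal sketch, assuming addComplexVectors.ptx is in the current folder; the variable names ptxText, tokens and entryNames are illustrative only:

```matlab
% List the kernel entry points recorded in the PTX file.
% Assumption: addComplexVectors.ptx is in the current folder.
ptxText = fileread('addComplexVectors.ptx');

% Each kernel appears in PTX as ".entry <name>(", usually preceded by ".visible".
tokens = regexp(ptxText, '\.entry\s+(\w+)', 'tokens');

% regexp returns one nested cell per match; unwrap them into a flat cell array.
entryNames = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);

disp(entryNames);
```

With the extern "C" declaration shown above, the expected entry is the plain name addComplexVectors. A name beginning with _Z would indicate C++ name mangling, which may mean the PTX file was built from an earlier version of the source and needs recompiling.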

Answers (1)

Joss Knight on 18 Apr 2025

I'm not sure about this one. Since your code only has one kernel, try removing the third argument to CUDAKernel. Also try recompiling your kernel using mexcuda -ptx addComplexVectors.cu.
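
The two suggestions above could be tried together roughly as follows. This is a sketch rather than a confirmed fix, and it assumes a MATLAB release whose mexcuda supports the -ptx option and a source file in the current folder:

```matlab
% Recompile the kernel source to PTX.
% Assumption: this release of mexcuda accepts -ptx and the .cu file is in the current folder.
mexcuda -ptx addComplexVectors.cu

% The file defines a single kernel, so the third (entry point) argument is omitted.
kernel = parallel.gpu.CUDAKernel('addComplexVectors.ptx', 'addComplexVectors.cu');

% Inspect what MATLAB parsed from the PTX and the prototype before launching.
disp(kernel.EntryPoint);
disp(kernel.ArgumentTypes);
```

If the kernel object is created without error, the rest of addComplexVectorsCUDA (setting ThreadBlockSize and GridSize, then calling feval) should not need to change.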

Release: R2024b
