I'm not sure about this one. Since your code only has one kernel, try removing the third argument to CUDAKernel. Also try recompiling your kernel using mexcuda -ptx addComplexVectors.cu.
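A minimal sketch of that suggestion, assuming `addComplexVectors.cu` is in the current folder, a supported CUDA compiler is configured for `mexcuda`, and the MATLAB release is recent enough for `mexcuda` to accept the `-ptx` flag. Whether this resolves the error is not confirmed here.

```matlab
% Recompile the kernel source to PTX from within MATLAB, so the PTX is
% produced by the toolchain that MATLAB itself is set up to use.
mexcuda -ptx addComplexVectors.cu

% Load the kernel with only the PTX and CU file names. The .cu file holds
% a single __global__ function, so no entry-point name is passed.
kernel = parallel.gpu.CUDAKernel('addComplexVectors.ptx', 'addComplexVectors.cu');

% Inspect what MATLAB parsed from the two files.
disp(kernel.EntryPoint)
disp(kernel.ArgumentTypes)
```

If the kernel loads, `EntryPoint` shows the name MATLAB matched in the PTX file, and `ArgumentTypes` lists the seven arguments it expects at `feval` time.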
Getting started with CUDA in MATLAB and running into errors
Simple code example:
%% Test bench: adding two complex vectors
npts = 1e6;
a = complex(rand(npts,1,'single'), rand(npts,1,'single'));
b = complex(rand(npts,1,'single'), rand(npts,1,'single'));
c = addComplexVectorsCUDA(a, b);
function c = addComplexVectorsCUDA(a, b)
% addComplexVectorsCUDA Add two complex vectors on the GPU using a custom CUDA kernel.
%
% c = addComplexVectorsCUDA(a, b) returns the element-wise sum of complex vectors a and b.
% The vectors a and b must be of the same size.
%
% This function splits the complex vectors into their real and imaginary parts,
% transfers them to the GPU, launches the CUDA kernel, and then gathers the result.
%
% Note: This example assumes the CUDA kernel has been compiled to
% 'addComplexVectors.ptx' and the source is 'addComplexVectors.cu'.
% Check input sizes
if numel(a) ~= numel(b)
    error('Vectors a and b must be the same size.');
end
% Convert inputs to single precision (required for the kernel)
a = single(a);
b = single(b);
N = int32(numel(a)); % number of elements
% Split the complex vectors into real and imaginary parts
aReal = real(a);
aImag = imag(a);
bReal = real(b);
bImag = imag(b);
% Transfer data to the GPU
aRealGPU = gpuArray(aReal);
aImagGPU = gpuArray(aImag);
bRealGPU = gpuArray(bReal);
bImagGPU = gpuArray(bImag);
% Prepare output arrays on the GPU
cRealGPU = zeros(size(aReal), 'single', 'gpuArray');
cImagGPU = zeros(size(aImag), 'single', 'gpuArray');
% Load the CUDA kernel. Ensure the PTX and CU file names are correct.
kernel = parallel.gpu.CUDAKernel('addComplexVectors.ptx', 'addComplexVectors.cu', 'addComplexVectors');
% Set thread block size and grid size
threadsPerBlock = 256;
gridSize = ceil(double(N) / threadsPerBlock);
kernel.ThreadBlockSize = threadsPerBlock;
kernel.GridSize = gridSize;
% Launch the kernel. Use multiple outputs to receive both real and imaginary parts.
[cRealGPU, cImagGPU] = feval(kernel, aRealGPU, aImagGPU, bRealGPU, bImagGPU, cRealGPU, cImagGPU, N);
% Retrieve the results from the GPU
cReal = gather(cRealGPU);
cImag = gather(cImagGPU);
% Combine the real and imaginary parts to form the complex result
c = complex(cReal, cImag);
end
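Once the kernel loads, the test bench could check the result against MATLAB's built-in complex addition on the CPU. This is a sketch only: the tolerance of a few `eps('single')` is an assumption chosen for illustration, and `addComplexVectorsCUDA` is the function from the question.

```matlab
% Build the same kind of inputs as the test bench.
npts = 1e6;
a = complex(rand(npts,1,'single'), rand(npts,1,'single'));
b = complex(rand(npts,1,'single'), rand(npts,1,'single'));

% CPU reference using MATLAB's built-in complex addition.
cRef = a + b;

% Result from the custom kernel wrapper defined in the question.
c = addComplexVectorsCUDA(a, b);

% Compare element-wise; the tolerance is an illustrative assumption.
maxErr = max(abs(c - cRef));
fprintf('Maximum absolute error: %g\n', maxErr);
assert(maxErr <= 4*eps('single'), 'GPU result differs from the CPU reference.');
```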
Then the .cu code:
// addComplexVectors.cu
extern "C" __global__ void addComplexVectors(const float* aReal, const float* aImag,
                                             const float* bReal, const float* bImag,
                                             float* cReal, float* cImag, int N)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < N) {
        // Add the real parts and the imaginary parts separately
        cReal[tid] = aReal[tid] + bReal[tid];
        cImag[tid] = aImag[tid] + bImag[tid];
    }
}
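The `tid < N` guard matters because the grid is rounded up to whole blocks. A short sketch of the arithmetic, using the sizes from the question (1e6 elements, 256 threads per block):

```matlab
N = 1e6;                 % number of elements, as in the test bench
threadsPerBlock = 256;   % block size used in addComplexVectorsCUDA

gridSize     = ceil(N / threadsPerBlock);   % blocks needed to cover all N elements
totalThreads = gridSize * threadsPerBlock;  % threads actually launched
idleThreads  = totalThreads - N;            % threads that fail the tid < N test

fprintf('%d blocks, %d threads, %d idle\n', gridSize, totalThreads, idleThreads);
% prints: 3907 blocks, 1000192 threads, 192 idle
```

Without the guard, those 192 extra threads in the last block would read and write past the end of the arrays.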
and the error:
Error using parallel.internal.gpu.handleKernelArgs
The PTX entry point name has not parsed to an equivalent C name.
Error in parallel.internal.gpu.handleKernelArgs
Error in addComplexVectorsCUDA (line 41)
kernel = parallel.gpu.CUDAKernel('addComplexVectors.ptx', 'addComplexVectors.cu', 'addComplexVectors');
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in GPUCUDATestBenchVectorAdd (line 7)
c = addComplexVectorsCUDA(a, b);
^^^^^^^^^^^^^^^^^^^^^^^^^^^
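Since the message concerns the PTX entry point name, one way to see which names the PTX file actually contains is to scan it for `.entry` directives. This is a diagnostic sketch only, assuming `addComplexVectors.ptx` is in the current folder; it does not by itself explain the error.

```matlab
% Read the PTX file as plain text (PTX is a human-readable assembly format).
ptxFile = 'addComplexVectors.ptx';
ptxText = fileread(ptxFile);

% Each kernel appears in the PTX as a ".entry <name>(" directive.
tokens = regexp(ptxText, '\.entry\s+(\w+)', 'tokens');
entryNames = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);

% List the entry point names that were found.
disp(entryNames)
```

With `extern "C"` on the kernel, the expected name is the plain `addComplexVectors`; a name beginning with `_Z` would indicate C++ name mangling.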