
Matrix multiplication bug in GPU

Nikos Pitsianis on 1 Jul 2014
Closed: MATLAB Answer Bot on 20 Aug 2021
I am using MATLAB 8.2.0.701 (R2013b) on a host with 64 AMD cores and 2 K20c GPUs, with driver version 331.62 on Ubuntu 12.04.4 LTS.
$ uname -a
Linux leibniz3 3.5.0-44-generic #67~precise1-Ubuntu SMP Wed Nov 13 16:16:57 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
The matrix multiplication on the GPU returns results that differ substantially from the CPU for matrices of size 2^13x2^13.
To replicate, simply run
clear
n = 2^13;
A = rand(n);
B = rand(n);

% CPU reference result
tic
C = A * B;
t = toc; fprintf('CPU time %f sec\n', t)

%% One GPU
gpuDevice(1); % select and reset device 1
tic
Ag = gpuArray(A);
Bg = gpuArray(B);
C1 = gather(Ag * Bg);
t = toc; fprintf('1 GPU time %f sec\n', t)

%% Two GPUs
gpuDevice(1); % reset device
gpuDevice(2); % reset device
tic
cc = cell(2,1);
parfor i = 1:2
    dev = gpuDevice; % device currently selected on this worker
    % fprintf('Iter %d Device %d\n',i,dev.Index);
    Ag = gpuArray(A);
    Bg = gpuArray(B(:,(i-1)*n/2+1:i*n/2)); % i-th half of the columns of B
    cc{i} = gather(Ag * Bg);
end
C2 = [cc{1} cc{2}]; % reassemble the two column blocks
t = toc; fprintf('2 GPU time %f sec\n', t)

% Largest absolute deviation of each GPU result from the CPU result
fprintf('n = %5d %f %f\n', n, ...
    max(max(abs(C - C1))), max(max(abs(C - C2))))
The error is substantial. Is this known behavior?
The code works for smaller powers of two; 2^13 is the first size at which the bug rears its ugly head. I have not checked other values, but I will be glad to.
With 1 GPU, the difference max(max(abs(C - C1))) is 0.999716.
With 2 GPUs, the difference max(max(abs(C - C2))) is 134.766785.
The difference is very large!
Here are the plots. The second is a zoom: at full size the difference is invisible, because it appears to be confined to a boundary.
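For anyone who wants to check the "along a boundary" observation without the plots, here is a minimal sketch that lists the rows and columns where two results disagree. The names Cref, Ctest and tol are hypothetical, and the stand-in data only exists so the snippet runs on its own; with the script above you would use C and C1 (or C2) instead.

```matlab
% Stand-in data so the sketch runs by itself; replace Cref/Ctest with
% C and C1 (or C2) from the script in the question.
Cref  = magic(8);
Ctest = Cref;
Ctest(5, :) = Ctest(5, :) + 1;      % simulate one corrupted row

tol = 1e-6;                          % assumed threshold for "substantially different"
[rowIdx, colIdx] = find(abs(Cref - Ctest) > tol);

badRows = unique(rowIdx).';          % rows containing at least one bad entry
badCols = unique(colIdx).';          % columns containing at least one bad entry

if isempty(rowIdx)
    fprintf('No entries differ by more than %g\n', tol);
else
    fprintf('%d entries differ; rows %d..%d, columns %d..%d\n', ...
        numel(rowIdx), min(badRows), max(badRows), min(badCols), max(badCols));
end
```

If the bad entries fall in a narrow band of rows or columns, the index ranges printed here should make that visible even when a full-size image does not.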
I will try your suggestions and follow up on this.
  3 Comments
Edric Ellis on 2 Jul 2014
I can't reproduce the problem you're seeing in R2013b - but I have only a single K20c. Can you reproduce the problem using only a single GPU? Which OS are you using? Have you updated to the latest NVIDIA CUDA driver? Are you able to try R2014a (this includes a later version of the CUDA runtime libraries)?
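A rough sketch of how the requested details could be collected from within MATLAB, assuming Parallel Computing Toolbox is installed. To my understanding, the DriverVersion property reports the CUDA version supported by the driver rather than the NVIDIA package number (such as 331.62), which `nvidia-smi` shows instead.

```matlab
% Print the MATLAB version and basic properties of every visible GPU.
fprintf('MATLAB %s\n', version);

numDevices = gpuDeviceCount;
fprintf('GPUs visible: %d\n', numDevices);

for idx = 1:numDevices
    % getDevice queries a device without selecting (and resetting) it.
    d = parallel.gpu.GPUDevice.getDevice(idx);
    fprintf('  %d: %s, compute capability %s, CUDA driver %g, toolkit %g\n', ...
        idx, d.Name, d.ComputeCapability, d.DriverVersion, d.ToolkitVersion);
end
```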
Jill Reese on 8 Jul 2014
I am also unable to reproduce this on a single K20c in R2013b. I'm running a 12 core Debian machine with GPU driver version 331.62. On my system I see reasonable agreement between the CPU and GPU results:
max(max(abs(C-C1))) = 10^(-11)
As Edric mentioned, are you able to try R2014a to see if the problem is still reproducible for you in that version?
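To put "reasonable agreement" in numbers, here is a hedged sketch based on the standard rounding-error bound for conventional matrix multiplication: each computed product is within gamma_n * |A||B| of the exact one, so two computed products should differ by at most twice that. The size n = 256 is an assumption chosen so the check runs quickly on the CPU, and the second product is just the same multiplication done through transposes as a stand-in for a GPU result.

```matlab
n = 256;                         % assumed small size; the question uses 2^13
A = rand(n);
B = rand(n);

u = eps / 2;                     % unit roundoff in double precision
gammaN = n * u / (1 - n * u);    % error constant for inner products of length n

% Entrywise bound on the gap between two differently computed products.
bound = 2 * gammaN * (abs(A) * abs(B));

C     = A * B;                   % reference product
Calt  = (B.' * A.').';           % same product, different evaluation order

gap = abs(C - Calt);
fprintf('largest gap %g, largest allowed %g\n', max(gap(:)), max(bound(:)));
```

For n = 2^13 and entries in [0,1], the same bound works out to at most about 1.5e-8, so a difference of 10^(-11) is comfortably within rounding error, while the 0.999716 and 134.766785 reported in the question are not.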

Answers (0)
