# GPU utilization and parallel computation With Matlab for heavy computation

20 views (last 30 days)
caesar on 2 Jul 2019
Commented: caesar on 9 Jul 2019
I have decent/ok machine with core i7 (8 cores), 32G of RAM and Nvidia geForce GTX 1080i and running Matlab 2018b. At the moment I am a bit confuse about how to use these resources in best way to run my Monte-Carlo simulation code. The two questions I have now:
1- How can I make all the heavy computaion to be run on the GPU alonside parallel compution capability of Matlab rather than the CPU and hence I can decide what is best to use? I have read different help topics and the conclusion I think I have got is, the data I have to work with should be in the form of gpuArray Am I right? or do I miss something here?. let us assume that I have the foollowing simple code to be run on GPU :
First_Vector=zeros(2,3);
% First_Vector=zeros(2,3,'gpuArray'); 1
[N,M]=size(First_Vector);
%[N,M]=size(First_Vector,'gpuArray'); 2
Second_Matrix=ones(N,M,2);
%Second_Matrix=ones(N,M,2,'gpuArray'); 3
Tset1= [20 20 20:30 30 30];
%Tset1gpuArray=gpuArray(Tset1); 4
Test2= [50 50 50;60 60 60];
%Tset2=gpuArray(Tset2); 5
K=100;
% the main code
for i=1:3
for j=1:3
[element]=Function1(test(i,j),K)
Test1(i,j)=element;
end
end
Second_Matrix(:,:,1)=Test1;
[Test1]=Function2(Test1,Test2);
% End of the main code
%% Function 1
function[outcome]=Function1(A,K)
outcome=A+K;
end
%%Function 2
function[T1]=Function2(T1,T2)
T1=T1+T2;
end
does the commented lines (1-5) are enough to run the 'main code' on the GPU?
2- I have tested the following simple code on GPU and CPU, CPU performance was by far better than GPU. is that supposed to be normal ?
G = ones(10,10,'gpuArray');
tic
for k=1:100
for i=1: 1000
for j=1:10
G(j,:)=G(j,:)+2;
end
end
end
toc
G = ones(10,10);
tic
for k=1:100
for i=1: 1000
for j=1:10
G(j,:)=G(j,:)+2;
end
end
end
toc
% Elapsed time is 0.628241 seconds.

Andrea Picciau on 3 Jul 2019
Edited: Andrea Picciau on 3 Jul 2019
1. Yes! Isn't that great?
2. Yes, because there are two problems with your code: (a) you're using a lot of for loops instead vector operations and (b) you're measuring GPU performance incorrectly. To fix (a), you should read this doc page that explains how to vectorize your code to get the best performance. To fix (b), you should take a look at my answer to a previous question and use the functions timeit and gputimeit.
caesar on 9 Jul 2019
thanks again Andrea Picciau. your example is so elaborative and clear for me. As I mentioned before, i think there is no way for me to verctoris my code especially when its very complicated and includes call for different functions in iterative fashion.
the annoying thing now is that I cant make exploit the resources that I have (powerfull PC and cluster ) to the max .
regards