Optimal way of speeding up this execution using gpuArray
Hi - please forgive the naivety of this question; I am just getting familiar with GPU computing and am working through this slowly.
I have a system that takes in an Nx1 vector of states, and an objective function that performs many arithmetic operations to output an Nx1 vector of outputs. I would like to repeat this for a large number of independent vectors representing different objects (say, 100,000). In general, each element of the Nx1 vector depends on others within the same vector to produce the output, but does not depend on elements from other vectors. Instead of serial computation with a for loop, is there a good way to parallelize this on the GPU?
To clarify (if needed):
function yprime = compderiv(y)
% constants alpha, beta, gamma are defined elsewhere
yprime = zeros(size(y));   % preallocate the output
yprime(1) = alpha*(y(2) - y(1));
yprime(2) = beta*(y(2));
yprime(3) = (y(1) + y(2) + y(3))/gamma;
.....
yprime(72) = ....somecalculation;
end
yprime = zeros(72, 100000);   % preallocate
for i = 1:100000
    y = rand(72,1); % not really random data, but just
                    % to make the point that this loop is independent of prior iterations
    yprime(:,i) = compderiv(y);
end
Naively, I thought of passing the elements of my Nx1 vector individually to a function to compute individual outputs, and then using arrayfun, but N is large (>70), and it is cumbersome to pass that many variables to it:
function [yp1,yp2,yp3,...,yp72] = tempfun(y1,y2,y3,...,y72)
....
end
y1=gpuArray.rand(1,100000);
y2=gpuArray.rand(1,100000);
y3=gpuArray.rand(1,100000);
...
y72=gpuArray.rand(1,100000);
[ypp1,ypp2,ypp3...ypp72]=arrayfun(@tempfun, y1,y2,y3....,y72);
Is there a more elegant solution to this kind of problem?
0 Comments
Accepted Answer
Matt J
on 2 Oct 2015
Edited: Matt J on 2 Oct 2015
All of the yprime expressions that you've shown are linear combinations of the y(i). Is that the case for all (or most) of them? If so, you're barking up the wrong tree by splitting all your equations up into 72 separate scalar equations. You should use sparse matrix multiplication to vectorize the computations and be done with it.
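A minimal sketch of the sparse-matrix idea, assuming every yprime(i) really is a linear combination of the y(i). Only the first three rows come from the question; the constants and the remaining 69 rows are placeholders for illustration:

% Build the constant 72x72 coefficient matrix once, on the CPU.
% alpha, beta, gamma values here are hypothetical.
alpha = 1.5; beta = 0.5; gamma = 2;
N = 72; M = 100000;
A = sparse(N, N);
A(1,1) = -alpha;  A(1,2) = alpha;   % yprime(1) = alpha*(y(2)-y(1))
A(2,2) = beta;                      % yprime(2) = beta*y(2)
A(3,1:3) = 1/gamma;                 % yprime(3) = (y(1)+y(2)+y(3))/gamma
% ... fill in the remaining rows from the other equations ...

Y = gpuArray.rand(N, M);            % all 100,000 state vectors as columns
Ag = gpuArray(A);                   % sparse gpuArray; sparse*dense mtimes is supported
Yprime = Ag*Y;                      % one multiply evaluates every column at once

The 72 scalar equations collapse into a single matrix product applied to all columns simultaneously; the coefficient matrix is built once and reused for every batch.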
Regardless of that, passing 72 variables to arrayfun doesn't need to be all that cumbersome:
y = num2cell( gpuArray.rand(72,100000) , 2);   % 72x1 cell array; each cell is one 1x100000 row
[ypp{1:72}] = arrayfun(@tempfun, y{:});        % y{:} expands to the 72 input arguments
However, I wonder if a PARFOR loop isn't better for this than the GPU.
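For comparison, the PARFOR version is a direct rewrite of the serial loop in the question, since the iterations are independent. A sketch, assuming compderiv as defined above and Parallel Computing Toolbox available:

parpool;                        % start a worker pool if one is not already open
yprime = zeros(72, 100000);
parfor i = 1:100000
    y = rand(72,1);             % stand-in for the real per-object data
    yprime(:,i) = compderiv(y); % yprime(:,i) is a sliced output, so parfor accepts it
end

Whether this beats the GPU depends on how much arithmetic compderiv does per vector relative to the data-transfer and scheduling overhead; only profiling both will tell.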
More Answers (0)