Is there a way to speed up a for loop when used with the GPU?
Hello,
At the moment, the for loop is the bottleneck in my code. I know that the GPU does not work well with indexing, and because of the for loop the calculations keep moving data between GPU and CPU memory. Does anyone have a suggestion for speeding up this part, or is this a dead end that cannot be optimized further because of the memory transfers? In my case, lab is 200000000x130 and dydis is 100000.
function [a,b]=skaicia (lab,dydis,z)
    comi=gpuArray(0.05);
    t=gpuArray(0.6);
    d=gpuArray(50001);
    langas=gpuArray(50000);
    atidaryta=gpuArray(50000);
    x1 = zeros(dydis,65,1);
    for i=1:z
        x1(:,:,i)=lab(i*dydis+1-dydis:i*dydis,:);
        x=gpuArray(x1(:,:,i));
        x23=x(1:end-d,:);
        [n1,n2]=size(x);
        n1=gpuArray(n1);
        n2=gpuArray(n2);
        xt=permute(x,[2 1 3]);
        dx1=(d-langas-1:d-2);
        dx=permute(dx1,[2 1])+ (1:n1-d);
        [sujn1(:,:,i),sujn2(:,:,i)]=mazinta(xt,dx,n2,n1,d,x23,t,langas,atidaryta,comi);
    end
    a=sujn1;
    b=sujn2;
end
2 Comments
Walter Roberson
on 31 Mar 2019
You are growing x1 dynamically along the third dimension -- you allocate it as dydis by 65 by 1, but you assign into x1(:,:,i) so it keeps getting larger.
You pull the part of x1 that you just assigned into a gpuArray that becomes x. You never use x1 again in your code other than continuing to grow it and copying the latest slice to gpu. You do not return x1.
Therefore it would be more efficient to directly do
x = gpuArray( lab(i*dydis+1-dydis:i*dydis,:) );
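A minimal sketch of how the loop might look with that change applied. The names skaiciaSketch and processChunk are hypothetical: processChunk stands in for the mazinta call, whose body is not shown in the question. Collecting the per-chunk results in cell arrays and stacking them afterwards is one possible way to avoid growing the outputs inside the loop; it is an assumption of this sketch, not something taken from the original code.

```matlab
% Sketch only: processChunk is a hypothetical stand-in for the mazinta call,
% and skaiciaSketch is an illustrative name, not the original function.
function [a, b] = skaiciaSketch(lab, dydis, z, processChunk)
    % One cell per chunk, allocated once, so nothing grows inside the loop.
    s1 = cell(1, z);
    s2 = cell(1, z);
    for i = 1:z
        rows = (i-1)*dydis + 1 : i*dydis;   % same rows as i*dydis+1-dydis:i*dydis
        x = gpuArray(lab(rows, :));         % copy this chunk straight to the GPU
        [s1{i}, s2{i}] = processChunk(x);   % results may stay on the GPU
    end
    % Stack the per-chunk results along the third dimension, as in the question.
    a = cat(3, s1{:});
    b = cat(3, s2{:});
end
```

For a quick check on small data, something like [a, b] = skaiciaSketch(lab, 2, 3, @(x) deal(sum(x, 1), max(x, [], 1))) could be used, with gather(a) to bring the result back to the CPU. This needs Parallel Computing Toolbox and a supported GPU.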
Answers (0)