How to make my GPU interpolation code faster?
28 visualizzazioni (ultimi 30 giorni)
Mostra commenti meno recenti
I am doing a lot of linear interpolation for a project. MATLAB's griddedInterpolant function for the CPU is fast, and their overloaded function interpn for the GPU is even faster. However, the structure of my code is that I need to do linear interpolation in a for loop where the points to be interpolated are calculated at each loop. Every loop is independent, so I coded my own linear interpolation function to run on the GPU using arrayfun. My function is faster than griddedInterpolant but slower than interpn. I was wondering if anyone could see a way to speed up my function. Or do you think it would benefit from compiling it as a GPU MEX function (I have never done this and it would be a big time cost for me, so I would only like to do it if someone knows it would help a lot). Note that my data does not go off the grid and my grid is uniformly spaced in each dimension. My code is below (sorry for the appearence, it doesn't fit properly. You can find my function of FEX at http://www.mathworks.com/matlabcentral/fileexchange/47437-interp3gpu-m):
function Vi=interp3gpu(x1,x2,x3,V,x1i,x2i,x3i)
Nx1 = length(x1);
Nx2 = length(x2);
Nx3 = length(x3);
x11 = x1(1);
x21 = x2(1);
x31 = x3(1);
s1 = x1(2)-x11;
s2 = x2(2)-x21;
s3 = x3(2)-x31;
Vi = arrayfun(@gpuInterp,x1i,x2i,x3i);
function Vi=gpuInterp(x1i,x2i,x3i)
x1i = (x1i-x11)/s1+1;
x2i = (x2i-x21)/s2+1;
x3i = (x3i-x31)/s3+1;
loc1 = min(Nx1-1,floor(x1i));
loc2 = min(Nx2-1,floor(x2i));
loc3 = min(Nx3-1,floor(x3i));
% x1i now becomes weight for x1i. Overwrite instead of creating new
% variable for memory performance. Same for x2i and x3i.
x1i = x1i-loc1;
x2i = x2i-loc2;
x3i = x3i-loc3;
Vi = (1-x1i)*((1-x2i)*((1-x3i)*V(loc1 +(loc2-1)*Nx1+(loc3-1)*Nx1*Nx2) ...
+ x3i *V(loc1 +(loc2-1)*Nx1+(loc3) *Nx1*Nx2)) ...
+ x2i *((1-x3i)*V(loc1 +(loc2) *Nx1+(loc3-1)*Nx1*Nx2) ...
+ x3i *V(loc1 +(loc2) *Nx1+(loc3) *Nx1*Nx2))) ...
+ x1i*((1-x2i)*((1-x3i)*V(loc1+1+(loc2-1)*Nx1+(loc3-1)*Nx1*Nx2) ...
+ x3i *V(loc1+1+(loc2-1)*Nx1+(loc3) *Nx1*Nx2)) ...
+ x2i *((1-x3i)*V(loc1+1+(loc2) *Nx1+(loc3-1)*Nx1*Nx2) ...
+ x3i *V(loc1+1+(loc2) *Nx1+(loc3) *Nx1*Nx2)));
end
end
2 Commenti
Matt J
il 3 Ago 2014
My function is faster than griddedInterpolant but slower than interpn.
Are you sure you don't mean the other way around? interpn calls griddedInterpolant , so it shouldn't be faster than it.
Risposte (2)
Matt J
il 4 Ago 2014
Modificato: Matt J
il 4 Ago 2014
You're doing a lot of repeat computations of things like 1-loc2, 1-xi1, etc... which might be avoided.
You're also expanding the interpolation formula into 22 multiplies and 8 adds. It should be possible to interpolate a 2x2x2 neighborhood in about 14 multiplies and 7 adds, by doing the interpolation in a separable manner, first along the x axis, then y, then z.
Vedere anche
Categorie
Scopri di più su GPU Computing in MATLAB in Help Center e File Exchange
Prodotti
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!