GPU for loop parallelization

Question

Alexander Voznesensky on 24 May 2018

https://it.mathworks.com/matlabcentral/answers/402453-gpu-for-loop-parallelization

Answered: Edric Ellis on 25 May 2018

Hi there! Is it possible to run the for loops over (i,j) on the GPU here? I know about arrayfun in MATLAB, but I'm not sure whether it can be used in this case.

% Reconstruction geometry parameters
global fw fsemiW fpix fdelta fL fCentralZ
fw = 2000.0;
fsemiW = 1000.0;
fpix = 145.0;
fdelta = 23.0;
fL = 243000.0;
fCentralZ = 0.0;
J = uint16(zeros(fw,fw));
path1='tooth';
list=dir(path1);
path2=cd;
N=size(list,1)-2; % Number of files and folders (minus the '.' and '..' entries)
angle=(0:N-1)*(2*pi)/N;
si=sin(angle);
co=cos(angle);
needSlice=900;
for k = 1:N
      fullpath=strcat(path2,'\tooth\',list(k+2).name);
      I=imread(fullpath,'tiff');
      for i=1:size(I,1)
          for j=1:size(I,2)
              [ind_result, ind_proj]=backProjectionKernel(i, j, si(k), co(k), needSlice);
              J(ind_result) = J(ind_result) + I(ind_proj);
          end
      end
end
imshow(J,[]);
function [ind_result, ind_proj] = backProjectionKernel(i, j, si, co, needSlice)
global fw fsemiW fpix fdelta fL fCentralZ
% Preliminary calculations
xp = co * (i - fsemiW) + si * (j - fsemiW);
yp = co * (j - fsemiW) - si * (i - fsemiW);
zp = fsemiW - needSlice;
tmp = fL / (fL - (xp*fpix));
yproj = yp * tmp;
zproj = zp * tmp;
Nproj = yproj + fsemiW;
Z = fsemiW - zproj + fCentralZ;
% Index into the array of the slice being reconstructed
ind_result = j + (i-1) * fw;
% Index into the array of the current angular projection
ind_proj = Z * fw + Nproj + fdelta;
ind_proj=round(ind_proj);
end

Answer 1

Edric Ellis on 25 May 2018

https://it.mathworks.com/matlabcentral/answers/402453-gpu-for-loop-parallelization#answer_321911

I think this can be done on the GPU, but you will definitely need to get rid of your global variables. This doc page should help here: you need to parameterise your backProjectionKernel function so that it no longer needs global.
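
As a rough sketch of that parameterisation (an illustration, not code from the original thread): the geometry constants from the question are passed in as extra scalar arguments instead of being read from globals. The argument order is an assumption made here; the arithmetic is copied from the question unchanged.

```matlab
% Geometry constants from the question, held as ordinary variables.
fw = 2000; fsemiW = 1000; fpix = 145; fdelta = 23; fL = 243000; fCentralZ = 0;
needSlice = 900;

% Single-pixel call with angle 0 (sin = 0, cos = 1). Every input is an
% explicit scalar, so the function has no hidden state.
[indResult, indProj] = backProjectionKernel(1, 1, 0, 1, needSlice, ...
    fw, fsemiW, fpix, fdelta, fL, fCentralZ);

function [ind_result, ind_proj] = backProjectionKernel(i, j, si, co, needSlice, ...
        fw, fsemiW, fpix, fdelta, fL, fCentralZ)
    % Same calculations as in the question, but without any global statement.
    xp = co * (i - fsemiW) + si * (j - fsemiW);
    yp = co * (j - fsemiW) - si * (i - fsemiW);
    zp = fsemiW - needSlice;
    tmp = fL / (fL - (xp * fpix));
    yproj = yp * tmp;
    zproj = zp * tmp;
    Nproj = yproj + fsemiW;
    Z = fsemiW - zproj + fCentralZ;
    % Linear index into the reconstructed slice
    ind_result = j + (i - 1) * fw;
    % Linear index into the current projection, rounded to an integer
    ind_proj = round(Z * fw + Nproj + fdelta);
end
```

Because all of the extra inputs are scalars, the same function handle should be usable with the gpuArray version of arrayfun by appending the constants after the index vectors, since scalar inputs are expanded to match the array inputs.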

The main trick here is to split the loop into two pieces. The first is an arrayfun call that performs the independent calls to backProjectionKernel; it relies on implicit dimension expansion to "loop" over two dimensions. The second piece uses accumarray to build up the result.

szI = 5;
szJ = 10;
iVec = gpuArray(1:szI);
jVec = gpuArray(1:szJ)';
% First, perform a dummy computation that calculates the indices of J
% to accumulate into, along with the indices of I from which to take
% the values. Note that the gpuArray version of arrayfun performs
% implicit dimension expansion, so this is effectively a double-loop
% over all combinations of iVec and jVec
[indJ, indI] = arrayfun(@iDummyCalc, iVec, jVec);
% Build dummy I
I = rand(szI, 'gpuArray');
% Compute J using accumarray - first as a vector...
J = accumarray(indJ(:), I(indI(:)), [szJ * szJ, 1]);
% ... then reshape to the matrix
J = reshape(J, [szJ, szJ]);
function [indJ, indI] = iDummyCalc(iVal, jVal)
% Compute some dummy values that are valid linear indices into
% I and J.
indJ = randi([jVal 100]);
indI = randi([iVal 25]);
end
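
To make the accumarray step above more concrete, here is a small CPU-only sketch with made-up values (the index and value vectors are purely illustrative). It shows that accumarray sums every value that shares a target index, which is what the loop update J(ind_result) = J(ind_result) + I(ind_proj) does one element at a time.

```matlab
% Hypothetical target indices and the values to add at each of them.
indJ = [1; 3; 3; 2; 1];
vals = [10; 20; 30; 40; 50];

% accumarray adds up all values that map to the same index.
J = accumarray(indJ, vals, [4, 1]);

% Equivalent explicit loop, in the style of the question's inner loops.
Jloop = zeros(4, 1);
for n = 1:numel(indJ)
    Jloop(indJ(n)) = Jloop(indJ(n)) + vals(n);
end

disp(J.');                  % 60 40 50 0
disp(isequal(J, Jloop));    % 1
```

Index 4 never appears in indJ, so accumarray leaves that entry at zero; the third argument only fixes the size of the output.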
