I need to insert multiple 32x32 arrays into a 1024x1024 array at predetermined random locations. How can I do this efficiently?

I need to insert many (10^6) 32x32 arrays into a single 1024x1024 array and sum them. The insertion locations are random, but predetermined. Simple for loops result in a long execution time. How can I optimize this? I do have parallel processing and a GPU available.
Nx = ceil(rand(1e6,1)*(1024-32)); % avoiding partial arrays at the edges
Ny = ceil(rand(1e6,1)*(1024-32));
smallArray = ones(32);
largeArray = zeros(1024);
for ii = 1:1e6
    largeArray(Nx(ii):Nx(ii)+31, Ny(ii):Ny(ii)+31) = ...
        largeArray(Nx(ii):Nx(ii)+31, Ny(ii):Ny(ii)+31) + smallArray;
end

Accepted Answer

Joss Knight on 8 Jun 2016
Use accumarray (and randi):
N = 1024;
Nx = randi([1 (N-32)], 1e5, 1);
Ny = randi([1 (N-32)], 1e5, 1);
linIndStart = sub2ind([N N], Nx, Ny);              % linear index of each top-left corner
smallArray = gpuArray.ones(32);
[I, J] = ndgrid(1:32);
linOffsets = sub2ind([N N], I(:), J(:)) - 1;       % linear offsets of the 32x32 block elements
linInd = bsxfun(@plus, linIndStart, linOffsets');  % every target element of every block
values = repmat(smallArray(:)', numel(Nx), 1);     % Replace with actual vals
largeArray = accumarray(linInd(:), values(:), [N*N 1]);
largeArray = reshape(largeArray, N, N);
If your submatrices are genuinely all just ones, then just replace the values(:) with 1.
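For that all-ones case, a minimal sketch (reusing linInd and N from the code above) reduces to:
largeArray = accumarray(linInd(:), 1, [N*N 1]);   % every index contributes 1
largeArray = reshape(largeArray, N, N);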
2 Comments
Kevin Hughes on 8 Jun 2016 (edited 8 Jun 2016)
Joss, thanks. I actually got slightly better results (8.3 s vs 9.7 s) with:
Nx=1024;
Np=100000;
Ns=32;
smallArray=gpuArray.ones(Ns)+1i*gpuArray.ones(Ns);
smallArrayr=reshape(smallArray,1,Ns^2);
smallArrayrr=gpuArray.ones(Np,1)*smallArrayr;
% [x(:) y(:)] will be a list of the Ns x Ns offset coordinates for
% each box element
[x, y] = meshgrid(1:Ns);
x=gpuArray(x);
y=gpuArray(y);
%random insertion locations
insertX=gpuArray(round(ceil(rand(Np,1)*(Nx-Ns))));
insertY=gpuArray(round(ceil(rand(Np,1)*(Nx-Ns))));
nx=gpuArray(insertX);
ny=gpuArray(insertY);
% Add every offset to every corner coordinate. The result is Np-by-Ns^2,
% that is one row per box, with Ns^2 elements in each box. Now xx and yy
% identify every element of every box.
xx = bsxfun(@plus, nx(:), x(:)');
yy = bsxfun(@plus, ny(:), y(:)');
% Now we have arrays that identify what to add together and where
% to put the results. accumarray can do the accumulation based
% on the list of coordinates [xx(:) yy(:)] and the data going into
% each element, smallArrayrr(:).
largeArray = accumarray([xx(:) yy(:)], smallArrayrr(:), [Nx Nx], @sum);
Code found on this site at the related link (actually your answer to someone else with a similar issue): http://www.mathworks.com/matlabcentral/answers/264457-why-for-gpu-is-slower-than-cpu-for-this-code-is-it-because-of-sparsity-or-because-of-for-loop. It sets up the small arrays a little differently, but uses accumarray.
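As a side note (not from the thread): GPU timings such as the 8.3 s vs 9.7 s above are typically measured with gputimeit, which synchronizes with the GPU before stopping the clock, e.g.
timeAccum = gputimeit(@() accumarray([xx(:) yy(:)], smallArrayrr(:), [Nx Nx], @sum));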
Joss Knight on 9 Jun 2016
Oh well done - I knew I'd answered basically the same question before but I couldn't unearth the old answer!


More Answers (1)

Ahmet Cecen on 29 May 2016
If your small array will always be the same, you might be able to do this lightning fast with a convolution. Instead of using the random indices as corners, you would use them as centers. Do something like:
LargeArray(randindexvector) = 1;
You can separately find repetitions using unique and add them if you happen to get the same center twice. Much smaller for loop.
Then you convolve with a 32x32 ones matrix.
There might be some rough edges in this approach that need some smoothing, but if you can get it to work, it would take milliseconds.
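One way this idea might be realized, keeping the corner convention from the question rather than centers (a sketch, assuming the small array really is a constant block; accumarray handles the repeated locations automatically):
Ns = 32; N = 1024;
Nx = ceil(rand(1e6,1)*(N-Ns));                  % insertion corners, as in the question
Ny = ceil(rand(1e6,1)*(N-Ns));
counts = accumarray([Nx Ny], 1, [N N]);         % how many blocks start at each corner
largeArray = conv2(counts, ones(Ns), 'full');   % spread each corner into a 32x32 block
largeArray = largeArray(1:N, 1:N);              % crop back to 1024x1024
If the constant block is not all ones, the same convolution should work with the block itself as the kernel.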
2 Comments
Kevin Hughes on 29 May 2016
Thanks. The small array is actually complex-valued. As written, I get about 20 seconds of execution time, but 10^6 could be an order of magnitude low, and I'll need to repeat with variations on the insertion locations a few thousand times, which puts the total at 160 hours... I'm not sure the assumption of a constant small array holds, or convolution would be an excellent way to go.
The first related question links to a vectorized answer that also shows promise.
Ahmet Cecen on 29 May 2016
You can cascade on stages of uniqueness and vectorize the fill. Say you created 1000 random values, and say 100 of them are duplicates. You first work on the 900, then the 100, then any duplicates within the 100, and so on. You do something like:
for i = 1:BiggestNumberOfRepetitions
    % WORK HERE
end
Hopefully this is an orders-of-magnitude smaller iteration count. This also allows you to iterate over the elements of the smaller matrix instead of over the number of random locations. It might blow up memory here, but you can work around it. It should look something like:
RandInd = sub2ind([1024 1024], Nx, Ny);
for i = 1:32*32
    offset = floor((i-1)/32)*1024 + mod(i-1, 32);   % column-major offset of element i
    largeArray(RandInd + offset) = largeArray(RandInd + offset) + smallArray(i);
end
There is some nasty indexing there, so double-check it. Now your iteration count is MaximumRepetitionsCount*numel(smallArray), which in most cases should still be orders of magnitude smaller.
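Putting the two pieces together, a hedged sketch of the full cascade might look like the following (reusing RandInd and smallArray from the code above; names such as remaining are illustrative, not from the post):
N = 1024; Ns = 32;
largeArray = zeros(N);                       % accumulate into a fresh large array
remaining = RandInd(:);                      % RandInd as computed above
while ~isempty(remaining)
    [~, ia] = unique(remaining);             % first occurrence of each distinct corner
    R = remaining(ia);                       % unique corners handled in this stage
    for i = 1:Ns*Ns
        off = floor((i-1)/Ns)*N + mod(i-1, Ns);   % column-major offset of element i
        largeArray(R + off) = largeArray(R + off) + smallArray(i);
    end
    remaining(ia) = [];                      % leftovers are the duplicates; repeat
end
The inner loop matches the code above; keeping it per-element means overlapping blocks within one stage still accumulate correctly, since for a fixed offset the target indices are unique.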

