How to efficiently do grouped subtraction?

Derek Smith on 20 Mar 2021
Edited: Matt J on 21 Mar 2021
I would like to subtract two arrays. The first array is organized in groups of three. For each group, I want to subtract the second array element-wise and sum the result, so the output has one element per input group. The following example demonstrates what I am looking for. I would use a for-loop, but the operations that generate these values are normally done on the GPU, and I would like this operation to run on the GPU as well. At minimum, I would like to "flip" the dimension over which the loop iterates. Resizing the main array is not ideal, because it would need to be restored for the next iteration of the algorithm (of which there are millions).
yt = [ 1;1;1 ; 2;2;2 ; 3;3;3 ; 4;4;4 ; 5;5;5 ; 6;6;6 ; 7;7;7 ]; % 7 groups, now. Usually 20,000 or so.
at = [ 1;1;1 ]; % this size is normal.
No = numel(at);
Np = numel(yt) / No;
for i = 1 : Np
    lb = (i-1)*No + 1;
    ub = i*No;
    f(i) = sum( yt(lb:ub) - at(lb:ub) );
end
% when done
f = [ 0 ; 3 ; 6 ; 9 ; 12 ; 15 ; 18 ]
1 Comment
Raymond Norris on 21 Mar 2021
I'm adding this as a comment and not a solution, because I think you're looking to run this on the GPU, which I haven't tested.
A few thoughts
  • In your example, I believe you mean
f(i) = sum( yt(lb:ub) - at );
Instead of
at(lb:ub)
  • You say that you would do this in a for-loop, but you want to run this on a GPU. But the example you're giving is in a for-loop. Are you saying, conceptually, this is the result you'd like, but to run it on a GPU?
  • Are yt and at GPU arrays? I'm gathering they are since you say the operations that generate these values (yt, at?) are normally done on the GPU. If so, I'd expect this to simply run on a GPU already.
  • I don't follow what you mean by flipping the dimension, nor the issue with resizing the main array (f?). Can you not run
f = rand(1,Np);
or perhaps better (if the RHS is on the GPU, f would also be a GPU array).
f = rand(1,Np,'gpuArray');
With all that said, the following appears to be an order of magnitude faster. Replace
for i = 1 : Np
    lb = (i-1)*No + 1;
    ub = i*No;
    f(i) = sum( yt(lb:ub) - at(lb:ub) );
end
with
f = sum( reshape(yt - repmat(at,Np,1),No,Np) );
Again, I'm not testing this on the GPU, so this might not be what you're looking for.
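The reshape idea above can also be written without repmat. The following is a minimal sketch, assuming R2016b or later (for implicit expansion) and the corrected loop body that subtracts at rather than at(lb:ub). The names fLoop and fVec are illustrative only, and the GPU behaviour noted in the last comment has not been tested here.

```matlab
% Sketch only: compare the loop with a reshape-based version.
yt = [1;1;1; 2;2;2; 3;3;3; 4;4;4; 5;5;5; 6;6;6; 7;7;7];
at = [1;1;1];
No = numel(at);
Np = numel(yt) / No;

% Reference result from the loop, subtracting the whole of at each time.
fLoop = zeros(Np, 1);
for i = 1:Np
    idx = (i-1)*No + (1:No);
    fLoop(i) = sum(yt(idx) - at);
end

% Column-major reshape puts each group of No values in its own column.
% The No-by-1 vector at is expanded across the Np columns implicitly.
fVec = sum(reshape(yt, No, Np) - at, 1).';   % Np-by-1 column

assert(isequal(fVec, fLoop));

% Assumption: with yt = gpuArray(yt) and at = gpuArray(at), the same line
% should run on the GPU, since reshape, minus and sum accept gpuArray inputs.
```

reshape returns a new variable, so yt itself keeps its original Np*No-by-1 shape and nothing has to be restored before the next iteration.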


Answers (1)

Matt J on 21 Mar 2021
Edited: Matt J on 21 Mar 2021
Using sepblockfun from the File Exchange,
yt=gpuArray(yt); at=gpuArray(at);
f=sepblockfun(yt,size(at),'sum')-sum(at);
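The line above relies on the identity that summing (group - at) equals the group sum minus sum(at). A rough sketch of that identity using only built-in functions follows. It is not a description of how sepblockfun is implemented, and the names blockSums and fDirect are illustrative only.

```matlab
% Sketch only: per-group sum first, then subtract sum(at) once per group.
yt = [1;1;1; 2;2;2; 3;3;3; 4;4;4; 5;5;5; 6;6;6; 7;7;7];
at = [1;1;1];
No = numel(at);

blockSums = sum(reshape(yt, No, []), 1).';   % one sum per group, Np-by-1
f = blockSums - sum(at);

% Same result as subtracting at from every group before summing.
fDirect = sum(reshape(yt, No, []) - at, 1).';
assert(isequal(f, fDirect));
```

With non-integer data the two orderings may differ by floating-point rounding, so a tolerance-based comparison would be more appropriate than isequal in that case.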

Release

R2020b
