"switch"-like functionality on gpuArray

Vivek on 5 Feb 2016
Commented: Vivek on 24 Feb 2016
I am running code that involves a Markov chain process, and I would like to implement it so that 10,000+ such transitions can be performed simultaneously on a gpuArray.
My Markov chain involves 8 states, but it could involve an arbitrary number, with arbitrary couplings/transitions. What I'd really like to do is use arrayfun and have the code describing the transition probabilities contained in a standard switch statement (not currently supported on the GPU). Some generalized code is below:
function newstate = junk(oldstate)
    % Pseudocode: conditions and the remaining cases are placeholders
    switch oldstate
        case 1
            if (condition1, determined partially by a random number)
                newstate = 6;
            end
            if (condition2, determined partially by a random number)
                newstate = 4;
            end
        case 2
            ....
        case 8
            ...
    end
end
I would then store "oldstates" on the GPU and call: newstates = arrayfun(@junk, oldstates);
I am wondering whether there is a more elegant way to do this than writing a series of if and elseif statements in the "junk" function that is passed to arrayfun. If I am way off on this, please let me know a better way.
1 Comment
Adam on 5 Feb 2016
I'm far from an expert in GPU programming, but from what I know you really don't want to be executing switch-type functionality on a GPU even if it were supported. GPU cores are highly optimised for massively parallel calculations in which each core runs effectively the same code on different input data. They are not optimised for decision trees that fork onto one of numerous paths, which would cause each core to be running different code at any given moment.


Accepted Answer

Joss Knight on 15 Feb 2016
Any switch statement can be reformulated as a sequence of if, elseif statements, which is supported by GPU arrayfun, so you can write the code you want.
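As a rough illustration of that reformulation, here is a minimal sketch of an if/elseif version of the transition function. The name junkIfElse, the thresholds, and the target states are all hypothetical placeholders, not the original poster's model. It also assumes the random draw is passed in as a second input r (a deliberate simplification so the function is deterministic for a given r), and that the states are stored as doubles so the output type is the same on every path.

```matlab
% Save as junkIfElse.m (hypothetical name).
function newstate = junkIfElse(oldstate, r)
% if/elseif stand-in for a switch on oldstate.
% oldstate : current state (assumed to be a double scalar)
% r        : uniform random number in [0, 1) supplied by the caller
% All probabilities and target states below are illustrative only.
    newstate = oldstate;          % default: stay in the current state
    if oldstate == 1
        if r < 0.3
            newstate = 6;         % hypothetical transition 1 -> 6
        elseif r < 0.5
            newstate = 4;         % hypothetical transition 1 -> 4
        end
    elseif oldstate == 2
        if r < 0.1
            newstate = 1;         % hypothetical transition 2 -> 1
        end
    end
    % Further elseif branches would cover the remaining states.
end
```

With oldstates on the GPU, a call could then look like r = rand(size(oldstates), 'gpuArray'); newstates = arrayfun(@junkIfElse, oldstates, r); the same function also runs on ordinary CPU arrays, which makes it easy to check the logic before moving to the GPU.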
Because of the potential cost of branches inside a GPU kernel as articulated in Adam's comment, it's possible you'll get better performance with conditional execution using masks.
newstates = oldstates;
for s = 1:numstates
    mask = oldstates == s;
    newstates(mask) = arrayfun(@junk, oldstates(mask));
end
This incurs the cost of launching one kernel per state, so you'd have to experiment to see whether it is actually faster. If the code in your arrayfun kernel really is as simple as your example, you'll probably find it's cheaper to use a single kernel. Branching is only more costly if there's a lot happening inside each conditional clause (or there are a lot of mutually exclusive clauses).
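A variation on the masking idea can be sketched without arrayfun at all, using plain element-wise operations on the whole array. This is not the code from the answer above but a simplified stand-in: it assumes a hypothetical 3-state chain in which each state s has a single possible transition to target(s), taken when the random draw is below p(s). The fixed values of oldstates and r are only there to make the example reproducible.

```matlab
% Hypothetical 3-state chain; all numbers are illustrative placeholders.
numstates = 3;
oldstates = [1   1   2    3   1   2  ];  % wrap in gpuArray(...) to run on a GPU
r         = [0.1 0.4 0.05 0.9 0.8 0.5];  % normally rand(size(oldstates))

% Assumed rule table: from state s, move to target(s) when r < p(s).
p      = [0.3 0.1 0.0];
target = [3   1   3  ];

newstates = oldstates;                   % default: every element stays put
for s = 1:numstates
    % Elements currently in state s whose draw triggers the transition.
    % The mask is built from oldstates, so updates never cascade.
    mask = (oldstates == s) & (r < p(s));
    newstates(mask) = target(s);
end
```

Each pass of the loop still costs a few GPU operations per state, so the same caveat applies: whether this beats a single arrayfun kernel has to be measured, for example with gputimeit from Parallel Computing Toolbox.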
1 Comment
Vivek on 24 Feb 2016
Thanks to you and to Adam for these thoughtful comments. Using a mask is elegant. I will attempt both and compare performance.
