GPU performance with short vectors
MatlabNinja
on 30 Mar 2016
Edited: Joss Knight
on 20 Apr 2016
Hello, I see GPU computation underperforming when it is used to manipulate short vectors. With long vectors (1,000,000 elements) the GPU version is much faster than the CPU version:
>> a = rand(1000000, 100, 'gpuArray');
>> b = gather(a);
>> tic; for i=1:100 ; eval('q = zeros(1000000,1);for i = 1:100; q = b(:,i)+q;end') ; end;toc
Elapsed time is 45.489811 seconds.
>> tic; for i=1:100 ; eval('q = zeros(1000000,1);for i = 1:100; q = a(:,i)+q;end') ; end;toc
Elapsed time is 0.875140 seconds.
When the same thing is done with short vectors (200 elements), the GPU underperforms:
>> a = rand(200, 100, 'gpuArray');
>> b = gather(a);
>> tic; for i=1:100 ; eval('q = zeros(200,1);for i = 1:100; q = b(:,i)+q;end') ; end;toc
Elapsed time is 0.021727 seconds.
>> tic; for i=1:100 ; eval('q = zeros(200,1);for i = 1:100; q = a(:,i)+q;end') ; end;toc
Elapsed time is 0.833865 seconds.
Any insight will be appreciated.
Thank you.
Accepted Answer
Joss Knight
on 20 Apr 2016
Edited: Joss Knight
on 20 Apr 2016
Computation in a single GPU core is significantly slower than in a modern CPU core. The GPU makes up for that by having a lot of cores: thousands. If you don't give it thousands of things to do at once, you're never going to beat the CPU.
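To see where the crossover falls on a given machine, something like the following sketch may help. It assumes Parallel Computing Toolbox and a supported GPU are available. The vector lengths and the variable names (`lengths`, `xCpu`, `xGpu` and so on) are my own illustrative choices, not taken from the question. It times one element-wise addition per length, using `timeit` on the CPU and `gputimeit` on the GPU.

```matlab
% Sketch only: requires Parallel Computing Toolbox and a supported GPU.
% The lengths below are illustrative assumptions, not from the question.
lengths = [200 2000 20000 200000 2000000];

for n = lengths
    % Same data on the CPU and on the GPU.
    xCpu = rand(n, 1);
    yCpu = rand(n, 1);
    xGpu = gpuArray(xCpu);
    yGpu = gpuArray(yCpu);

    % timeit and gputimeit run the function several times and return
    % a typical time; gputimeit also waits for the GPU to finish.
    tCpu = timeit(@() xCpu + yCpu);
    tGpu = gputimeit(@() xGpu + yGpu);

    fprintf('n = %8d   CPU %.3g s   GPU %.3g s\n', n, tCpu, tGpu);
end
```

The actual numbers depend heavily on the hardware, so no expected output is shown. The general expectation is that the GPU time stays roughly flat for small `n`, because the fixed cost per operation dominates, and only pays off once each operation has enough elements to keep the device busy.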
In your simple computation above you are unnecessarily using a loop. This may have been for illustrative purposes, but if it reflects your actual code, you will gain back your performance by removing the loop, i.e.
q = sum(a, 2);
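A minimal sketch of that replacement, assuming Parallel Computing Toolbox and a supported GPU. The names `qLoop`, `qSum`, `maxDiff` and `tSum` are mine, not from the question. It checks that the column loop and a single sum along dimension 2 give the same result, then times the vectorized call with `gputimeit`, which waits for the GPU to finish before reporting a time.

```matlab
% Sketch only: requires Parallel Computing Toolbox and a supported GPU.
a = rand(200, 100, 'gpuArray');

% Loop version from the question: 100 small GPU operations in sequence.
qLoop = zeros(200, 1, 'gpuArray');
for k = 1:size(a, 2)
    qLoop = qLoop + a(:, k);
end

% Vectorized version: one call that sums across the columns of each row.
qSum = sum(a, 2);

% The two results should agree up to floating-point rounding.
maxDiff = gather(max(abs(qLoop - qSum)));
fprintf('max abs difference: %g\n', maxDiff);

% Time the single call; gputimeit synchronizes with the device.
tSum = gputimeit(@() sum(a, 2));
fprintf('sum(a, 2) takes about %g s\n', tSum);
```

A bare `tic`/`toc` around GPU code can return before the GPU has finished, since many gpuArray operations run asynchronously. Use `gputimeit`, or call `wait(gpuDevice)` before `toc`, to get a safer measurement.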