What is the role of streaming multiprocessors (MultiprocessorCount in gpuDevice()) in GPU Coder?

Question

lim daehee on 13 Jan 2020

Direct link to this question:

https://it.mathworks.com/matlabcentral/answers/500069-what-is-the-role-of-streaming-multiprocessor-multiprocessorcount-in-gpudevice-on-gpu-coder

Answered: Aditya Patil on 12 Jul 2021


I am using the GPU Coder app with my graphics card, a GeForce GTX 1070 Ti, and I noticed something unexpected in the results.

I ran my code with 1000, 1200, 1400, 1600, 1800, and 1900 nodes.

Up to 1800 nodes, the elapsed time is shorter the smaller the number of nodes.

However, with 1900 nodes the elapsed time is much shorter than with 1000 nodes.

I suspect this is related to the MultiprocessorCount of my graphics card, which is 19.

What is the role of MultiprocessorCount, and what exactly makes the run with 1900 nodes faster than the run with 1000 nodes?

Here is my code.

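function [D, N, B, R] = fcn_PRM_DH_complete(node_PRM,PosMap,Map_Obs,Obs_mat,CR) %#codegen
% D: weighted distance per node pair, N: collisionCheck_SP8 result per pair,
% B: 0/1 flag per pair, R: symmetric matrix holding D.*B
n = length(node_PRM);
% coder.nullcopy allocates without initializing; only the entries written
% in the loops below hold defined values.
D = coder.nullcopy(zeros(n));
N = coder.nullcopy(zeros(n));
B = coder.nullcopy(ones(n));

% Set the flag to 1 for every pair in the upper triangle.
coder.gpu.kernel;
for i0=1:n-1
    coder.gpu.kernel;
    for j0=i0+1:n
        b_val=1;
        B(i0,j0)=b_val;
    end
end
R = coder.nullcopy(zeros(n));
len=(1:n)';
pos_mat=PosMap(node_PRM(len),:);

% Distance (z difference weighted by 5) and collision-check result per pair.
coder.gpu.kernel;
for i1=1:n-1
    coder.gpu.kernel;
    for j1=i1+1:n
        dist=sqrt((pos_mat(i1,1)-pos_mat(j1,1))^2+(pos_mat(i1,2)-pos_mat(j1,2))^2+5*(pos_mat(i1,3)-pos_mat(j1,3))^2);
        D(i1,j1) = dist;

        C = collisionCheck_SP8(pos_mat(i1,:), pos_mat(j1,:));

        N(i1,j1)=C;
    end
end

% Walk along each segment in N(i2,j2) steps and clear the flag if a sampled
% point has Obs_mat equal to 0, or if the pair is farther apart than CR.
coder.gpu.kernel;
for i2=1:n-1
    coder.gpu.kernel;
    for j2=i2+1:n
        coder.gpu.kernel;
        for k=1:N(i2,j2)
            if D(i2,j2)<=CR
                node1=pos_mat(i2,:);
                node2=pos_mat(j2,:);

                % k-th interpolated point between node1 and node2
                dx=(node2(1)-node1(1))/N(i2,j2);
                x=node1(1)+dx*k;
                dy=(node2(2)-node1(2))/N(i2,j2);
                y=node1(2)+dy*k;
                dz=(node2(3)-node1(3))/N(i2,j2);
                z=node1(3)+dz*k;

                % Convert to grid indices using a cell size of 0.3
                node_x=round(x/0.3);
                node_y=round(y/0.3);
                node_z=round(z/0.3+1);

                Idxpt=node_x+(node_y-1)*size(Map_Obs,2)+(node_z-1)*size(Map_Obs,1)*size(Map_Obs,2);
                if Obs_mat(Idxpt)==0
                    b=0;
                    B(i2,j2)=b;
                end
            else
                b=0;
                B(i2,j2)=b;
            end
        end
    end
end

% Mirror D.*B into both triangles of R.
coder.gpu.kernel;
for i3=1:n-1
    coder.gpu.kernel;
    for j3=i3+1:n
        d_val=D(i3,j3);
        b_val=B(i3,j3);
        R(i3,j3)=d_val*b_val;
        R(j3,i3)=d_val*b_val;
    end
end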


Answer 1

Aditya Patil on 12 Jul 2021

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/500069-what-is-the-role-of-streaming-multiprocessor-multiprocessorcount-in-gpudevice-on-gpu-coder#answer_744798

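Streaming multiprocessors (SMs) are the basic compute units of NVIDIA GPUs. Each SM executes many threads in parallel, and the thread blocks of a kernel are distributed across the available SMs. MultiprocessorCount in gpuDevice() reports how many SMs the device has. All else being equal, the more SMs a GPU has, the more computational power it offers.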
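
To make the role of the SM count concrete, here is a rough MATLAB sketch of how the work in the question might be spread over 19 SMs. The block size of 512 threads and the mapping of one thread per node pair are assumptions made purely for illustration; the actual launch configuration is chosen by GPU Coder and can only be read from the generated CUDA code.

```matlab
% Back-of-the-envelope estimate. The launch parameters below are
% hypothetical and are NOT taken from the generated CUDA code.
numSMs = 19;            % MultiprocessorCount quoted in the question
                        % (gpuDevice().MultiprocessorCount on a GPU machine)
threadsPerBlock = 512;  % assumed block size, for illustration only
nodeCounts = [1000 1200 1400 1600 1800 1900];

for n = nodeCounts
    numPairs = n*(n-1)/2;                        % iterations of one upper-triangular loop nest
    numBlocks = ceil(numPairs/threadsPerBlock);  % blocks needed with one thread per pair
    blocksPerSM = numBlocks/numSMs;              % average number of blocks each SM processes
    fprintf('n = %4d: %8d pairs, %6d blocks, %7.1f blocks per SM\n', ...
        n, numPairs, numBlocks, blocksPerSM);
end
```

Under these assumptions every node count produces far more blocks than there are SMs, so this arithmetic alone would not single out 1900 nodes. It only illustrates the kind of quantity that MultiprocessorCount influences.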

As for the performance increase, there can be many distinct causes. They are difficult to predict from the MATLAB code alone; identifying them requires looking at the generated code and benchmarking it. One possible reason is that this specific number of nodes avoids some memory contention (reads or writes that hit the same memory bank).
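
As a starting point for the benchmarking mentioned above, a minimal timing harness might look like the following sketch. The workload here is a CPU stand-in so that the snippet runs anywhere; in practice it would be replaced by a call to the generated MEX function (the name fcn_PRM_DH_complete_mex and its inputs are assumptions about the build, not something stated in the question). The warm-up call is there because the first call into GPU code can include one-time initialization costs, which might distort a comparison if each node count is timed only once.

```matlab
% Stand-in workload so the sketch is self-contained. Replace it with a call
% to the generated MEX function, e.g. (hypothetical name and inputs):
%   workload = @(n) fcn_PRM_DH_complete_mex(node_PRM(1:n), PosMap, Map_Obs, Obs_mat, CR);
workload = @(n) sum(sqrt(rand(n)), 'all');

nodeCounts = [1000 1200 1400 1600 1800 1900];
numRuns = 5;                               % timed repetitions per node count
medianTime = zeros(size(nodeCounts));

for idx = 1:numel(nodeCounts)
    n = nodeCounts(idx);
    workload(n);                           % warm-up run, excluded from the timing
    t = zeros(1, numRuns);
    for r = 1:numRuns
        tStart = tic;
        workload(n);
        t(r) = toc(tStart);
    end
    medianTime(idx) = median(t);           % median is less sensitive to outliers than the mean
end

disp(table(nodeCounts(:), medianTime(:), 'VariableNames', {'Nodes', 'MedianSeconds'}));
```

If the 1900-node run is still faster than the 1000-node run after warm-up and repetition, the generated CUDA code is the next place to look.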
