How can I speed up my code

Question

smarthu on 6 Jun 2022

https://it.mathworks.com/matlabcentral/answers/1734805-how-can-i-speed-up-my-code

Edited: Jan on 6 Jun 2022

I have just started learning MATLAB.

Following the documentation, I use the code shown below (Matlab Full) to calculate the Mandelbrot set, z(n+1) = z(n)^2 + C, with 30000 iterations. The code runs on the GPU.

I compare it with a program I wrote in VB.NET, which I also don't know well, so that code is not optimized. It runs on the CPU because I don't know how to use the GPU from VB.

The (Matlab Part) code is a modification of the (Matlab Full) code in which I try to update only some of the pixels instead of the full image. It also runs on the GPU, but GPU utilization is only 84% to 90%, depending on the region being calculated.

With each version of the code, I calculate three regions (each with the same number of pixels):

(1) Divergent: a small region around Creal=-2.7, Cimag=0.5, where all C values give divergent results. The VB program takes only 0.14 s to complete, because it goes through the pixels with a for loop and, at each pixel, stops iterating as soon as the calculation diverges and moves on to the next pixel. The MATLAB code, by contrast, takes much longer because it has to go through all the iterations.

(2) Full: Creal from -2.8 to 1.5, with Cimag centered at 0. This region shows the whole Mandelbrot set and contains both divergent and convergent points.

(3) Convergent: a small region around Creal=0, Cimag=0, where all C values give convergent results. Both the VB and MATLAB programs have to run every iteration for every pixel, but the (Matlab Full) code takes much less time to complete.
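
For reference, the per-pixel early exit described for the VB program can be sketched in MATLAB as below. This is only an illustration on the CPU: the grid size, region bounds and maxIter are assumed values, not the ones used for the timings, and the variable names are made up for the sketch. The count is defined the same way as in the (Matlab Full) loop (number of iterations with |z| <= 2).

```matlab
% Per-pixel escape-time loop with early exit (CPU, plain double arrays).
% Grid size, region and maxIter are illustrative assumptions.
XN = 60;  YN = 40;  maxIter = 200;
cRe = linspace(-2.8, 1.5, XN);
cIm = linspace(-1.2, 1.2, YN);
count = zeros(YN, XN);
for col = 1:XN
    for row = 1:YN
        c = complex(cRe(col), cIm(row));
        z = c;
        n = 0;
        for k = 1:maxIter
            z = z*z + c;
            if abs(z) > 2
                break          % this pixel has diverged: skip the rest
            end
            n = n + 1;         % one more iteration with |z| <= 2
        end
        count(row, col) = n;
    end
end
```

In a divergent region every pixel leaves the inner loop after one or two steps, which is consistent with the short VB time reported above. If the loop body were moved into a function, I believe the same per-element logic could be applied to gpuArray inputs with arrayfun; that variant is not shown or tested here.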

So my questions are:

(1) How can I speed up my (Matlab Full) code? Currently it updates every pixel (the whole z matrix) even after some of them have diverged, and it runs all iterations even if every pixel has diverged. I tried updating only some of the pixels, as shown in the (Matlab Part) code, but that takes much longer, as shown in the figure.

(2) Why does the (Matlab Part) code take so much longer? It uses 84% to 90% of the GPU instead of 100% as in the (Matlab Full) case, and it keeps one CPU thread at 100%.

(3) In every region the (Matlab Full) code has to run all iterations, so I expected it to take the same amount of time everywhere. Instead, it takes much less time in the convergent region. Why is there a difference?

(4) This is not a question, just an observation. Comparing the (Matlab Full) and VB results in the convergent region shows that the MATLAB code runs much faster than my (unoptimized) VB code. I expect the MATLAB code would also be much faster in the other regions if it did not have to run all iterations.
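
Regarding question (1), one small change that keeps the vectorized structure is to leave the loop once no pixel is still inside |z| <= 2. The sketch below uses plain double arrays so that it runs without a GPU. The region, grid size and maxIter are assumed values chosen to mimic the divergent case, and itersRun is a name introduced only for this illustration.

```matlab
% Vectorized iteration that stops once every pixel has escaped.
% Region, grid size and maxIter are illustrative assumptions.
maxIter = 500;
cRe = linspace(-2.75, -2.65, 50);
cIm = linspace(0.45, 0.55, 40);
C = cRe + 1i * cIm.';          % implicit expansion (R2016b or later)
z = C;
count = zeros(size(C));
itersRun = 0;
for k = 1:maxIter
    itersRun = k;              % number of iterations actually executed
    z = z.*z + C;
    inside = abs(z) <= 2;
    if ~any(inside(:))
        break                  % nothing left to count, so stop early
    end
    count = count + inside;
end
```

This relies on the fact that, with z starting at C, a pixel whose |z| has exceeded 2 does not come back below 2, so the counts are unchanged by stopping. It only helps when the whole region diverges. With gpuArray data the `if` test needs a scalar on the host in every iteration, so checking only every few iterations might be preferable; I have not measured that.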

Calculation Time Figure:

(Matlab Full) Code:

        function ButtonGPUPushed(app, event)
            %Drawing size
            XN=1500;
            YN=830;
            %User set drawing region
            X1=str2double(app.X1EditField.Value);
            X2=str2double(app.X2EditField.Value);
            Y1=str2double(app.Y1EditField.Value);
            Y2=Y1+YN/XN*(X2-X1);
            %axes and grid
            XX=gpuArray.linspace(X1,X2,XN);
            YY=gpuArray.linspace(Y1,Y2,YN);
            [ttt,vvv]=meshgrid(XX,YY);
            CU=complex(ttt,vvv);
            count=zeros(size(CU),'gpuArray');
            %Custom color map
            CStepSize=201;
            Cin=[0 0.15 0.25;0 0.6 1;1 0.6 0;0 1 0;0.6 0 1;1 0 0.6;0 0 0];
            Cblock=Cin(1,:);
            for k=1:(size(Cin,1)-1)
                ooo=[(linspace(Cin(k,1),Cin(k+1,1),CStepSize))' (linspace(Cin(k,2),Cin(k+1,2),CStepSize))' (linspace(Cin(k,3),Cin(k+1,3),CStepSize))'];
                ooo=ooo(2:end,:);
                Cblock=[Cblock; ooo];
            end
            %Iteration
            zz=CU;
            for k=1:(size(Cblock,1)-1)                
                zz=zz.*zz+CU;
                inside=(abs(zz)<=2);
                count=count+inside;
            end
            %Display
            app.UIAxes.XLim=[X1 X2];
            app.UIAxes.YLim=[Y1 Y2];
            colormap(app.UIAxes, Cblock)
            ii=image(app.UIAxes,[X1 X2],[Y1 Y2],count);
            set(app.UIAxes,"YDir","normal")
            set(ii,"HitTest","off")
        end

(Matlab Part) Code, in which the iteration for loop is changed to:

            for k=1:(size(Cblock,1)-1)                
                inside=(abs(zz)<=2);
                count=count+inside;
                zz(inside)=zz(inside).*zz(inside)+CU(inside);
            end
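
Note that this loop applies the logical mask to the full-size arrays several times per iteration (zz(inside) twice on the right-hand side, CU(inside), and the indexed assignment), which may be part of why it is slower, although that is an assumption and not something measured here. A possible alternative, sketched below on the CPU with assumed region and size values, keeps compact vectors of the pixels that are still active and shrinks them as pixels escape. The names zAct, cAct and active are made up for the sketch, and whether this is actually faster on a GPU would need to be timed.

```matlab
% Shrinking active set: only pixels with |z| <= 2 are carried forward.
% Region, grid size and maxIter are illustrative assumptions.
maxIter = 300;
cRe = linspace(-2.8, 1.5, 60);
cIm = linspace(-1.2, 1.2, 40);
C = cRe + 1i * cIm.';
count  = zeros(size(C));
zAct   = C(:);                  % current z of the active pixels
cAct   = C(:);                  % matching C values
active = (1:numel(C)).';        % linear indices of the active pixels
for k = 1:maxIter
    zAct = zAct.*zAct + cAct;
    keep = abs(zAct) <= 2;
    zAct   = zAct(keep);        % drop escaped pixels from all three lists
    cAct   = cAct(keep);
    active = active(keep);
    if isempty(active)
        break                   % every pixel has escaped
    end
    count(active) = count(active) + 1;
end
```

The counts follow the same definition as in the (Matlab Full) loop, because the update is done before the |z| <= 2 test.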

2 Comments

dpb on 6 Jun 2022

MATLAB works best when code can be vectorized. Branching, convergence tests, and portions of an array that need operations specific to an iteration or a location break that model, and it can be quite difficult to improve on the straightforward linear structure. That is simply a consequence of the fundamental design of the language.

It's been too long since I've played with the Mandelbrot set to remember the details of the convergence test well enough to make specific recommendations for a MATLAB implementation. My only general comment is that it may not turn out to be easy (or even possible) to gain much.

Jan on 6 Jun 2022

Edited: Jan on 6 Jun 2022

By the way, a more compact way to create Cblock:

CStepSize = 201;
Cin       = [0 0.15 0.25;0 0.6 1;1 0.6 0;0 1 0;0.6 0 1;1 0 0.6;0 0 0];
nCin      = size(Cin, 1);
nCblock   = (nCin - 1) * (CStepSize - 1) + 1;
Cblock    = interp1(1:nCin, Cin, linspace(1, nCin, nCblock));
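
As a quick sanity check, the two constructions can be compared directly. The sketch below rebuilds the color map with the loop from the question and with the interp1 call above, then measures the largest difference. CblockLoop, CblockInterp, seg and maxDiff are names introduced only for this comparison.

```matlab
CStepSize = 201;
Cin = [0 0.15 0.25;0 0.6 1;1 0.6 0;0 1 0;0.6 0 1;1 0 0.6;0 0 0];

% Loop construction, as in the question
CblockLoop = Cin(1,:);
for k = 1:(size(Cin,1)-1)
    seg = [linspace(Cin(k,1), Cin(k+1,1), CStepSize).', ...
           linspace(Cin(k,2), Cin(k+1,2), CStepSize).', ...
           linspace(Cin(k,3), Cin(k+1,3), CStepSize).'];
    CblockLoop = [CblockLoop; seg(2:end,:)]; %#ok<AGROW>
end

% Compact construction with interp1 (columns are interpolated separately)
nCin         = size(Cin, 1);
nCblock      = (nCin - 1) * (CStepSize - 1) + 1;
CblockInterp = interp1(1:nCin, Cin, linspace(1, nCin, nCblock));

% Largest absolute difference between the two color maps
maxDiff = max(abs(CblockLoop(:) - CblockInterp(:)));
```

Both results should have the same size, and any difference should be at the level of floating-point rounding.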

meshgrid() can usually be replaced by implicit expansion:

XX = gpuArray.linspace(X1,X2,XN);
YY = gpuArray.linspace(Y1,Y2,YN);
CU = XX + 1i * YY.';
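
To see that this gives the same grid as the meshgrid version, here is a small comparison on the CPU. It uses plain linspace instead of gpuArray.linspace so it runs without a GPU, and the bounds and sizes are assumed example values. Implicit expansion requires R2016b or later.

```matlab
% Example bounds and a small grid (assumed values for illustration)
X1 = -2.8;  X2 = 1.5;  Y1 = -1.2;
XN = 15;    YN = 8;
Y2 = Y1 + YN/XN*(X2-X1);

XX = linspace(X1, X2, XN);      % 1-by-XN row vector
YY = linspace(Y1, Y2, YN);      % 1-by-YN row vector

% meshgrid version, as in the question
[ttt, vvv] = meshgrid(XX, YY);
CUmesh = complex(ttt, vvv);

% Implicit expansion: row vector plus column vector gives a YN-by-XN matrix
CUexp = XX + 1i * YY.';

maxDiff = max(abs(CUmesh(:) - CUexp(:)));
```

The transpose on YY is what makes the result YN-by-XN. Without it, the two row vectors have incompatible lengths and the addition errors out (or, if XN equals YN, silently gives a single row).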
