Performance of cell arrays, multi-dimensional arrays, structure arrays, and multiple variables

Question

1 voto

I've found that the performance of cell Arrays, multi-dimensional arrays, and structure arrays are all far slower than the use of multiple independent variables. Thus, my optimized codes have been using super long if-statements and functions which completely write out which variables are needed or not. The code is thus faster than using the other methods, but the code is also quite messy. A reproduction of the independent variables approach using 'eval' also doesn't work because of the way eval is executed by MATLAB.

Is there another way to do this without writing out each variable? Perhaps there I could use cell arrays but pass an argument which results in MATLAB treating them in a faster way... This might not seem like a big deal, but my codes take dozens of hours to execute (itt>1000), which becomes many days when the other 'MATLAB-preferred' methods are used.

Here is a code showing this performance. Values of ytessel and xtessel can be changed as you please, but changing these values requires manually changing the code for the fastest method (independent variables).

 %%model prep
 ytessel = 4;
 xtessel = 4;
 nnum = 72*1;
 ynum = nnum*ytessel;
 xnum = nnum*xtessel;
 itt  = 5;
 totemit = rand(ynum,xnum);
 T       = zeros(ynum,xnum);
 for y=1:nnum
     for x=1:nnum
         dT_depot{y,x} = rand(ynum,xnum);
     end
 end
 %%CELL ARRAYS
 tic
 for i=1:itt
     for yt=1:ytessel
         for xt=1:xtessel
             A{yt,xt}   = zeros(ynum,xnum);
         end
     end
     for y=1:nnum
         for x=1:nnum
             dT = dT_depot{y,x};
             for yt=1:ytessel
                 for xt=1:xtessel
                     A{yt,xt} = A{yt,xt} + dT.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                 end
             end
         end
     end
 end
 disp(sprintf('Time: %f (Cell Array Method)',toc));
 %%4D Matrix
 tic
 for i=1:itt
     C = zeros(ynum,xnum,ytessel,xtessel);
     for y=1:nnum
         for x=1:nnum
             for yt=1:ytessel
                 for xt=1:xtessel
                     dT           = dT_depot{y,x};
                     C(:,:,yt,xt) = C(:,:,yt,xt) + dT.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                 end
             end
         end
     end
 end
 disp(sprintf('Time: %f (4D Array Method)',toc));
 %%Independent Variables
 D11   = zeros(ynum,xnum);  D12  = zeros(ynum,xnum);  D13  = zeros(ynum,xnum);  D14  = zeros(ynum,xnum);
 D21   = zeros(ynum,xnum);  D22  = zeros(ynum,xnum);  D23  = zeros(ynum,xnum);  D24  = zeros(ynum,xnum);
 D31   = zeros(ynum,xnum);  D32  = zeros(ynum,xnum);  D33  = zeros(ynum,xnum);  D34  = zeros(ynum,xnum);
 D41   = zeros(ynum,xnum);  D42  = zeros(ynum,xnum);  D43  = zeros(ynum,xnum);  D44  = zeros(ynum,xnum);
 tic
 for i=1:itt
     D11(:)  = 0;  D12(:) = 0;  D13(:) = 0;  D14(:) = 0;
     D21(:)  = 0;  D22(:) = 0;  D23(:) = 0;  D24(:) = 0;
     D31(:)  = 0;  D32(:) = 0;  D33(:) = 0;  D34(:) = 0;
     D41(:)  = 0;  D42(:) = 0;  D43(:) = 0;  D44(:) = 0;
     for y=1:nnum
         for x=1:nnum
             curdepot = dT_depot{y,x};
             D11 = D11  + curdepot.*totemit(y,       x);
             D21 = D21  + curdepot.*totemit(y+nnum,  x);
             D31 = D31  + curdepot.*totemit(y+nnum*2,x);
             D41 = D41  + curdepot.*totemit(y+nnum*3,x);
             D12 = D12  + curdepot.*totemit(y,       x+nnum);
             D22 = D22  + curdepot.*totemit(y+nnum,  x+nnum);
             D32 = D32  + curdepot.*totemit(y+nnum*2,x+nnum);
             D42 = D42  + curdepot.*totemit(y+nnum*3,x+nnum);
             D13 = D13  + curdepot.*totemit(y,x+nnum*2);
             D23 = D23  + curdepot.*totemit(y+nnum,x+nnum*2);
             D33 = D33  + curdepot.*totemit(y+nnum*2,x+nnum*2);
             D43 = D43  + curdepot.*totemit(y+nnum*3,x+nnum*2);
             D14 = D14  + curdepot.*totemit(y,x+nnum*3);
             D24 = D24  + curdepot.*totemit(y+nnum,x+nnum*3);
             D34 = D34  + curdepot.*totemit(y+nnum*2,x+nnum*3);
             D44 = D44  + curdepot.*totemit(y+nnum*3,x+nnum*3);
         end
     end
 end
 disp(sprintf('Time: %f (Independent Variables Method)',toc));
 %%STRUCT ARRAYS
 for yt=1:ytessel
     for xt=1:xtessel
         DD(yt,xt).dat = zeros(ynum,xnum);
     end
 end
 tic
 for i=1:itt
     for yt=1:ytessel
         for xt=1:xtessel
             DD(yt,xt).dat = zeros(ynum,xnum);
         end
     end
     for y=1:nnum
         for x=1:nnum
             curdepot = dT_depot{y,x};
             for yt=1:ytessel
                 for xt=1:xtessel
                     DD(yt,xt).dat = DD(yt,xt).dat + curdepot.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                 end
             end           
         end
     end
 end
 disp(sprintf('Time: %f (Structure Arrays Method)',toc));
 %%Eval Method
 for yt=1:ytessel
     for xt=1:xtessel
         eval(['D' num2str(yt) num2str(xt) '=zeros(ynum,xnum);']);
     end
 end
 tic
 for i=1:itt
     for yt=1:ytessel
         for xt=1:xtessel
             eval(['D' num2str(yt) num2str(xt) '(:)=0;']);
         end
     end
     for y=1:nnum
         for x=1:nnum
             curdepot = dT_depot{y,x};
             for yt=1:ytessel
                 for xt=1:xtessel
                     eval(['D' num2str(yt) num2str(xt) '=D' num2str(yt) num2str(xt) '+curdepot.*totemit(y+nnum*' num2str(yt-1) ',x+nnum*' num2str(xt-1) ');']);
                 end
             end
         end
     end
 end
 disp(sprintf('Time: %f (Eval Method)',toc));

Output:

 Time: 16.279780 (Cell Array Method)
 Time: 79.975502 (4D Array Method)
 Time: 7.808595 (4D Independent Variables Method)
 Time: 20.406543 (Structure Arrays Method)
 Time: 127.129048 (Eval Method)

8 Commenti
Mostra 6 commenti meno recenti Nascondi 6 commenti meno recenti

Christopher il 8 Feb 2018

Modificato: Christopher il 8 Feb 2018

Apri in MATLAB Online

You are all right about the bad preallocation in the cell and multi-dimensional arrays, but this is obviously not the main issue. I added proper preallocation and find that timings are slightly improved:

 Time: 12.829600 (Cell Array Method)
 Time: 74.119861 (4D Array Method)
 Time: 7.758579 (Independent Variables Method)
 Time: 15.783543 (Structure Arrays Method)

However, the efficiency of some methods dramatically decreases when the size of these matrices changes. For example, changing the setup parameters to (others being the same):

 ytessel = 4;  % same size
 xtessel = 2;  % half size
 nnum    = 144; % double size (4x elements for each matrix)

gives

 Time: 325.490661 (Cell Array Method)
 Time: 493.344934 (4D Array Method)
 Time: 32.752160 (Independent Variables Method)
 Time: 319.870357 (Structure Arrays Method)

Cell arrays are now 10x slower than independent variables. Note that this should not result from a memory overload. I have 64 GB and I don't see more than 28 GB in use.

Secondly, how would these loops be better vectorized without using far more memory? I know that MATLAB's for-loops are notoriously inefficient, but I doubt this is the problem in this case. In any case, it seems like my code demonstrates that the use of cell and multi-dimensional arrays is inherently slow, thus even if the for-loop were somehow removed the inefficiency would remain.

Lastly, Stephen, absolutely you may add this as an example of slow eval. On this subject, I think it's quite odd that Mathworks hasn't simply implemented an eval-type function which just tells MATLAB to literally write out the expressions before compiling.

Christopher il 8 Feb 2018

Apri in MATLAB Online

Better preallocated Cell array method:

 % CELL ARRAYS
 for yt=1:ytessel
    for xt=1:xtessel
        A{yt,xt} = zeros(ynum,xnum);
    end
 end
 tic
 for i=1:itt
    for yt=1:ytessel
        for xt=1:xtessel
            A{yt,xt}(:) = 0;
        end
    end
    for y=1:nnum
        for x=1:nnum
            dT = dT_depot{y,x};
            for yt=1:ytessel
                for xt=1:xtessel
                    A{yt,xt} = A{yt,xt} + dT.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                end
            end
        end
    end
 end
 disp(sprintf('Time: %f (Cell Array Method)',toc));

better preallocated multi-dimensional method:

 % 4D Matrix
 C = zeros(ynum,xnum,ytessel,xtessel);
 tic
 for i=1:itt
    C(:) = 0;
    for y=1:nnum
        for x=1:nnum
            for yt=1:ytessel
                for xt=1:xtessel
                    dT           = dT_depot{y,x};
                    C(:,:,yt,xt) = C(:,:,yt,xt) + dT.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                end
            end
        end
    end
 end
 disp(sprintf('Time: %f (4D Array Method)',toc));

Better preallocated Struct array method:

 %%STRUCT ARRAYS
 for yt=1:ytessel
    for xt=1:xtessel
        DD(yt,xt).dat = zeros(ynum,xnum);
    end
 end
 tic
 for i=1:itt
    for yt=1:ytessel
        for xt=1:xtessel
            DD(yt,xt).dat(:) = 0;
        end
    end
    for y=1:nnum
        for x=1:nnum
            curdepot = dT_depot{y,x};
            for yt=1:ytessel
                for xt=1:xtessel
                    DD(yt,xt).dat = DD(yt,xt).dat + curdepot.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                end
            end
        end
    end
 end
 disp(sprintf('Time: %f (Structure Arrays Method)',toc));

Jan il 8 Feb 2018

Apri in MATLAB Online

I know that MATLAB's for-loops are notoriously inefficient, ...

This is an old rumor from the times before the JIT was introduced: R6.5 in 2002.

Jos (10584) il 8 Feb 2018

I know that MATLAB's for-loops are notoriously inefficient, ...

Fake news! :D

Accedi per commentare.

Accedi per rispondere a questa domanda.

Follow Question

Answer 1

Matt J il 8 Feb 2018

Modificato: Matt J il 9 Feb 2018

Apri in MATLAB Online

1 voto

Secondly, how would these loops be better vectorized without using far more memory?

As below. I reduced the test data size to something my laptop could handle better, but the improvement

    Time: 0.813528 (Independent Variables Method)
    Time: 6.088266 (4D Array Method with Loops)
    Time: 0.164137 (Array Method Vectorized)

is pretty clear.

     %%model prep
     ytessel = 4;
     xtessel = 4;
     nnum = 36*1;
     ynum = nnum*ytessel;
     xnum = nnum*xtessel;
     itt  = 5;
     totemit = rand(ynum,xnum);
      for y=1:nnum
         for x=1:nnum
             dT_depot{y,x} = rand(ynum,xnum);
         end
     end
      %%LOOPS
     tic
     for i=1:itt
         C = zeros(ynum,xnum,ytessel,xtessel);
         for y=1:nnum
             for x=1:nnum
                 for yt=1:ytessel
                     for xt=1:xtessel
                         dT           = dT_depot{y,x};
                         C(:,:,yt,xt) = C(:,:,yt,xt) + dT.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                     end
                 end
             end
         end
     end
     disp(sprintf('Time: %f (4D Array Method with Loops)',toc));
     dT=reshape(cell2mat(dT_depot(:).'), ynum*xnum,nnum*nnum);
     tmp=mat2cell(totemit,ones(1,ytessel)*nnum, ones(1,xtessel)*nnum);
     S=reshape(cell2mat(tmp(:).'),nnum*nnum,[]);
       %%VECTORIZED
     tic
     for i=1:itt
         C=dT*S;
     end
     disp(sprintf('Time: %f (Array Method Vectorized)',toc)); 
     C=reshape(C,ynum,xnum,ytessel,xtessel);

In any case, it seems like my code demonstrates that the use of cell and multi-dimensional arrays is inherently slow, thus even if the for-loop were somehow removed the inefficiency would remain.

No, the reason you see poor performance from arrays is specifically because of the way you use a loop to accumulate into sub-arrays. Specifically, in right-hand side expressions like,

C(:,:,yt,xt) + dT.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);

you re-extract a sub-array C(:,:,yt,xt) from C repeatedly (nnum^2 times) and cause fresh memory to be allocated for this sub-array each time you do so. Your approach of using "independent variables" has less of this overhead because it does less right-hand side array indexing, however it is still not as optimal as the fully vectorized approach.

2 Commenti
Mostra Nessuno Nascondi Nessuno

Matt J il 9 Feb 2018

Apri in MATLAB Online

As a related point, notice how the "4D array with Loops" method improves when re-organized as follows

     tic
     for i=1:itt
         C = zeros(ynum,xnum,ytessel,xtessel);
         %Z=zeros(ynum,xnum,1,1);
         for yt=1:ytessel
             for xt=1:xtessel
                   TMP=0;%re-zero
                 for y=1:nnum
                   for x=1:nnum
                     dT           = dT_depot{y,x};
                     TMP = TMP + dT.*totemit(y+(yt-1)*nnum,x+(xt-1)*nnum);
                   end
                 end
                 C(:,:,yt,xt) = TMP;
             end
         end
     end
     disp(sprintf('Time: %f (4D Array Method with Loops - Version 2)',toc));

The variable TMP eliminates the right-hand-side indexing in the accumulation step in a manner similar to your "independent variables" approach. The compute time reduction is,

Time: 6.088266 (4D Array Method with Loops)
Time: 1.769762 (4D Array Method with Loops - Version 2)

Christopher il 9 Feb 2018

Apri in MATLAB Online

Wow, I appreciate this very much. Using the following code

 dT = zeros(ynum*xnum,nnum*nnum);
 for x=1:nnum
    for y=1:nnum
        dT(:,y+(x-1)*nnum) = rand(ynum*xnum,1);
    end
 end
 tmp = mat2cell(totemit,ones(1,ytessel)*nnum, ones(1,xtessel)*nnum);
 S   = reshape(cell2mat(tmp(:).'),nnum*nnum,[]);
 C   = zeros(ynum*xnum,ytessel*xtessel);
 tic
 for i=1:itt
    C  = C+dT*S;
 end
 toc

Recalculating timings for ytessel=4, xtessel=4, itt=5, and nnum=72:

 Time: 1.607546 (Matt J's Fully Vectorized)
 Time: 14.911492 (Cell Array Loop Method)
 Time: 8.121991 (Independent Variables Loop Method)
 Time: 76.033369 (4D Array Loop Method)
 Time: 15.636499 (Structure Arrays Loop Method)
 Time: 124.471730 (Eval Loop Method)

and taking timings from before for ytessel=4, xtessel=2, itt=5, and nnum=144 and adding new timings:

 Time: 325.490661 (Cell Array Loop Method)
 Time: 493.344934 (4D Array Loop Method)
 Time: 32.752160 (Independent Variables Loop Method)
 Time: 319.870357 (Structure Arrays Loop Method)
 Time: 12.203983 (Matt J's Fully Vectorized)

So here we are seeing about 2.5-5x speedups by full vectorization.

Accedi per commentare.

Performance of cell arrays, multi-dimensional arrays, structure arrays, and multiple variables

8 Commenti
Mostra 6 commenti meno recenti Nascondi 6 commenti meno recenti

Risposta accettata

2 Commenti
Mostra Nessuno Nascondi Nessuno

Più risposte (0)

Categorie

Prodotti

Tag

Community Treasure Hunt

Performance of cell arrays, multi-dimensional arrays, structure arrays, and multiple variables

8 Commenti Mostra 6 commenti meno recenti Nascondi 6 commenti meno recenti

Risposta accettata

2 Commenti Mostra Nessuno Nascondi Nessuno

Più risposte (0)

Categorie

Prodotti

Tag

Vedere anche

Community Treasure Hunt

8 Commenti
Mostra 6 commenti meno recenti Nascondi 6 commenti meno recenti

2 Commenti
Mostra Nessuno Nascondi Nessuno