Why is subscripted assignment so inefficient?
    6 visualizzazioni (ultimi 30 giorni)
  
       Mostra commenti meno recenti
    
    Kenneth Johnson
      
 il 5 Feb 2022
  
    
    
    
    
    Commentato: Walter Roberson
      
      
 il 6 Feb 2022
            Why is MATLAB's subscripted assignment so abysmally inefficient with more than 2 subscripts? e.g.,
>> clear, a = []; tic, for j = 1e3:-1:1, for k = 1e3:-1:1, a(j,k,1) = 1; end, end, toc
Elapsed time is 0.193743 seconds.
>> clear, a = []; tic, for j = 1e3:-1:1, for k = 1e3:-1:1, a(j,k) = 1; end, end, toc
Elapsed time is 0.004241 seconds.
1 Commento
  Walter Roberson
      
      
 il 5 Feb 2022
				Interesting. I thought for a moment that it might be due to the extra multiplication and addition needed each time, but when I rewrite the code in terms of linear indexing as-if a 3D array, then I do not see nearly that much performance drop.
Risposta accettata
  Matt J
      
      
 il 5 Feb 2022
        
      Modificato: Matt J
      
      
 il 5 Feb 2022
  
      I would guess it's only because you are using 3 subscripts on an array that only has 2 dimensions. This forces Matlab to to do extra work, namely running a check to make sure that in a(j,k,m) the 3rd subscript is m=1 and simplifying the subscripts from (j,k,1) to (j,k). When a actualy does have 3 dimensions, the times are much more comparable:
a=zeros(1e3,1e3,2);
tic, 
for j = 1e3:-1:1
    for k = 1e3:-1:1 
        a(j,k,1) = 1; 
    end 
end 
toc
a=zeros(1e3,1e3);
tic, 
for j = 1e3:-1:1
    for k = 1e3:-1:1 
        a(j,k) = 1; 
    end 
end
toc
7 Commenti
  Walter Roberson
      
      
 il 6 Feb 2022
				Unfortunately I am running into time limits a lot in exploring the timing of allocations. With linear increasing allocations, if we assume approximately linear time, then we run into the 1t+2t+3t+4t+...n*t = t*n*(n+1)/2 problem, that the time needed to explore the suspected linear relation in a linear manner, requires squared time. I think I might need to switch to exponential increases in memory sizes to probe more closely.
N = 4;
G = 6;
start = datetime('now');
%timing pushed into a function to allow for potential JIT
t = run_timing(N, G);
stop = datetime('now');
elapsed = seconds(stop - start)
plot(t);
xlabel('try #'); ylabel('seconds')
legend((1:G) + "G")
title('absolute time')
tmean = mean(t,1);
plot(tmean)
xlabel(' gigabytes'); ylabel('mean seconds')
p = polyfit(1:G, tmean, 1);
fprintf('roughly %g seconds + %g seconds per gigabyte\n', p(2), p(1))
function t = run_timing(N,G)
    t = zeros(N,G);
    for g = 1 : G
        for n = 1 : N
            tic();
            initgig(g);
            t(n,g) = toc;
        end
    end
end
%gigabyte 1073741824
%megabyte 1048576
function initgig(M)
    a = ones(1,1073741824*M,'uint8');
end
  Walter Roberson
      
      
 il 6 Feb 2022
				There is a lot of noise in these measurements, and it is difficult to tell what is meaningful or not. 
I skip the first few runs in the plot because in nearly all runs there is a bad peak early on.
The second plot here is the more interesting one, showing the log of how the time to allocate increases with amount requested. Noise affects it a lot, so I can only describe trends that I encountered a fair bit:
- time to allocate 1 byte starts off a bit higher
- time to allocate a quite small number of bytes but more than 1, falls a bit, then rises a bit.
- there does seem to be a peak at 256 bytes
- and then it does seem to fall a bit
- until at 2^14 there seems to be a distinct rise. This would correspond to 16384 bytes. I suspect that either 4096 or 16384 bytes is the size of an entry in the "small block pool"
- in some of the plots, there was a leveling from 2^14 to 2^16; the more fine-grained I draw the less I see this, so it might have been a statistical artificat
- from 2^16 onward, the log grows pretty much linearly as log2() of number of bytes increases. To phrase that a different way: from 2^16 onward, the allocation time grows linearly with number of bytes allocated. The actual boundary might possibly be 2^14
- so, below 2^14, there appear to be at least two different allocation strategies, but the boundaries between them are a bit difficult to find. If there were only one allocation strategy for that range, using a free block pool with entries of size 2^14, then you would expect the time to be constant up to that point, but there do seem to be peaks.
- I seem to recall that 256 bytes is the limit below which for some constant allocations and some colon expressions, that MATLAB keeps a copy around as an optimization, to be handed out when encountering another request with the same spelling (spacing even being important!). Perhaps that is why we see a blip at 256??
If someone cares to do a study of how small block allocation is done in MATLAB, I suggest 2^14 as the upper limit to study (and remember to take into account those new hidden copies that MATLAB recently starting taking for small constants.)
N = 50;
G = 2^24;
start = datetime('now');
%timing pushed into a function to allow for potential JIT
[t, gvals] = run_timing(N, G);
stop = datetime('now');
elapsed = seconds(stop - start);
display(elapsed)
figure(1)
%tfilt = filloutliers(t, 'center','movmedian', 3, 1);
startat = 6;
semilogy(startat:N, t(startat:end,:));
xlabel('try #'); ylabel('seconds')
title('absolute time')
%legend("2^{"+string(gvals)+"}", 'NumColumns',4);
figure(2)
tmean = mean(t(startat:end,:),1);
semilogy(gvals, tmean)
xt = 1:ceil(max(gvals));
xticks(xt)
xticklabels("2^{"+string(xt)+"}");
xtickangle(45)
xlabel('bytes')
ylabel('mean seconds')
plin = polyfit(2.^gvals, tmean, 1);
%plog = polyfit(log2(gvals), log2(tmean),1);
fprintf('linear: roughly %g seconds + %g seconds per gigabyte\n', plin(2), plin(1)*2^30);
%fprintf('log: roughly %g seconds + %g seconds per gigabyte\n', 2.^plog(2), plog(1));
function [t,gvals] = run_timing(N,G)
    %wb = waitbar(0, '0G');
    %cleanMe = onCleanup(@()delete(wb));
    gvals = 0:.1:ceil(log2(G));
    ng = length(gvals);
    t = zeros(N,ng);
    for g = 1 : ng
        gp = gvals(g);
        gv = round(2.^gp);
        %waitbar(0, wb, "2^{"+string(gp)+"}");
        for n = 1 : N
            tic();
            initmem(gv);
            t(n,g) = toc;
            %waitbar(n/N, wb);
        end
    end
end
%gigabyte 1073741824
%megabyte 1048576
function initmem(M)
    ma = ones(1,M,'uint8');
end
Più risposte (0)
Vedere anche
Categorie
				Scopri di più su Startup and Shutdown in Help Center e File Exchange
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!











