how to avoid for loop to increase speed

Question

Bertil Veenstra il 4 Gen 2023

1
Link

Link diretto a questa domanda

https://it.mathworks.com/matlabcentral/answers/1888327-how-to-avoid-for-loop-to-increase-speed

Commentato: Bertil Veenstra il 6 Gen 2023

I would like to calculate the mean value of vector A for every sample as defined in vector B. So, the first point of the resulting vector C is the mean of the first 9 (=B(1)) datapoints of A. The second point of C is the mean of the following 10 (=B(2) datapoints of A. Etc. The following code works, but takes time when processing large vectors:

A=rand(91,1); % vector with  91 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples 
C= zeros(length(B),1); % preallocate C
first=1;
for i = 1: length(C)
    last = first+B(i)-1;
    interval=(first:last);
    C(i) = mean(A(interval));
    first =last+1;
end

Is there a way to use B in an index of A, instead of using this for loop?

2 Commenti
Mostra NessunoNascondi Nessuno

Matt J il 4 Gen 2023

Apri in MATLAB Online

Shouldn't sum(B) equal length(A)?

A=rand(91,1); % vector with  91 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples 
sum(B)
ans = 93
length(A)
ans = 91

Bertil Veenstra il 4 Gen 2023

Yes, you are right. They should be the same length

Accedi per commentare.

Accedi per rispondere a questa domanda.

Answer 1

Matt J il 4 Gen 2023

0
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1888327-how-to-avoid-for-loop-to-increase-speed#answer_1141312

Modificato: Matt J il 5 Gen 2023

Apri in MATLAB Online

A=rand(91,1); % vector with  91 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples 
G=repelem(1:numel(B), B);
n=min(numel(A),numel(G));
G=G(:); A=A(:);
C=accumarray(G(1:n),A(1:n))./accumarray(G(1:n),1); %NOTE: faster than accumarray(G,A,[],@mean)

4 Commenti
Mostra 2 commenti meno recentiNascondi 2 commenti meno recenti

Matt J il 5 Gen 2023

Modificato: Matt J il 5 Gen 2023

Apri in MATLAB Online

Better to avoid two ACCUMARRAY() calls and specify the function instead:

That would be quite a bit slower, unfortunately. I very deliberately avoided it for that reason:

A=rand(900000,1); % vector with  91 random samples
B=ones(1,90000)*10; % vector with the number of samples 
assert(sum(B)==numel(A))
G=repelem(1:numel(B), B);
n=min(numel(A),numel(G));
G=G(:); A=A(:);
tic;
C=accumarray(G(1:n),A(1:n))./accumarray(G(1:n),1);
toc
Elapsed time is 0.022540 seconds.
tic;
C=accumarray(G(1:n),A(1:n),[],@mean); % simpler and faster
toc
Elapsed time is 0.333534 seconds.

Stephen23 il 5 Gen 2023

"That would be quite a bit slower, unfortunately. I very deliberately avoided it for that reason:"

Aah, that is a shame. Perhaps a code comment would help to make that choice clear.

Accedi per commentare.

Answer 2

Mathieu NOE il 4 Gen 2023

1
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1888327-how-to-avoid-for-loop-to-increase-speed#answer_1141352

Apri in MATLAB Online

hello

had to change A to 93 samples so that it matches with sum(B) = 93

see my suggestion below. On this small data we can see an iùprovement of factor 4

Elapsed time is 0.004598 seconds (your code)

Elapsed time is 0.001204 seconds. (my code)

delta = 1.0e-14 *

0

-0.0222

-0.0111

-0.0333

-0.0555

0.1110

0.0777

-0.0222

-0.0444

-0.0222

wonder if that is going to be even better for larger vectors ?

A=rand(93,1); % vector with  93 random samples
B=[9,10,10,8,11,10,10,9,10,6]; % vector with the number of samples 
C= zeros(length(B),1); % preallocate C
first=1;
tic 
 for i = 1: length(C)
     last = first+B(i)-1;
     interval=(first:last);
     C(i) = mean(A(interval));
     first =last+1;
 end
 
 toc
 
% alternative code 
tic 
As = cumsum(A(:));
Bs = cumsum(B(:));
As = As(Bs);
Cs = [As(1); diff(As)]./B(:);
toc
delta = C - Cs
plot(C,'-*b')
hold on
plot(Cs,'dr')
hold off

7 Commenti
Mostra 5 commenti meno recentiNascondi 5 commenti meno recenti

Mathieu NOE il 5 Gen 2023

Modificato: Mathieu NOE il 5 Gen 2023

Apri in MATLAB Online

Just for fun I compared the speed of my solution vs Matt's code

here tested on much larger vectors (1100063 samples) , still my solution is by far better , but is numerically less accurate as Matt's code . (we are talking relative error below 10^-10)

Original code : Elapsed time is 0.449751 seconds.

My suggestion : Elapsed time is 0.011151 seconds.

delta_max = 3.4435e-11

Matt suggestion : Elapsed time is 0.058991 seconds.

delta_max = 0

% original code
B = 8 + randi(5,100000,1);
samples = sum(B);
A=rand(samples,1); % vector with  1100063 random samples
C= zeros(length(B),1); % preallocate C
first=1;
tic 
 for i = 1: length(C)
     last = first+B(i)-1;
     interval=(first:last);
     C(i) = mean(A(interval));
     first =last+1;
 end
  toc
 
% alternative code # 1 (me)
tic 
As = cumsum(A(:));
Bs = cumsum(B(:));
As = As(Bs);
Cs = [As(1); diff(As)]./B(:);
toc
delta_max = max(abs(C - Cs))
% alternative code # 2 (Matt J)
tic 
G=repelem(1:numel(B), B);
n=min(numel(A),numel(G));
G=G(:); A=A(:);
Cs=accumarray(G(1:n),A(1:n))./accumarray(G(1:n),1);
toc
delta_max = max(abs(C - Cs))

Matt J il 6 Gen 2023

Modificato: Matt J il 6 Gen 2023

Apri in MATLAB Online

@Stephen23 already showed you earlier that you can use accumarray to apply any function to the blocks.

A = randi(5,1,9300); % vector with random integers ranging from 1 to 5
B=ones(1,930)*10; % vector the number of samples 
tic
C= zeros(length(B),1); % preallocate C
first=1;
 for i = 1: length(C)
     last = first+B(i)-1;
     interval=(first:last);
     C(i) = mode(A(interval));
     first =last+1;
 end
 toc
Elapsed time is 0.032246 seconds.
 
 tic
  G=repelem((1:numel(B)),B);
  C=accumarray(G(:),A(:),[],@mode);
 toc
Elapsed time is 0.019723 seconds.

It is to be expected that speed-up is more modest, unfortunately. Accumarray isn't as well optimized for arbitrary functions.

Bertil Veenstra il 6 Gen 2023

Thanks you Matt for this code. It doesns't seem to run faster than the for loop, but it did learn me more about writing and understanding matlab code.

Accedi per commentare.

Answer 3

Matt J il 6 Gen 2023

0
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1888327-how-to-avoid-for-loop-to-increase-speed#answer_1142867

Modificato: Matt J il 6 Gen 2023

Apri in MATLAB Online

And instead of the mean, I am interested in the mode.

This method should offer speed-up for a generic function, provided that it can ignore NaNs and provided the blocks don't vary too greatly in length:

B = randi([5,10],1,9300);
A = randi(5,1,sum(B));
discrepancy = max( abs(loopMethod(A,B)-altMethod(A,B)),[],'all')
discrepancy = 0
timeit(@() loopMethod(A,B))
ans = 0.1231
timeit(@() altMethod(A,B))
ans = 0.0027
function C=loopMethod(A,B)
    C= zeros(1,length(B)); % preallocate C
    first=1;
    
     for i = 1: length(C)
         last = first+B(i)-1;
         interval=(first:last);
         C(i) = mode(A(interval));
         first =last+1;
     end
end
 
function C=altMethod(A,B)
     bmax=max(B);
     
     I=(1:bmax)'<=B;
     
     T=nan(size(I));
     T(I)=A(:);
     
     C=mode(T,1);
     
end

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

Bertil Veenstra il 6 Gen 2023

Hi Matt, this works nice. On larger files I get an 3-4 fold increase in speed. Smart solution. Thanks for your time.

Accedi per commentare.

how to avoid for loop to increase speed

2 Commenti
Mostra NessunoNascondi Nessuno

Risposta accettata

4 Commenti
Mostra 2 commenti meno recentiNascondi 2 commenti meno recenti

Più risposte (2)

7 Commenti
Mostra 5 commenti meno recentiNascondi 5 commenti meno recenti

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

Vedere anche

Categorie

Tag

Prodotti

Release

Community Treasure Hunt

how to avoid for loop to increase speed

2 Commenti Mostra NessunoNascondi Nessuno

Risposta accettata

4 Commenti Mostra 2 commenti meno recentiNascondi 2 commenti meno recenti

Più risposte (2)

7 Commenti Mostra 5 commenti meno recentiNascondi 5 commenti meno recenti

1 Commento Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

Vedere anche

Categorie

Tag

Prodotti

Release

Community Treasure Hunt

2 Commenti
Mostra NessunoNascondi Nessuno

4 Commenti
Mostra 2 commenti meno recentiNascondi 2 commenti meno recenti

7 Commenti
Mostra 5 commenti meno recentiNascondi 5 commenti meno recenti

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti