Vectorize a loop to save time

Question

Filip on 3 Feb 2019

https://it.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time

Commented: Walter Roberson on 4 Feb 2019

I have a big data set and my current code takes 2 hours. I am hoping to save time by vectorization if that is possible in my case.

I have a table, Table, with variables ID, t1, tend, and p. My code is something like:

x=zeros(size(Table.ID,1));
for i=1:size(Table.ID,1)
x(i)=sum(Table.t1<Table.t1(i) & Table.tend>Table.tend(i) & abs(Table.p-Table.p(i))>1);
end

So for each observation, I want to find the number of observations that start before it, end after it, and have a p value in the neighborhood of 1. The loop takes 2 hours to run. Any suggestions?

Thanks in advance!

2 Comments

Walter Roberson on 4 Feb 2019

How are the t1 and tend values arranged? Are tend(i+1) = t1(i) such that together they partition into consecutive ranges that are completely filled between the first and last? Do they act to partition into non-overlapping ranges but with gaps? Are there overlapping regions? Are the boundaries already sorted?

Filip on 4 Feb 2019

There is no arrangement between t1 and tend values across observations. They might overlap for some observations, there might be gaps in time too.

All I know is that t1<tend for an observation.

Table is sorted with respect to ID.

Answer 1

Jan on 4 Feb 2019

https://it.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359428

Edited: Jan on 4 Feb 2019

2 hours sounds long. Is memory exhausted, so that virtual memory (swapping) slows down the execution? How large is the input?

Is this a typo:

x = zeros(size(Table.ID,1))

It creates a square matrix, but you access it as a vector only.
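A quick way to see the difference, as a minimal sketch (the size 3 is an arbitrary illustration, not a value from the question). For a large n, the square matrix alone needs 8*n^2 bytes, which could contribute to the memory pressure asked about above:

```matlab
% zeros(n) with a single argument returns an n-by-n matrix of doubles.
A = zeros(3);

% zeros(n, 1) returns an n-by-1 column vector, which is all the loop needs.
v = zeros(3, 1);

disp(size(A))   % 3 3
disp(size(v))   % 3 1
```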

Does the table access take a noticeable amount of time? Copying the columns into plain vectors first avoids repeated table indexing inside the loop:

n    = size(Table.ID,1);
t1   = Table.t1;
tend = Table.tend;
p    = Table.p;
x    = zeros(n, 1);
for i = 1:n
  x(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

If you sort one of the vectors, you could save some time:

[t1s, index] = sort(t1);
tends        = tend(index);
ps           = p(index);
for i = 2:n
  m    = t1s < t1s(i);
  x(i) = sum(tends(m) > tends(i) & ...
             abs(ps(m) - ps(i)) > 1);
end

Afterwards x has to be mapped back to the original order by inverting the sort. If you provide some inputs, I could check the code before posting. I'm tired, so perhaps I've overlooked an obvious indexing error.
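The reordering step can be sketched as follows. This is only an illustration under assumptions: the sample values and the names xs and xRef are hypothetical and not part of the original post. The direct loop is kept as a reference to check that the sorted variant gives the same counts once the sort is undone.

```matlab
% Hypothetical sample data (not from the original post).
t1   = [5; 1; 3; 2];
tend = [6; 9; 4; 8];
p    = [0; 3; 1; 0];
n    = numel(t1);

% Reference result from the direct loop, in the original row order.
xRef = zeros(n, 1);
for i = 1:n
  xRef(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

% Sorted variant: xs(k) belongs to the original row index(k).
[t1s, index] = sort(t1);
tends = tend(index);
ps    = p(index);
xs    = zeros(n, 1);
for i = 2:n
  m     = t1s < t1s(i);
  xs(i) = sum(tends(m) > tends(i) & abs(ps(m) - ps(i)) > 1);
end

% Undo the sort so that x lines up with the rows of the table again.
x        = zeros(n, 1);
x(index) = xs;

assert(isequal(x, xRef));
```

The assignment x(index) = xs is the inverse permutation: it writes the k-th sorted result back to row index(k).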

Is the shown code really the bottleneck of the original code?

1 Comment

Filip on 4 Feb 2019

I have more variables in the table and do more comparisons, but they are all similar. So, I wrote a sample here to give the idea.

x = zeros(size(Table.ID,1)) is obviously a typo.

I guess sorting t1 will work, and accessing the table might also be time-consuming. I will post an update once I apply the changes, but this seems promising. Thanks!

Answer 2

Walter Roberson on 4 Feb 2019

https://it.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359344

My mind is headed towards creating a pairwise mask matrix,

M = squareform(pdist(Table.p) > 1);    %important that Table.p is a column vector

That would be comparatively fast. If the table is very big then it could fill up memory, though.
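To get a feel for when memory becomes a problem, here is a rough back-of-the-envelope sketch. The row count n is a hypothetical value chosen only for illustration; the actual table size is not stated in the thread.

```matlab
% Rough memory estimate for an n-by-n pairwise matrix.
% n is a hypothetical row count, not a figure from the question.
n = 1e5;

bytesLogical = n^2;       % a logical array stores 1 byte per element
bytesDouble  = 8 * n^2;   % a double array stores 8 bytes per element

fprintf('logical mask:  %.1f GB\n', bytesLogical / 1e9);
fprintf('double matrix: %.1f GB\n', bytesDouble  / 1e9);
```

With numbers like these the full matrix would not fit on a typical workstation, in which case processing the rows in blocks or staying with a loop is the safer route.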

abs() is not needed for this; pdist will already have calculated distance as a non-negative number.

Now

Mi = M(i,:);
x(i)=sum(Table.t1(Mi)<Table.t1(i) & Table.tend(Mi)>Table.tend(i));

However you should do timing tests against

Mi = M(i,:);
x(i)=sum(Mi & Table.t1<Table.t1(i) & Table.tend>Table.tend(i));

and

Mi = M(i,:);
Tt = Table(Mi,:);    %a table needs both a row and a variable subscript
x(i)=sum(Tt.t1<Table.t1(i) & Tt.tend>Table.tend(i));

2 Comments

Filip on 4 Feb 2019

Unfortunately, this answer does not exactly work. But inspired by your answer, I believe that creating a pairwise difference matrix with "bsxfun(@minus, T.t1, T.t1')" might work. I am not sure how much faster it will be or whether I will run into memory issues. I will try it and post an update.

Walter Roberson on 4 Feb 2019

abs(T.t1 - T.t1.')

would work as a distance function for you in R2016b and later.
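Building on the pairwise idea, the whole loop from the question could be replaced by three n-by-n comparisons, provided they fit in memory. This is a sketch under assumptions, not a tested solution for the real data: the sample values and the names startsBefore, endsAfter, and pDiffers are hypothetical, and implicit expansion requires R2016b or later (on older releases, bsxfun with @lt, @gt, and @minus would play the same role).

```matlab
% Hypothetical sample data (not from the original post).
t1   = [5; 1; 3; 2];
tend = [6; 9; 4; 8];
p    = [0; 3; 1; 0];

% Entry (i,j) of each matrix compares observation j (columns)
% against observation i (rows) via implicit expansion.
startsBefore = t1.'   < t1;          % t1(j)   < t1(i)
endsAfter    = tend.' > tend;        % tend(j) > tend(i)
pDiffers     = abs(p.' - p) > 1;     % |p(j) - p(i)| > 1

% Row sums give, for each i, the number of j that satisfy all three tests.
x = sum(startsBefore & endsAfter & pDiffers, 2);
```

Each intermediate mask holds n^2 logical elements, so for very large tables it may be necessary to process the rows in blocks instead of all at once.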
