vectorisation of a loop

the matrix is "Sea" of m*n dimension .. this is the code i would like to vectorize :
=======================
S = size(Sea);
i = 1;
while i < S(1)
indices = find ( pdist2( Sea( i , : ),Sea( i+1:S(1) , :)) <0.05 );
Sea(indices , : ) = [];
S = size(Sea);
i = i+1;
end
===========
desc : i'm trying to calculate the distance between each row and all the other rows in the matrix and delete the ones that are closer than 0.05

Risposte (2)

Matt J
Matt J il 30 Nov 2021
Modificato: Matt J il 1 Dic 2021
[m,n]=size(Sea);
D=pdist2(Sea,Sea);
D(1:m+1:end)=inf;
indices=any(triu(D<0.05),1);
Sea(indices,:)=[];

12 Commenti

Zaid B
Zaid B il 30 Nov 2021
thank you for answering , but i want to delete the raw from sea matrix directly
Matt J
Matt J il 30 Nov 2021
The code I've shown does that.
Zaid B
Zaid B il 30 Nov 2021
it gives back an array of indices not doing anything on the sea matrix
Perhaps this is what you want.
[m,n]=size(Sea);
D=pdist2(Sea,Sea);
D(1:m+1:end)=inf;
indices=any(D<0.05,1);
Sea(indices,:)=[];
Zaid B
Zaid B il 1 Dic 2021
it deletes all the matrix at once , thank you for your help
Matt J
Matt J il 1 Dic 2021
You should attach some data in a .mat file, so the two versions can be compared.
Zaid B
Zaid B il 1 Dic 2021
its simple data (matrix of 16 million * 14 singles) ,so you can generate a random mtrix m*14 of sibgles and execute the codes on it, you will see what im getting as results
Matt J
Matt J il 2 Dic 2021
Modificato: Matt J il 2 Dic 2021
Is the idea to end up with points in Sea that are all at least 0.05 apart? If so then, as you can see below, my version succeeds at this, whereas your original version does not:
Sea=rand(1000,2);
Sea1=version1(Sea);
Sea2=version2(Sea);
checkmin(Sea1) %original version
ans = 0.0016
checkmin(Sea2) %proposed version
ans = 0.0508
function val=checkmin(Sea)
D=pdist2(Sea,Sea);
val=min(D(D>0));
end
function Sea=version1(Sea)
S = size(Sea);
i = 1;
while i < S(1)
indices = find ( pdist2( Sea( i , : ),Sea( i+1:S(1) , :)) <0.05 );
Sea(indices , : ) = [];
S = size(Sea);
i = i+1;
end
end
function Sea=version2(Sea)
[m,n]=size(Sea);
D=pdist2(Sea,Sea);
D(1:m+1:end)=inf;
indices=any(triu(D<0.05),1);
Sea(indices,:)=[];
end
Zaid B
Zaid B il 2 Dic 2021
Modificato: Zaid B il 2 Dic 2021
"Is the idea to end up with points in Sea that are all at least 0.05 apart?" ...the idea is somehow as you said but your version cause lost of information , i want to minimise the size of the matrix as following;
i test the first row distance to all the other rows and delete rows that are closer than 0.05 but keeping the first row
then , after deleting the other rows i test the second row from the new SEA(after deleting the rows from the previous iteration) with all the 3:end rows (always keeping the current row and deleting the closest rows)
i hope you got the idea of my code!
thank you for your continious help ^^
Matt J
Matt J il 2 Dic 2021
Modificato: Matt J il 2 Dic 2021
That's also not what your current code is doing. If that's what you want, you need,
indices = i + find ( pdist2( Sea( i , : ),Sea( i+1:S(1) , :)) <0.05 ) ;
Zaid B
Zaid B il 2 Dic 2021
Modificato: Zaid B il 2 Dic 2021
thank you for answering , but why ( i - 1 ) is added on that line?!
("Moreover, I don't think there is any more optimization that can be done" ... i thought so , but wanted to check if anyone has a better idea that would help)
Zaid B
Zaid B il 2 Dic 2021
Modificato: Zaid B il 2 Dic 2021
i got it why is nedded there but i think "i + find(...)" not "(i-1)+find(..)" because it should keep the current i , thank you that was a great help there ^^

Accedi per commentare.

Matt J
Matt J il 2 Dic 2021
Modificato: Matt J il 2 Dic 2021
I find this version to be about 30% faster:
S = size(Sea,1);
i = 1;
t=0.05^2;
while i < S
indices=true(1,S);
indices(i+1:end) = sum( abs( Sea( i , : )-Sea( i+1:S , :) ).^2,2) >= t ;
Sea=Sea(indices , : );
S = size(Sea,1);
i = i+1;
end

13 Commenti

Zaid B
Zaid B il 4 Dic 2021
Modificato: Zaid B il 4 Dic 2021
Sea = rand(20000,14);
tic
P1 = version1(Sea,1); %my proposed version
checkmin(P1)
ans = 0.2857
time_V1 = toc
time_V1 = 19.5087
tic
P2 = version2(Sea,1); %your proposed version
checkmin(P2)
ans = 0.2857
time_V2 = toc
time_V2 = 23.8950
% TRUE = (P1 == P2)
function sea = version1(m,step)
S = size(m);
% SeaCopy= [];
i = 1;
while i < S(1)
% SeaCopy = [SeaCopy ; m(i,:)];
indices = i + find(pdist2(m(i,:),m(i+1:S(1),:))<0.05);
if(length(indices) >= 1)
% SeaCopy = [SeaCopy ; mean(m(indices,:))];
m(indices,:)=[];
end
S = size(m);
i = i +step;
end
sea = m;
end
function sea = version2(m,step)
S = size(m);
i = 1;
t=0.05^2;
while i < S(1)
indices=true(1,S(1));
indices(i+1:end) = sum( abs(m( i , : )-m( i+1:S(1) , :) ).^2,2) >= t ;
m=m(indices , : );
S = size(m,1);
i = i+1;
end
sea = m;
end
function val=checkmin(Sea)
D=pdist2(Sea,Sea);
val=min(D(D>0));
end
Matt J
Matt J il 4 Dic 2021
Modificato: Matt J il 4 Dic 2021
Sea = rand(8000,14);
tic
P1 = version1(Sea,1); %your proposed version
time_V1 = toc
time_V1 = 2.2475
tic
P3= version3(Sea,1); %my proposed version
time_V3 = toc
time_V3 = 1.0464
isequal(P1,P3)
ans = logical
1
function sea = version1(m,step)
S = size(m);
% SeaCopy= [];
i = 1;
while i < S(1)
% SeaCopy = [SeaCopy ; m(i,:)];
indices = i + find(pdist2(m(i,:),m(i+1:S(1),:))<0.05);
if(length(indices) >= 1)
% SeaCopy = [SeaCopy ; mean(m(indices,:))];
m(indices,:)=[];
end
S = size(m);
i = i +step;
end
sea = m;
end
function sea = version3(m,step)
S = size(m,1);
i = 1;
t=0.05^2;
while i < S
indices=false(1,S);
indices(i+1:end) = sum( abs(m( i , : )-m( i+1:S , :) ).^2,2) < t ;
m(indices , : )=[];
S = size(m,1);
i = i+1;
end
sea = m;
end
Matt J
Matt J il 4 Dic 2021
Modificato: Matt J il 4 Dic 2021
Note that some of these comparisons will be Matlab release dependent. In R2018a, I find that version 2 is faster than version 1, but still slower than version 3.
Zaid B
Zaid B il 5 Dic 2021
version 3 is not doing the work needed ,after checking the matrix returned from version 3 , it's not equal to the same matrix returned by pervious versions (v1,v2)
Sea = rand(10000,14);
tic
P1 = version1(Sea,1); %my version always faster
ans = 1×2
9647 14
checkmin(P1)
ans = 0.5000
V1 = toc
V1 = 7.7813
tic
P2 = version2(Sea,1);% your version slighly slower but doing the job needed
checkmin(P2)
ans = 0.5000
V2 = toc
V2 = 11.0393
tic
P3 = version3(Sea,1); %verision three you proposed
checkmin(P3)
ans = 0.3358
V3 = toc
V3 = 4.1707
TRUE = isequal(P1,P3) %cheking if its the same matrix returned from previous versions
TRUE = logical
0
function sea = version1(m,step)
S = size(m);
SeaCopy= [];
i = 1;
while i < S(1)
SeaCopy = [SeaCopy ; m(i,:)];
indices = i + find(pdist2(m(i,:),m(i+1:S(1),:))<0.5);
if(length(indices) > 1)
SeaCopy = [SeaCopy ; mean(m(indices,:))];
m(indices,:)=[];
else
m(indices,:)=[];
end
S = size(m);
i = i +step;
end
size(SeaCopy)
sea = m;
end
function sea = version2(m,step)
SeaCopy= [];
S = size(m);
i = 1;
t=0.5^2;
while i < S(1)
SeaCopy = [SeaCopy ; m(i,:)];
indices=true(1,S(1));
indices(i+1:end) = sum( abs(m( i , : )-m( i+1:S(1) , :) ).^2,2) >= t ;
SeaCopy = [SeaCopy ; mean(m(indices,:))];
m = m(indices , : );
S = size(m,1);
i = i+1;
end
sea = m;
end
function sea = version3(m,step)
S = size(m,1);
i = 1;
t=0.05^2;
while i < S
indices=false(1,S);
indices(i+1:end) = sum( abs(m( i , : )-m( i+1:S , :) ).^2,2) < t ;
m(indices , : )=[];
S = size(m,1);
i = i+1;
end
sea = m;
end
function val=checkmin(Sea)
D=pdist2(Sea,Sea);
val=min(D(D>0));
end
Matt J
Matt J il 5 Dic 2021
Modificato: Matt J il 5 Dic 2021
Something must have been miscopied, because I'm getting identical results. Below is a comparison of all 3 versions, showing they are equal.
Sea = rand(8000,14);
tic
P1 = version1(Sea,1);
time_V1 = toc
time_V1 = 2.5249
tic
P2 = version2(Sea,1);
time_V2 = toc
time_V2 = 2.5527
tic
P3 = version3(Sea,1);
time_V3 = toc
time_V3 = 1.2148
isequal(P2,P1)
ans = logical
1
isequal(P3,P1)
ans = logical
1
function sea = version1(m,step)
S = size(m);
% SeaCopy= [];
i = 1;
while i < S(1)
% SeaCopy = [SeaCopy ; m(i,:)];
indices = i + find(pdist2(m(i,:),m(i+1:S(1),:))<0.05);
if(length(indices) >= 1)
% SeaCopy = [SeaCopy ; mean(m(indices,:))];
m(indices,:)=[];
end
S = size(m);
i = i +step;
end
sea = m;
end
function sea = version2(m,step)
S = size(m);
i = 1;
t=0.05^2;
while i < S(1)
indices=true(1,S(1));
indices(i+1:end) = sum( abs(m( i , : )-m( i+1:S(1) , :) ).^2,2) >= t ;
m=m(indices , : );
S = size(m,1);
i = i+1;
end
sea = m;
end
function sea = version3(m,step)
S = size(m,1);
i = 1;
t=0.05^2;
while i < S
indices=false(1,S);
indices(i+1:end) = sum( abs(m( i , : )-m( i+1:S , :) ).^2,2) < t ;
m(indices , : )=[];
S = size(m,1);
i = i+1;
end
sea = m;
end
Zaid B
Zaid B il 5 Dic 2021
run the checkmin function ... since its rand generated matrix they might be no changes in matrix so they return the same input matrix .... if you run the check min version you should get in v1 & 2 ans = 0.0500 and the third one something else
to see what i mean you can print the size of each output returned by versions
Matt J
Matt J il 5 Dic 2021
Modificato: Matt J il 5 Dic 2021
There is no way the results can be different, either in size or data content, if the isequal() command says they are not. Just to confirm this for you, though, here are my results with checkmin:
Sea = rand(10000,14);
tic
P1 = version1(Sea,1); %my proposed version
time_V1 = toc
time_V1 = 3.4304
tic
P2 = version2(Sea,1); %your proposed version
time_V2 = toc
time_V2 = 4.1817
tic
P3 = version3(Sea,1); %your proposed version
time_V3 = toc
time_V3 = 2.2365
isequal(P1,P2)
ans = logical
1
isequal(P1,P3)
ans = logical
1
checkmin(P1)
ans = 0.3305
checkmin(P2)
ans = 0.3305
checkmin(P3)
ans = 0.3305
Zaid B
Zaid B il 5 Dic 2021
yeah that means that the genrated samples are >0.05 since the begining .. run the sizes for each version it will be more clear for you
Fine, but we can do the same comparison with less trivial input data:
Sea = rand(2e4,14)/10;
tic
P1 = version1(Sea,1);
time_V1 = toc
time_V1 = 9.2412
tic
P2 = version2(Sea,1);
time_V2 = toc
time_V2 = 13.1732
tic
P3 = version3(Sea,1);
time_V3 = toc
time_V3 = 6.8440
isequal(P1,P2)
ans = logical
1
isequal(P2,P3)
ans = logical
1
size(P1)
ans = 1×2
18604 14
size(P2)
ans = 1×2
18604 14
size(P3)
ans = 1×2
18604 14
Zaid B
Zaid B il 6 Dic 2021
can we parallel it if possible???
that might help speed up the calculations !
Matt J
Matt J il 6 Dic 2021
Modificato: Matt J il 6 Dic 2021
If you have the Parallel Computing Toolbox, then making Sea a gpuArray should add some speed.
Zaid B
Zaid B il 6 Dic 2021
yeah i do , i ll try it and post the results later

Accedi per commentare.

Categorie

Scopri di più su Graphics Performance in Centro assistenza e File Exchange

Richiesto:

il 30 Nov 2021

Modificato:

il 6 Dic 2021

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by