How to calculate the Hamming distance between vectors in a matrix

I tried this code:
a = [1 0 1 0 1;
0 1 1 1 0;
1 1 0 0 1];
D = pdist(a,'minkowski',1)
The answer came out as 4 2 4, while it should be 0 4 4. How can I fix this, and how can I make it run for a bigger matrix (e.g. 50x30)?

1 Comment

First, change 'minkowski' to 'hamming'; then you may get the correct answer.


Accepted Answer

the cyclist on 19 Oct 2014
Edited: the cyclist on 19 Oct 2014
Well, this doesn't give your expected output, but
D = pdist(a,'hamming')
gives the Hamming distance between each pair of rows.
I'm not sure why you used the input argument "minkowski".
You can see details in the documentation.

12 Comments

It didn't work; it gives 0.8000 0.4000 0.8000 instead of 0 4 4.
OK. A couple of comments.
First, I see that for Minkowski distance, the third input argument is used, so I deleted that part of my answer.
Second, note that according to the documentation, the output of the 'hamming' option is the fraction of coordinates that differ, not the count. So, multiplying by 5 (because there are 5 elements in each vector), we get
5*D = [4 2 4]
According to the Wikipedia page on Hamming distance, this is exactly what I would expect. It is the number of positions at which the vectors differ.
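The rescaling can be checked by hand for a single pair of rows. The following is only a minimal sketch in base MATLAB (no pdist call); the variable names mismatch, fraction, count and rescaled are chosen here for illustration and are not from the original post.

```matlab
a = [1 0 1 0 1;
     0 1 1 1 0;
     1 1 0 0 1];

v1 = a(1,:);
v2 = a(2,:);

mismatch = (v1 ~= v2);       % logical mask of the positions where v1 and v2 differ
fraction = mean(mismatch);   % share of positions that differ (the 0.8000 seen above)
count    = nnz(mismatch);    % number of positions that differ

% Multiplying the fraction by the vector length recovers the count.
rescaled = fraction * size(a, 2);
```

Using size(a, 2) rather than a hard-coded 5 keeps the same rescaling valid for a wider matrix, such as the 50x30 case mentioned in the question.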
Why do you expect the answer to be [0 4 4]?
jim on 19 Oct 2014
Edited: jim on 19 Oct 2014
Because the first value is zero (vector one compared with itself), the second value is v2 compared with v1, and the third is v3 compared with v2.
the cyclist on 19 Oct 2014
Edited: the cyclist on 19 Oct 2014
I would say that you simply did not read the documentation carefully enough.
MATLAB is not reporting the following (which is what you expected):
  • D(v1,v1)
  • D(v1,v2)
  • D(v2,v3)
It is reporting every pairwise distance (except for self-distance).
  • D(v1,v2)
  • D(v1,v3)
  • D(v2,v3)
It is straightforward to pluck the distances you need from the output.
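One way to see which entry belongs to which pair is to build the pair list explicitly. The sketch below uses only base MATLAB and assumes the pair ordering listed above, (1,2), (1,3), (2,3), which is the order nchoosek produces. The names pairs, isConsecutive and D_consecutive are illustrative, not part of any library.

```matlab
a = [1 0 1 0 1;
     0 1 1 1 0;
     1 1 0 0 1];

m = size(a, 1);
pairs = nchoosek(1:m, 2);    % one row per pair: [1 2; 1 3; 2 3]

% Count differing positions for every pair of rows, as a row vector.
D = sum(a(pairs(:,1),:) ~= a(pairs(:,2),:), 2).';

% Keep only the pairs whose row indices are adjacent: (1,2), (2,3), ...
isConsecutive = ((pairs(:,2) - pairs(:,1)) == 1).';
D_consecutive = D(isConsecutive);
```

For this 3-row example, D holds the three pairwise counts and D_consecutive holds the two that compare neighbouring rows.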
So how can I change the code so that it returns D(v1,v1), D(v1,v2), D(v2,v3)?
I don't understand the pattern in the distances you want, out of the possible choices, so I don't understand how to generalize it to the larger matrix.
Maybe you could try it yourself, and post your code here if you get stuck?
a= [1 0 1 0 1;0 1 1 1 0;1 1 0 0 1];
%C = nchoosek(1:4,2);
% D = sum(a(C(:,2),:)~=a(C(:,1),:),2)';
D = pdist(a,'minkowski',1)
%D = dec2bin (pdist2(a,b,'hamming'))
%D = pdist(a,'minkowski',1)
%D = pdist(a)
%h_d = sum(abs(a-a))
%D = dec2bin(pdist(a,'hamming'))
% hamming_dist = @(a,b)sum(a(:)~=b(:));
% hamming_dist(eye(3), zeros(3))
I tried it, but it didn't work.
I am just trying to understand which distance pairs you want, in general. For example, if a is 4 rows (4 vectors) instead of 3, which pairs do you want to compare?
pdist will give you
  • D(v1,v2)
  • D(v1,v3)
  • D(v1,v4)
  • D(v2,v3)
  • D(v2,v4)
  • D(v3,v4)
What do you actually want as the output from a 4-row matrix?
D(V1-V1), D(V2-V1), D(V3-V2), D(V4-V3), and so on for the rest of the matrix rows: each row compared with its previous row, and row one should give zero.
I think the following code does what you want:
a = [1 0 1 0 1;
0 1 1 1 0;
1 1 0 0 1];
D = pdist(a,'minkowski',1);
D_matrix = squareform(D);
consecutive_rows_D = [0; diag(D_matrix,1)];
Note that I use the squareform function (as mentioned in the documentation for pdist) to create a matrix form of the distances, and then the diag function to pull the values of that matrix at positions
  • (1,2)
  • (2,3)
  • (3,4)
  • and so on if there were more rows
Finally, I tag on the initial zero by hand.
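For 0/1 data, the Minkowski distance with exponent 1 equals the count of differing positions, which is why the pdist call above returns Hamming counts. If only consecutive rows are needed, a possible alternative (not the approach used above, just a sketch) is to skip pdist and compare each row with the one before it using diff. The matrix b and the name consecutive_rows_big are made up here to illustrate the 50x30 case from the question.

```matlab
a = [1 0 1 0 1;
     0 1 1 1 0;
     1 1 0 0 1];

% diff(a, 1, 1) subtracts each row from the next one; a nonzero entry marks
% a position where the two rows differ. Row one gets a leading zero by hand.
consecutive_rows_D = [0; sum(diff(a, 1, 1) ~= 0, 2)];

% Same expression on a larger matrix (random 0/1 data, purely illustrative).
b = randi([0 1], 50, 30);
consecutive_rows_big = [0; sum(diff(b, 1, 1) ~= 0, 2)];
```

This computes only the m-1 neighbouring comparisons instead of all m*(m-1)/2 pairs, and it does not depend on pdist or squareform.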
jim on 21 Oct 2014
Edited: jim on 21 Oct 2014
It works now... thanks a lot.
The best form of thanks is to accept the answer, which indicates to others (who may have a similar problem) that this resolved the question you posed.


More Answers (0)
