How are the following methods to compute correlation different?

Question

Vitor Carvalho on 17 Oct 2022


https://it.mathworks.com/matlabcentral/answers/1828568-how-are-the-following-methods-to-compute-correlation-different

Edited: dpb on 17 Oct 2022

Hello everyone,

Until recently I was computing the correlation between two matrices in a quite inefficient way. This was my initial approach:

corr_mat = zeros(m,m);
% Consider T as a pre-populated matrix of dimensions (n x 2*m)
for i = 1:m
    for j = 1:m
        corr_mat(i,j) = corr(T(:,i), T(:,j+m));
    end
end

However, from my understanding of the description of the 'corr' function, the loop above is equivalent to:

corr_mat = corr(T(:,1:m), T(:,(m+1):2*m));
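For reference, here is a minimal sketch of what the two-input form computes with its default (Pearson) setting: entry (i,j) is the correlation between column i of the first input and column j of the second. The names X, Y, Xn, Yn, R_manual and R_corr are hypothetical stand-ins introduced for this illustration, and the data is random. The sketch assumes MATLAB R2016b or later (for implicit expansion) and the Statistics and Machine Learning Toolbox (for corr).

```matlab
% Illustrative data; X and Y stand in for T(:,1:m) and T(:,(m+1):2*m).
rng(0);                              % fixed seed so the run is repeatable
n = 50;
m = 4;
X = randn(n, m);
Y = randn(n, m);

% Center each column, then scale it to unit Euclidean norm.
Xc = X - mean(X, 1);
Yc = Y - mean(Y, 1);
Xn = Xc ./ sqrt(sum(Xc.^2, 1));
Yn = Yc ./ sqrt(sum(Yc.^2, 1));

% Entry (i,j) is the Pearson correlation of X(:,i) with Y(:,j).
R_manual = Xn' * Yn;

% Compare against the built-in with a difference measure, not with ==.
R_corr = corr(X, Y);
max_abs_diff = max(abs(R_manual(:) - R_corr(:)));
fprintf('largest absolute difference: %g\n', max_abs_diff);
```

The printed difference should be on the order of floating-point rounding rather than exactly zero, since the two routes perform the arithmetic in a different order.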

I have tried both approaches and compared their results and it turns out they were different. However, upon generating matrices with random numbers and trying the same approach as above, I actually obtained the same results. Here is the test I made:

mat1 = randn(50,10);
mat2 = randn(50,10);
corr1 = zeros(10,10);
for i = 1:10
    for j = 1:10
        corr1(i, j) = corr(mat1(:,i), mat2(:,j));
    end
end
corr2 = corr(mat1, mat2); % Generates the same correlation matrix as corr1

This left me extremely confused because, to me, the above test is equivalent to the first two scripts. Would someone be able to explain how the first two scripts differ from the above test and also (most importantly) why the first two scripts generate different correlation matrices?

Thank you in advance!


Answer 1

dpb on 17 Oct 2022


https://it.mathworks.com/matlabcentral/answers/1828568-how-are-the-following-methods-to-compute-correlation-different#answer_1077618

Edited: dpb on 17 Oct 2022


But if you create T from your two mat arrays as T=[mat1 mat2], then the results are all the same. If you got something different, it would be because the inputs were different, and you didn't give any data to illustrate the first claim of getting a different result.
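Besides differing inputs, one slip worth ruling out in a double loop of this kind is assigning to the whole variable instead of to element (i,j). Whether that applies to the original data is not known; the sketch below only illustrates the effect on random data. The names overwritten and indexed are hypothetical, and corr is assumed to come from the Statistics and Machine Learning Toolbox.

```matlab
rng(2);                              % fixed seed so the run is repeatable
m = 3;
T = randn(20, 2*m);                  % illustrative n-by-2m input

% Slip: no (i,j) index, so each pass replaces the whole variable
% with a single scalar coefficient.
overwritten = zeros(m, m);
for i = 1:m
    for j = 1:m
        overwritten = corr(T(:, i), T(:, j + m));
    end
end

% Intended: store each coefficient in its own cell of the matrix.
indexed = zeros(m, m);
for i = 1:m
    for j = 1:m
        indexed(i, j) = corr(T(:, i), T(:, j + m));
    end
end

disp(size(overwritten));             % 1-by-1: only the last pair survives
disp(size(indexed));                 % m-by-m
```

In the first loop the preallocated matrix is discarded on the first assignment, so what remains at the end is only the coefficient for the last pair of columns.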

I made the two arrays smaller for convenience, but conclusions still hold...

m1=randn(50,5); m2=randn(50,5);
T=[m1 m2];
m=size(m1,2);
all(corr(m1,m2)==corr(T(:,1:m),T(:,m+1:2*m)),'all')
ans = logical
   1
% now the double loop result
for i=1:m,for j=1:m, c(i,j)=corr(T(:,i),T(:,j+m));end,end
all(c==corr(m1,m2),'all')
ans = logical
   0
% OOOOh...the identical test fails, but ---
max(diff(c-corr(m1,m2)),[],'all')
ans = 1.4225e-16

This illustrates that the discrepancy is just rounding error at double-precision magnitude, arising from the different routes used to compute the numbers.
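Given that, a comparison with a tolerance is more informative than an exact == test. The following is a minimal sketch, assuming the same kind of random inputs as above; the error measure (largest absolute element-wise difference) and the tolerance value tol are choices made for this illustration, not part of corr itself.

```matlab
rng(1);                              % fixed seed so the run is repeatable
m1 = randn(50, 5);
m2 = randn(50, 5);
T = [m1 m2];
m = size(m1, 2);

% Double-loop route, one column pair at a time.
c = zeros(m, m);
for i = 1:m
    for j = 1:m
        c(i, j) = corr(T(:, i), T(:, j + m));
    end
end

% Vectorized route.
R = corr(m1, m2);

% Largest absolute element-wise discrepancy between the two routes.
err = max(abs(c(:) - R(:)));

% Hypothetical tolerance; correlations lie in [-1, 1], so an absolute
% tolerance a few orders of magnitude above eps is a reasonable choice.
tol = 1e-12;
same_within_tol = err <= tol;
fprintf('max abs difference = %g, within tolerance: %d\n', err, same_within_tol);
```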

Moral -- just use the vectorized corr function...
