PCA output: coefficients vs loadings

Question

Mathew Guilfoyle on 11 Feb 2013

Direct link to this question:

https://it.mathworks.com/matlabcentral/answers/63022-pca-output-coefficients-vs-loadings

Edited: Seung Yi Lee on 30 Aug 2021

I would be grateful for some explanation on the output of principal components analysis (pca) from the Statistics Toolbox.

I have a dataset with 150 variables and ~50000 observations.

When I submit this to PCA there is one dominant PC/latent variable that accounts for >95% of the variance.

However, the first column of the output coefficient matrix has very low values for the loadings of all the original variables (~0.06). My understanding is that the sum of squared loadings (i.e. the sum of squares of each column of the coefficient matrix) should equal the eigenvalue corresponding to each PC. However, sum(coeff.^2) returns 1 for every column. This leads me to suspect that the loadings in each column are being rescaled.

If I put the same data into SPSS I get the same eigenvalues/% explained but the component loadings on PC1 are now between 0.7 and 0.95.

Could anyone explain why and how these outputs differ?

Thanks


Answer 1

Juyeong Choi on 21 Dec 2014

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/63022-pca-output-coefficients-vs-loadings#answer_162986

So how do we calculate the PC1 loadings as reported by SPSS? Does anyone have an idea?

1 Comment

Yuchun Zhou on 8 Jul 2019

Hi, did you eventually work out how to do the conversion?

Answer 2

Xiaosha Wang on 31 Jul 2015

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/63022-pca-output-coefficients-vs-loadings#answer_187809

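The output of MATLAB is the coefficient matrix, whereas the output of SPSS is the loadings, defined as the correlation between a given principal component and the original variable. The two outputs (coefficients and loadings) are proportional column by column: for standardized variables, each column of loadings is the corresponding coefficient column multiplied by the square root of that component's eigenvalue.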
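To make the relationship concrete, here is a minimal sketch that computes loadings both ways, as correlations between variables and component scores and as coefficients scaled by the square root of the eigenvalues. The data are synthetic and the variable names (Z, V, lambda, loadings, loadingsScaled) are illustrative, not from the original post. It uses only base MATLAB functions and assumes the variables are standardized first, i.e. the PCA is run on the correlation matrix.

```matlab
% Synthetic data (hypothetical): 500 observations of 4 correlated variables.
rng(0);
X = randn(500, 4) * [2 1 0 0; 0 1 1 0; 0 0 1 0.5; 0 0 0 0.3];

% Standardize each column to zero mean and unit variance (z-scores).
Z = (X - mean(X, 1)) ./ std(X, 0, 1);

% PCA of standardized data = eigendecomposition of the correlation matrix.
[V, D] = eig(cov(Z));
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);                 % unit-norm coefficient columns

% Loadings as correlations between each variable and each component score.
scores = Z * V;
loadings = zeros(size(V));
for i = 1:size(Z, 2)
    for j = 1:size(V, 2)
        r = corrcoef(Z(:, i), scores(:, j));
        loadings(i, j) = r(1, 2);
    end
end

% The same quantity from the coefficients: scale column j by sqrt(lambda(j)).
loadingsScaled = V .* sqrt(lambda)';

% Largest discrepancy between the two definitions (should be near zero).
maxDiff = max(abs(loadings(:) - loadingsScaled(:)));
```

If the variables are not standardized, the correlation-based loadings also involve each variable's standard deviation, so the simple column scaling no longer reproduces them exactly.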


Answer 3

the cyclist on 11 Feb 2013

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/63022-pca-output-coefficients-vs-loadings#answer_74615

Edited: the cyclist on 13 Feb 2013

Disclaimer: I am not an expert on PCA. [EDIT: Proof of this is that I was wrong that MATLAB scales. See Ilya's answer, and my comment to my own answer, below.]

I believe that this difference is due to the fact that MATLAB first "centers and scales" the original data into z-scores. I am guessing that differences in the loadings are going to be related to that transformation. (Maybe a scaling factor of the standard deviation of each variable?)

The Wikipedia page ( http://en.wikipedia.org/wiki/Principal_component_analysis ) is a good resource. The second paragraph has a brief discussion of the scaling.

1 Comment

the cyclist on 13 Feb 2013

Mathew, did you ever resolve this? As Ilya pointed out, I was mistaken in thinking that MATLAB also scales the data to z-scores. It may be that SPSS does scale. I could not find definitive documentation online about this. I did see that SAS seems to do the scaling automatically. (It's often a good idea to scale, especially if your variables have very different magnitudes.)
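As a rough illustration of why scaling matters, the sketch below compares the variance explained by the first component with and without z-scoring. The data are synthetic (three independent variables, one on a much larger scale) and the names are hypothetical. It uses only base MATLAB functions rather than pca.

```matlab
% Synthetic data (hypothetical): the first variable has a much larger scale.
rng(3);
n = 1000;
X = [1000 * randn(n, 1), randn(n, 1), randn(n, 1)];

% Centered but unscaled data: eigenvalues of the covariance matrix.
latentRaw = sort(eig(cov(X)), 'descend');
explainedRaw = 100 * latentRaw / sum(latentRaw);

% Z-scored data: eigenvalues of the correlation matrix.
Z = (X - mean(X, 1)) ./ std(X, 0, 1);
latentStd = sort(eig(cov(Z)), 'descend');
explainedStd = 100 * latentStd / sum(latentStd);

% Without scaling, PC1 is dominated by the large-scale variable;
% after scaling, the variance is spread far more evenly.
fprintf('PC1 explained: raw %.2f%%, standardized %.2f%%\n', ...
        explainedRaw(1), explainedStd(1));
```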


Answer 4

Ilya on 11 Feb 2013

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/63022-pca-output-coefficients-vs-loadings#answer_74643

The princomp and pca functions center the data but do not scale it. (In addition, pca can be told not to center, via its 'Centered' option.)

The easiest way to understand PCA is using eigenvalue decomposition of the covariance matrix Sigma:

Sigma = V*Lambda*V'

Lambda is the diagonal matrix of eigenvalues. V is an orthonormal matrix of coefficients. Orthonormality implies that the 2-norm of every column is 1.

This is what the MATLAB implementation does. I am not familiar with the SPSS implementation.
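The decomposition above can be checked numerically with a short sketch. The data are synthetic, the variable names are illustrative, and only base MATLAB functions are used, so this is a stand-in for what pca returns rather than its actual implementation.

```matlab
% Synthetic data (hypothetical): 200 observations of 3 correlated variables.
rng(1);
X = randn(200, 3) * [3 0 0; 1 1 0; 0.5 0.2 0.1];

% Center only, no scaling.
Xc = X - mean(X, 1);
Sigma = cov(Xc);

% Eigendecomposition Sigma = V*Lambda*V', sorted by decreasing eigenvalue.
[V, Lambda] = eig(Sigma);
[latent, order] = sort(diag(Lambda), 'descend');
V = V(:, order);
Lambda = diag(latent);

% Orthonormal columns: the sum of squares of every column is 1,
% which is why sum(coeff.^2) returns 1 for every column.
colNormsSq = sum(V.^2, 1);

% Reconstruction error of the covariance matrix (should be near zero).
reconErr = norm(Sigma - V * Lambda * V', 'fro');

% Percentage of variance explained comes from the eigenvalues, not from V.
explained = 100 * latent / sum(latent);
```

The point of the sketch is that the size of the entries in V says nothing about how much variance a component explains. With 150 variables contributing equally, unit-norm entries are about 1/sqrt(150), roughly 0.08, which is in line with the values of about 0.06 reported in the question.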


Answer 5

Seung Yi Lee on 30 Aug 2021

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/63022-pca-output-coefficients-vs-loadings#answer_777369

Edited: Seung Yi Lee on 30 Aug 2021

Many years after the original question was posted, I ran into the same problem and then figured it out.

The coefficients returned by pca are normalized to unit length, so they differ from the loadings by a factor of the square root of the corresponding eigenvalue. Converting them with the expression below worked for me.

unscaled_loading = coeff.*sqrt(latent)'
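For context, a minimal end-to-end sketch of this conversion follows. It assumes the Statistics and Machine Learning Toolbox (for pca and zscore) and uses synthetic data. Standardizing first is an assumption made here so that the result corresponds to a correlation-matrix analysis, which is what the SPSS loadings described in this thread appear to be.

```matlab
% Synthetic data (hypothetical): 1000 observations of 5 correlated variables.
rng(2);
X = randn(1000, 5) * randn(5, 5);

% PCA on z-scored data; latent holds the eigenvalues as a column vector.
[coeff, score, latent] = pca(zscore(X));

% Scale column j of coeff by sqrt(latent(j)) to obtain loadings.
loadings = coeff .* sqrt(latent)';

% Each coefficient column has unit sum of squares ...
unitNorms = sum(coeff.^2, 1);

% ... while the sum of squared loadings per column equals the eigenvalue.
ssq = sum(loadings.^2, 1)';
maxDiff = max(abs(ssq - latent));
```

This matches the expectation stated in the question: the sum of squared loadings in each column equals that component's eigenvalue, while sum(coeff.^2) is always 1.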
