canoncorr

Canonical correlation

Syntax

[A,B] = canoncorr(X,Y)
[A,B,r] = canoncorr(X,Y)
[A,B,r,U,V] = canoncorr(X,Y)
[A,B,r,U,V,stats] = canoncorr(X,Y)

Description

[A,B] = canoncorr(X,Y) computes the sample canonical coefficients for the n-by-d1 and n-by-d2 data matrices X and Y. X and Y must have the same number of observations (rows) but can have different numbers of variables (columns). A and B are d1-by-d and d2-by-d matrices, where d = min(rank(X),rank(Y)). The jth columns of A and B contain the canonical coefficients, i.e., the linear combination of variables making up the jth canonical variable for X and Y, respectively. Columns of A and B are scaled to make the covariance matrices of the canonical variables the identity matrix (see U and V below). If X or Y is less than full rank, canoncorr gives a warning and returns zeros in the rows of A or B corresponding to dependent columns of X or Y.
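As a rough illustration of the sizes and scaling described above, the following sketch runs canoncorr on synthetic data. The data, the seed, and the variable names sizeA, sizeB, and covU are assumptions made for this example only; the check on covU relies on the statement that the canonical variables have identity covariance.

```matlab
rng(0);                               % fix the seed so the run is repeatable
n = 100;                              % number of observations (rows)
X = randn(n,3);                       % n-by-d1 data matrix, d1 = 3
Y = randn(n,2) + X(:,1:2);            % n-by-d2 data matrix, d2 = 2, correlated with X

[A,B] = canoncorr(X,Y);

d = min(rank(X),rank(Y));             % number of canonical variable pairs
sizeA = size(A);                      % should be [3 d]
sizeB = size(B);                      % should be [2 d]

% Form the canonical variables for X by centering and applying A,
% then check that their sample covariance is close to the identity.
Xc = X - repmat(mean(X),n,1);
covU = cov(Xc*A);
covErr = norm(covU - eye(d));         % should be near zero (rounding only)
```

With full-rank inputs such as these, d equals min(d1,d2), so A is 3-by-2 and B is 2-by-2.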

[A,B,r] = canoncorr(X,Y) also returns a 1-by-d vector containing the sample canonical correlations. The jth element of r is the correlation between the jth columns of U and V (see below).
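The relationship between r and the canonical variables can be checked numerically. The sketch below uses synthetic data (an assumption for illustration) and the five-output syntax to obtain U and V, then recomputes each correlation with corrcoef; rCheck is a name introduced here, not an output of canoncorr.

```matlab
rng(3);                               % repeatable synthetic data
n = 80;
X = randn(n,3);
Y = randn(n,2) + X(:,1:2);

[~,~,r,U,V] = canoncorr(X,Y);

% Recompute the correlation between matching columns of U and V.
rCheck = zeros(size(r));
for j = 1:numel(r)
    c = corrcoef(U(:,j),V(:,j));      % 2-by-2 correlation matrix
    rCheck(j) = c(1,2);               % off-diagonal entry is corr(U_j,V_j)
end

maxDiff = max(abs(r - rCheck));       % should be near zero (rounding only)
```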

[A,B,r,U,V] = canoncorr(X,Y) also returns the canonical variables, also known as scores. U and V are n-by-d matrices computed as

U = (X-repmat(mean(X),n,1))*A
V = (Y-repmat(mean(Y),n,1))*B
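A minimal sketch of this computation follows, assuming synthetic full-rank data. It centers X and Y by hand, applies A and B, and compares the result with the U and V that canoncorr returns. The names Umanual, Vmanual, errU, and errV are introduced for this example only.

```matlab
rng(2);                               % repeatable synthetic data
n = 50;                               % number of observations
X = randn(n,4);
Y = randn(n,3) + X(:,1:3);

[A,B,r,U,V] = canoncorr(X,Y);

% Center each data matrix by its column means, then apply the coefficients.
Umanual = (X - repmat(mean(X),n,1))*A;
Vmanual = (Y - repmat(mean(Y),n,1))*B;

% Largest absolute difference from the returned scores.
errU = max(abs(U(:) - Umanual(:)));
errV = max(abs(V(:) - Vmanual(:)));
```

Both differences should be at the level of floating-point rounding.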

[A,B,r,U,V,stats] = canoncorr(X,Y) also returns a structure stats containing information relating to the sequence of d null hypotheses ${H}_{0}^{\left(k\right)}$, that the (k+1)st through dth correlations are all zero, for k = 0:(d-1). stats contains seven fields, each a 1-by-d vector with elements corresponding to the values of k, as described in the following table:

| Field | Description |
| --- | --- |
| Wilks | Wilks' lambda (likelihood ratio) statistic |
| df1 | Degrees of freedom for the chi-squared statistic, and the numerator degrees of freedom for the F statistic |
| df2 | Denominator degrees of freedom for the F statistic |
| F | Rao's approximate F statistic for ${H}_{0}^{\left(k\right)}$ |
| pF | Right-tail significance level for F |
| chisq | Bartlett's approximate chi-squared statistic for ${H}_{0}^{\left(k\right)}$ with Lawley's modification |
| pChisq | Right-tail significance level for chisq |

stats has two other fields (dfe and p) which are equal to df1 and pChisq, respectively, and exist for historical reasons.
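One way to read the stats output is to walk the sequence of hypotheses from k = 0 upward and count how many are rejected before the first non-rejection. The sketch below does this on synthetic data; the data, the 0.05 threshold alpha, and the names rejectChisq, rejectF, and numSignificant are assumptions for illustration, not part of canoncorr.

```matlab
rng(1);                               % repeatable synthetic data
n = 200;
X = randn(n,3);
% Y shares one direction with X; its second column is independent noise.
Y = [X(:,1) + 0.5*randn(n,1), randn(n,1)];

[~,~,r,~,~,stats] = canoncorr(X,Y);

alpha = 0.05;                         % significance threshold (an assumption)

% Element k+1 of each field tests that correlations k+1 through d are zero.
rejectChisq = stats.pChisq < alpha;   % decisions from Bartlett's chi-squared
rejectF     = stats.pF < alpha;       % decisions from Rao's F

% Count rejections up to the first hypothesis that is not rejected.
numSignificant = sum(cumprod(double(rejectChisq)));

% The legacy fields duplicate df1 and pChisq.
sameDf = isequal(stats.dfe,stats.df1);
sameP  = isequal(stats.p,stats.pChisq);
```

Because the chi-squared and F statistics are both approximations, rejectChisq and rejectF can disagree near the threshold.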

Examples

Load the carbig sample data and flag observations that contain missing values.

load carbig;
X = [Displacement Horsepower Weight Acceleration MPG];
nans = sum(isnan(X),2) > 0;

Compute the sample canonical correlation.

[A,B,r,U,V] = canoncorr(X(~nans,1:3),X(~nans,4:5));

Plot the scores of the first pair of canonical variables.

plot(U(:,1),V(:,1),'.')
xlabel('0.0025*Disp+0.020*HP-0.000025*Wgt')
ylabel('-0.17*Accel-0.092*MPG')

References

[1] Krzanowski, W. J. Principles of Multivariate Analysis: A User's Perspective. New York: Oxford University Press, 1988.

[2] Seber, G. A. F. Multivariate Observations. Hoboken, NJ: John Wiley & Sons, Inc., 1984.