The hat matrix provides a measure of leverage. It is useful
for investigating whether one or more observations are outlying with
regard to their *X* values, and therefore might be
excessively influencing the regression results.

The hat matrix is also known as the *projection matrix* because
it projects the vector of observations, *y*, onto the vector of predictions, $$\widehat{y}$$, thus putting the "hat" on
*y*. The hat matrix *H* is defined in terms of the
data matrix *X*:

*H* = *X*(*X*^{T}*X*)^{–1}*X*^{T}

and determines the fitted or predicted values since

$$\widehat{y}=Hy=Xb.$$
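As a minimal sketch of this identity, the following base-MATLAB snippet builds the hat matrix for a small design matrix and checks that `H*y` agrees with `X*b`, where `b` is the least-squares solution. The data values and variable names are made up for illustration.

```matlab
% Hypothetical data: 6 observations, an intercept column, and one predictor.
x = [1; 2; 3; 4; 5; 10];
y = [1.1; 1.9; 3.2; 3.8; 5.1; 9.7];
X = [ones(size(x)) x];

% Hat matrix H = X*(X'*X)^(-1)*X', formed with a linear solve instead of inv.
H = X*((X'*X)\X');

% Least-squares coefficients and the two routes to the fitted values.
b = X\y;
yhatFromH = H*y;
yhatFromB = X*b;

% Largest discrepancy between the two routes; expected to be at rounding level.
maxDiff = max(abs(yhatFromH - yhatFromB));
```

Because *H* is a projection, applying it twice changes nothing, so `H*H` should also match `H` up to rounding.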

The diagonal elements of *H*, *h*_{ii},
are called leverages and satisfy

$$\begin{array}{l}0\le {h}_{ii}\le 1\\ {\displaystyle \sum _{i=1}^{n}{h}_{ii}}=p,\end{array}$$

where *p* is the number of coefficients, and *n* is
the number of observations (rows of *X*) in the regression
model. `HatMatrix` is an *n*-by-*n* matrix in the `Diagnostics` table.
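The two properties of the leverages can be checked numerically. This sketch uses only base MATLAB and a small made-up design matrix (an assumption for illustration), with a small tolerance to absorb rounding error.

```matlab
% Hypothetical design matrix: n = 6 observations, p = 3 coefficients
% (intercept plus two predictors).
X = [1  1 2;
     1  2 1;
     1  3 5;
     1  4 3;
     1  5 8;
     1 10 4];
[n, p] = size(X);

H = X*((X'*X)\X');   % n-by-n hat matrix
h = diag(H);         % leverages h_ii

tol = 1e-12;                                   % allowance for rounding error
allInRange = all(h >= -tol & h <= 1 + tol);    % checks 0 <= h_ii <= 1
sumOfLeverages = sum(h);                       % should equal p up to rounding
```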

After obtaining a fitted model, say, `mdl`, using `fitlm` or `stepwiselm`, you can:

Display the `HatMatrix` by indexing into the property using dot notation: `mdl.Diagnostics.HatMatrix`

When *n* is large, `HatMatrix` might be computationally expensive. In those cases, you can obtain the diagonal values directly, using `mdl.Diagnostics.Leverage`.
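One way to see why the diagonal is cheaper than the full matrix: if *X* = *QR* is an economy-size QR factorization of a full-rank *X*, then *H* = *QQ*^{T}, so each leverage is the squared norm of one row of *Q* and the *n*-by-*n* matrix never has to be formed. The sketch below illustrates this in base MATLAB on made-up data. It is not a description of how `mdl.Diagnostics.Leverage` is computed internally.

```matlab
% Hypothetical design matrix with many rows and few columns.
rng(0);                          % fixed seed so the run is repeatable
n = 2000;
X = [ones(n,1) randn(n,3)];

% Economy-size QR: Q is n-by-4, so storage grows with n*p rather than n^2.
[Q, ~] = qr(X, 0);
hFromQR = sum(Q.^2, 2);          % h_ii = squared norm of row i of Q

% Reference: diagonal of the explicitly formed hat matrix (comparison only).
hFull = diag(X*((X'*X)\X'));
maxDiff = max(abs(hFromQR - hFull));
```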

Leverage is a measure of the effect of a particular observation
on the regression predictions due to the position of that observation
in the space of the inputs. In general, the farther a point is from
the center of the input space, the more leverage it has. Because the
sum of the leverage values is *p*, the mean leverage value is *p*/*n*.
An observation *i* can be considered an outlier if its leverage substantially
exceeds this mean, for example, if it is larger than 2*p*/*n*.
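A minimal sketch of this rule of thumb in base MATLAB, using made-up data in which the last observation is deliberately placed far from the others (the data and variable names are illustrative assumptions):

```matlab
% Hypothetical predictor values; the last one lies far from the rest.
x = [1; 2; 3; 4; 5; 6; 7; 8; 9; 30];
X = [ones(size(x)) x];           % intercept plus one predictor
[n, p] = size(X);

h = diag(X*((X'*X)\X'));         % leverages
threshold = 2*p/n;               % twice the mean leverage p/n
flagged = find(h > threshold);   % indices of high-leverage observations
```

Here *n* = 10 and *p* = 2, so the threshold is 0.4, and only observation 10 exceeds it.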

The leverage of observation *i* is the value
of the *i*th diagonal term, *h*_{ii},
of the hat matrix, *H*, where

*H* = *X*(*X*^{T}*X*)^{–1}*X*^{T}.

The diagonal terms satisfy

$$\begin{array}{l}0\le {h}_{ii}\le 1\\ {\displaystyle \sum _{i=1}^{n}{h}_{ii}}=p,\end{array}$$

where *p* is the
number of coefficients in the regression model, and *n* is
the number of observations. The minimum value of *h*_{ii} is
1/*n* for a model with a constant term. If the fitted
model goes through the origin, then the minimum leverage value is
0 for an observation at *x* = 0.
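Both minimum values can be illustrated with a single predictor. With a constant term, the leverage is 1/*n* plus a nonnegative term that vanishes when *x*_{i} equals the mean of *x*. Through the origin, the leverage is *x*_{i}^{2} divided by the sum of squares of *x*, which is 0 at *x*_{i} = 0. The sketch below uses made-up data chosen so that *x* = 0 is also the mean.

```matlab
% Hypothetical predictor values; x = 0 is included and equals the mean.
x = [-2; -1; 0; 1; 2];
n = numel(x);

% Model with a constant term: X = [1 x].
Xc = [ones(n,1) x];
hConst = diag(Xc*((Xc'*Xc)\Xc'));
minConst = min(hConst);          % attained at x = mean(x), equals 1/n

% Model through the origin: X = x.
hOrigin = diag(x*((x'*x)\x'));
minOrigin = min(hOrigin);        % attained at x = 0, equals 0
```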

It is possible to express the fitted values, $$\widehat{y}$$, in terms of the observed values, *y*,
since

$$\widehat{y}=Hy=Xb.$$
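Writing out row *i* of this product makes the role of the diagonal explicit. Here *h*_{ij} denotes the (*i*, *j*) element of *H*, a notation introduced only for this step:

$${\widehat{y}}_{i}={\displaystyle \sum _{j=1}^{n}{h}_{ij}{y}_{j}}={h}_{ii}{y}_{i}+{\displaystyle \sum _{j\ne i}{h}_{ij}{y}_{j}},\qquad \frac{\partial {\widehat{y}}_{i}}{\partial {y}_{i}}={h}_{ii}.$$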

Hence, *h*_{ii} expresses how much the observation *y*_{i} affects $${\widehat{y}}_{i}$$. A large value of *h*_{ii} indicates that observation *i* lies far from the center of the *X* values and therefore has more leverage.

`Leverage` is an *n*-by-1 column vector in the `Diagnostics` table.

After obtaining a fitted model, say, `mdl`, using `fitlm` or `stepwiselm`, you can:

Display the `Leverage` vector by indexing into the property using dot notation: `mdl.Diagnostics.Leverage`

Plot the leverage for the values fitted by your model using `plotDiagnostics(mdl)`. See the `plotDiagnostics` method of the `LinearModel` class for details.

This example shows how to compute `Leverage` values and assess high leverage observations. Load the sample data and define the response and independent variables.

```
load hospital
y = hospital.BloodPressure(:,1);
X = double(hospital(:,2:5));
```

Fit a linear regression model.

mdl = fitlm(X,y);

Plot the leverage values.

plotDiagnostics(mdl)

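For this example, the model has *p* = 5 coefficients (an intercept plus four predictors) and *n* = 100 observations, so the recommended threshold value is 2*p*/*n* = 2*5/100 = 0.1. There is no indication of high leverage observations.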
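The same check can be done programmatically rather than by reading the plot. The following sketch assumes that Statistics and Machine Learning Toolbox and the `hospital` sample data are available. It reads the coefficient and observation counts from the fitted model; the variable names `lev`, `threshold`, and `highLeverage` are illustrative.

```matlab
% Refit the model from the example so that the block runs on its own.
load hospital
y = hospital.BloodPressure(:,1);
X = double(hospital(:,2:5));
mdl = fitlm(X,y);

lev = mdl.Diagnostics.Leverage;          % n-by-1 vector of leverages
p = mdl.NumCoefficients;                 % number of coefficients, including the intercept
n = mdl.NumObservations;                 % number of observations used in the fit
threshold = 2*p/n;                       % rule-of-thumb cutoff

highLeverage = find(lev > threshold);    % empty if no observation exceeds the cutoff
```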

See also: `LinearModel` | `fitlm` | `plotDiagnostics` | `stepwiselm`