A linear regression model describes the relationship between a *dependent
variable*, *y*, and one or more *independent
variables*, *X*. The dependent variable is also called
the *response variable*. Independent variables are also called
*explanatory* or *predictor variables*.
Continuous predictor variables are also called *covariates*, and
categorical predictor variables are also called *factors*. The matrix
*X* of observations on predictor variables is usually called the
*design matrix*.

A multiple linear regression model is

$${y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{i1}+{\beta}_{2}{X}_{i2}+\cdots +{\beta}_{p}{X}_{ip}+{\epsilon}_{i},\text{\hspace{1em}}i=1,\cdots ,n,$$

where

- *y*_{i} is the *i*th response.
- *β*_{k} is the *k*th coefficient, where *β*_{0} is the constant term in the model. Sometimes, design matrices might include information about the constant term. However, `fitlm` or `stepwiselm` by default includes a constant term in the model, so you must not enter a column of 1s into your design matrix *X*.
- *X*_{ij} is the *i*th observation on the *j*th predictor variable, *j* = 1, ..., *p*.
- *ε*_{i} is the *i*th noise term, that is, random error.

If a model includes only one predictor variable (*p* = 1), then the model is called a simple linear regression model.
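As a minimal sketch of the model equation above, the following Python snippet evaluates the noise-free part of the model, *β*_{0} plus the weighted predictors, for each row of a design matrix that contains no column of 1s. The constant term is passed separately, which mirrors the default behavior described for `fitlm`. The function name `mean_response` and all numbers are illustrative assumptions, not part of any toolbox.

```python
def mean_response(X, beta0, beta):
    """Return beta0 + sum_j beta[j] * X[i][j] for every row i of X.

    X holds only predictor columns (no column of 1s); the constant
    term beta0 is passed separately.
    """
    if any(len(row) != len(beta) for row in X):
        raise ValueError("each row of X needs one value per coefficient")
    return [beta0 + sum(b * x for b, x in zip(beta, row)) for row in X]


# Hypothetical data: n = 3 observations on p = 2 predictors.
X = [[1.0, 2.0],
     [0.0, 4.0],
     [3.0, 1.0]]
means = mean_response(X, beta0=0.5, beta=[2.0, -1.0])
print(means)
```

With a single predictor column (*p* = 1), the same function evaluates a simple linear regression model.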

In general, a linear regression model can be a model of the form

$${y}_{i}={\beta}_{0}+{\displaystyle \sum _{k=1}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+{\epsilon}_{i},\text{\hspace{1em}}i=1,\cdots ,n,$$

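where each *f*_{k}(·) is a scalar-valued function of the
independent variables,
*X*_{ij}. The functions *f*_{k} can take any form, including nonlinear functions such as polynomials or logarithms; the model is linear because the response is a linear function of the coefficients, *β*_{k}.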

Some examples of linear models are:

$$\begin{array}{l}{y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\beta}_{3}{X}_{3i}+{\epsilon}_{i}\\ {y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\beta}_{3}{X}_{1i}^{3}+{\beta}_{4}{X}_{2i}^{2}+{\epsilon}_{i}\\ {y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\beta}_{3}{X}_{1i}{X}_{2i}+{\beta}_{4}\mathrm{log}{X}_{3i}+{\epsilon}_{i}\end{array}$$

The following, however, are not linear models, since the response *y*_{i} is
not a linear function of the unknown coefficients, *β*_{k}.

$$\begin{array}{l}\mathrm{log}{y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\epsilon}_{i}\\ {y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+\frac{1}{{\beta}_{2}{X}_{2i}}+{e}^{{\beta}_{3}{X}_{1i}{X}_{2i}}+{\epsilon}_{i}\end{array}$$
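One way to see the difference is to test a property that every model linear in its coefficients must have: scaling all coefficients by a constant scales the mean response by the same constant. The Python sketch below applies that check to the third linear example and to the second non-linear example. It is only a necessary condition, not a full proof of linearity, and the function names and numeric values are assumptions chosen for illustration.

```python
import math


def linear_model(beta, x1, x2, x3):
    """Third linear example: nonlinear in the predictors, linear in beta."""
    terms = [1.0, x1, x2, x1 * x2, math.log(x3)]
    return sum(b * t for b, t in zip(beta, terms))


def nonlinear_model(beta, x1, x2):
    """Second non-linear example: beta[2] and beta[3] enter nonlinearly."""
    return (beta[0] + beta[1] * x1
            + 1.0 / (beta[2] * x2)
            + math.exp(beta[3] * x1 * x2))


def scales_with_coefficients(model, beta, *x, factor=2.0):
    """Check whether scaling every coefficient scales the mean response."""
    scaled = [factor * b for b in beta]
    return math.isclose(model(scaled, *x), factor * model(beta, *x))


# Hypothetical coefficient and predictor values, chosen only for illustration.
linear_ok = scales_with_coefficients(
    linear_model, [1.0, 2.0, -1.0, 0.5, 3.0], 1.5, 2.0, 4.0)
nonlinear_ok = scales_with_coefficients(
    nonlinear_model, [1.0, 2.0, 0.5, 0.1], 1.5, 2.0)
print(linear_ok, nonlinear_ok)
```

The first call passes the check even though the model contains a product term and a logarithm, because those nonlinearities involve only the predictors.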

The usual assumptions for linear regression models are:

- The noise terms, *ε*_{i}, are uncorrelated.
- The noise terms, *ε*_{i}, have independent and identical normal distributions with mean zero and constant variance, σ^{2}. Thus, writing the constant term as *β*_{0}*f*_{0} with *f*_{0} = 1,

  $$\begin{array}{rl}E\left({y}_{i}\right)&=E\left({\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+{\epsilon}_{i}\right)\\ &={\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+E\left({\epsilon}_{i}\right)\\ &={\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}\end{array}$$

  and

  $$V\left({y}_{i}\right)=V\left({\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+{\epsilon}_{i}\right)=V\left({\epsilon}_{i}\right)={\sigma}^{2}$$

  So the variance of *y*_{i} is the same for all levels of *X*_{ij}.
- The responses *y*_{i} are uncorrelated.
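The consequences of these assumptions for the mean and variance of the response can be illustrated with a small simulation. The Python sketch below draws responses from a hypothetical simple linear model with normal noise at two levels of the predictor, then compares the sample mean and sample variance at each level. The coefficient values, noise level, sample size, and seed are all assumptions made for the example.

```python
import random
import statistics


def simulate_responses(x, n, beta0=2.0, beta1=3.0, sigma=1.5, rng=None):
    """Draw n responses y = beta0 + beta1 * x + eps, eps ~ N(0, sigma^2)."""
    rng = rng or random.Random()
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
summary = {}
for x in (0.0, 10.0):
    y = simulate_responses(x, n=20000, rng=rng)
    # The sample mean should be near beta0 + beta1 * x, and the sample
    # variance near sigma^2 = 2.25 at both levels of x.
    summary[x] = (statistics.mean(y), statistics.variance(y))
    print(x, summary[x])
```

The sample means differ between the two levels because the mean response depends on the predictor, while the sample variances stay close to each other because the noise variance does not.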

The fitted linear function is

$${\widehat{y}}_{i}={\displaystyle \sum _{k=0}^{K}{b}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)},\text{\hspace{1em}}i=1,\cdots ,n,$$

where $${\widehat{y}}_{i}$$ is the estimated response and the *b*_{k}
are the fitted coefficients. The coefficients are estimated so as
to minimize the mean squared difference between the prediction vector $$\widehat{y}$$ and
the true response vector $$y$$,
that is, the mean of the squared entries of $$\widehat{y}-y$$. This method is
called the method of least squares.
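Minimizing the mean squared difference described above has the same solution as minimizing the sum of squared differences, and one standard way to find that solution is to solve the normal equations. The Python sketch below does this with plain Gaussian elimination for a small, hypothetical data set. It is an illustration of the criterion only and is not a description of how `fitlm` computes its estimates. Numerical libraries typically prefer more stable factorizations such as QR. The helper names `solve_linear_system` and `least_squares` are assumptions.

```python
def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def least_squares(X, y):
    """Return [b0, b1, ..., bp] minimizing the sum of squared residuals.

    A column of 1s is prepended here, so X must hold predictors only.
    """
    F = [[1.0] + list(row) for row in X]
    k = len(F[0])
    FtF = [[sum(f[i] * f[j] for f in F) for j in range(k)] for i in range(k)]
    Fty = [sum(f[i] * yi for f, yi in zip(F, y)) for i in range(k)]
    return solve_linear_system(FtF, Fty)


# Hypothetical noise-free data generated from y = 1 + 2*x1 - 3*x2.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in X]
coef = least_squares(X, y)
print([round(v, 6) for v in coef])
```

Because the example data contain no noise, the fitted coefficients should reproduce the generating values up to rounding error. With noisy data they would only approximate them.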

In a linear regression model of the form *y* =
*β*_{1}*X*_{1} + *β*_{2}*X*_{2} + ... + *β*_{p}*X*_{p},
the coefficient *β*_{k}
expresses the impact of a one-unit change in the predictor variable
*X*_{k} on the mean of the response,
E(*y*), provided that all other predictor variables are held constant.
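The one-unit interpretation can be checked numerically. The Python sketch below evaluates the mean response of a model of this form at a set of predictor values, increases one predictor by one unit while keeping the others fixed, and compares the change in the mean response with the corresponding coefficient. The coefficients and predictor values are hypothetical, and `expected_y` is an illustrative helper name.

```python
def expected_y(beta, x):
    """Mean response beta[0]*x[0] + ... + beta[p-1]*x[p-1].

    This matches the form y = beta_1 X_1 + ... + beta_p X_p in the text,
    which has no separate constant term.
    """
    return sum(b * v for b, v in zip(beta, x))


beta = [1.8, -2.35, 1.0]   # hypothetical coefficients
x = [4.0, 2.0, 5.0]        # hypothetical predictor values

k = 1                      # index of the predictor to change by one unit
x_plus = list(x)
x_plus[k] += 1.0           # all other predictors stay fixed

change = expected_y(beta, x_plus) - expected_y(beta, x)
print(change, beta[k])
```

Up to floating-point rounding, the change in the mean response equals the coefficient of the predictor that was changed, and its sign gives the direction of the effect.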

