# ClassificationLinear class

Linear model for binary classification of high-dimensional data

## Description

`ClassificationLinear` is a trained linear model object for binary classification; the linear model is a support vector machine (SVM) or logistic regression model. `fitclinear` fits a `ClassificationLinear` model by minimizing the objective function using techniques that reduce computation time for high-dimensional data sets (e.g., stochastic gradient descent). The classification loss plus the regularization term compose the objective function.

Unlike other classification models, `ClassificationLinear` model objects do not store the training data, which keeps memory usage low. They do store, for example, the estimated linear model coefficients, prior class probabilities, and the regularization strength.

You can use trained `ClassificationLinear` models to predict labels or classification scores for new data. For details, see `predict`.

## Construction

Create a `ClassificationLinear` object by using `fitclinear`.

## Properties

### Linear Classification Properties

`Lambda`: Regularization term strength, specified as a nonnegative scalar or vector of nonnegative values.

Data Types: `double` | `single`
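
The vector case can be illustrated with a minimal sketch. It assumes synthetic data (`X`, `Y`, and the `lambda` grid are hypothetical, chosen only for illustration) and passes the grid to `fitclinear` through the `'Lambda'` name-value argument. The resulting object holds one column of coefficients per regularization strength, and `selectModels` extracts a single model.

```matlab
rng(1);                               % For reproducibility
X = randn(200,10);                    % Hypothetical predictor data: 200 observations, 10 predictors
Y = X(:,1) + 0.5*randn(200,1) > 0;    % Hypothetical logical labels driven by the first predictor
lambda = logspace(-4,-1,4);           % Four regularization strengths (arbitrary grid)

Mdl = fitclinear(X,Y,'Lambda',lambda);
betaSize = size(Mdl.Beta)             % One column of coefficients per value in lambda

MdlOne = selectModels(Mdl,2);         % Keep only the model trained with lambda(2)
```

Training over a grid like this is usually a first step toward choosing a strength, for example by comparing the out-of-sample `loss` of each model.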

`Learner`: Linear classification model type, specified as `'logistic'` or `'svm'`.

In this table, $f\left(x\right)=x\beta +b.$

• β is a vector of p coefficients.

• x is an observation from p predictor variables.

• b is the scalar bias.

| Value | Algorithm | Loss Function | `FittedLoss` Value |
| --- | --- | --- | --- |
| `'svm'` | Support vector machine | Hinge: $\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,1-yf\left(x\right)\right]$ | `'hinge'` |
| `'logistic'` | Logistic regression | Deviance (logistic): $\ell \left[y,f\left(x\right)\right]=\mathrm{log}\left\{1+\mathrm{exp}\left[-yf\left(x\right)\right]\right\}$ | `'logit'` |
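
The two loss functions can be evaluated directly. This is a minimal sketch, assuming the labels are coded as -1 and 1 (as the product $yf(x)$ implies); the function handles `hingeLoss` and `logitLoss` and the sample values are hypothetical and are not part of the toolbox.

```matlab
% Loss functions from the table, written as anonymous functions
hingeLoss = @(y,f) max(0, 1 - y.*f);        % Hinge loss (SVM)
logitLoss = @(y,f) log(1 + exp(-y.*f));     % Logistic deviance

y = [1 -1 1];        % Hypothetical labels coded as -1 or 1
f = [2 0.5 -1];      % Hypothetical scores f(x) = x*beta + b

hingeValues = hingeLoss(y,f)    % Zero when y.*f >= 1, linear otherwise
logitValues = logitLoss(y,f)    % Always positive, decreasing in y.*f
```

Both losses shrink as the margin `y.*f` grows; the hinge loss is exactly zero once the margin reaches 1, whereas the logistic loss only approaches zero.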

`Beta`: Linear coefficient estimates, specified as a numeric vector with length equal to the number of predictors.

Data Types: `double`

`Bias`: Estimated bias term or model intercept, specified as a numeric scalar.

Data Types: `double`

`FittedLoss`: Loss function used to fit the linear model, specified as `'hinge'` or `'logit'`.

| Value | Algorithm | Loss Function | `Learner` Value |
| --- | --- | --- | --- |
| `'hinge'` | Support vector machine | Hinge: $\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,1-yf\left(x\right)\right]$ | `'svm'` |
| `'logit'` | Logistic regression | Deviance (logistic): $\ell \left[y,f\left(x\right)\right]=\mathrm{log}\left\{1+\mathrm{exp}\left[-yf\left(x\right)\right]\right\}$ | `'logistic'` |

`Regularization`: Complexity penalty type, specified as `'lasso (L1)'` or `'ridge (L2)'`.

The software composes the objective function for minimization from the sum of the average loss function (see `FittedLoss`) and the regularization term from this table.

| Value | Description |
| --- | --- |
| `'lasso (L1)'` | Lasso (L1) penalty: $\lambda \sum _{j=1}^{p}\left\lvert {\beta }_{j}\right\rvert$ |
| `'ridge (L2)'` | Ridge (L2) penalty: $\frac{\lambda }{2}\sum _{j=1}^{p}{\beta }_{j}^{2}$ |

λ specifies the regularization term strength (see `Lambda`).

The software excludes the bias term ($\beta_0$, the scalar b in f(x)) from the regularization penalty.
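
As a worked illustration of how the objective is assembled, the following sketch evaluates the average hinge loss plus each penalty for a tiny data set. All values (`X`, `y`, `beta`, `b`, `lambda`) are hypothetical, labels are assumed to be coded as -1 and 1, and the code only reproduces the formulas above; it is not the solver that `fitclinear` uses.

```matlab
X = [1 0 2; 0 1 1; 1 1 0];   % Hypothetical predictor data, one observation per row
y = [1; -1; -1];             % Hypothetical labels coded as -1 or 1
beta = [0.5; -2; 0];         % Hypothetical coefficients
b = 0.1;                     % Hypothetical bias (not penalized)
lambda = 0.01;               % Hypothetical regularization strength

f = X*beta + b;                          % Scores f(x) = x*beta + b
avgHinge = mean(max(0, 1 - y.*f));       % Average hinge loss over the observations

lassoPenalty = lambda*sum(abs(beta));    % lambda * sum |beta_j|
ridgePenalty = (lambda/2)*sum(beta.^2);  % (lambda/2) * sum beta_j^2

objLasso = avgHinge + lassoPenalty
objRidge = avgHinge + ridgePenalty
```

Note that `b` enters the objective only through the scores `f`, never through the penalty terms.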

### Other Classification Properties

`CategoricalPredictors`: Categorical predictor indices, specified as a vector of positive integers. `CategoricalPredictors` contains index values indicating that the corresponding predictors are categorical. The index values are between 1 and `p`, where `p` is the number of predictors used to train the model. If none of the predictors are categorical, then this property is empty (`[]`).

Data Types: `single` | `double`

`ClassNames`: Unique class labels used in training, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. `ClassNames` has the same data type as the class labels `Y`. (The software treats string arrays as cell arrays of character vectors.) `ClassNames` also determines the class order.

Data Types: `categorical` | `char` | `logical` | `single` | `double` | `cell`

`Cost`: Misclassification costs, specified as a square numeric matrix. `Cost` has K rows and columns, where K is the number of classes.

`Cost(i,j)` is the cost of classifying a point into class `j` if its true class is `i`. The order of the rows and columns of `Cost` corresponds to the order of the classes in `ClassNames`.

Data Types: `double`
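
The indexing convention of `Cost` (rows are true classes, columns are predicted classes) can be sketched with a small lookup. The cost matrix and the label indices below are hypothetical and assume two classes ordered as in `ClassNames`.

```matlab
% Hypothetical cost matrix: misclassifying a true class-2 observation as class 1 costs 5
Cost = [0 1; 5 0];

trueIdx = [1 1 2 2 2];   % Hypothetical true class indices (positions in ClassNames)
predIdx = [1 2 2 1 2];   % Hypothetical predicted class indices

% Look up Cost(i,j) per observation: i = true class, j = predicted class
perObsCost = Cost(sub2ind(size(Cost),trueIdx,predIdx));
avgCost = mean(perObsCost)
```

Correct predictions fall on the diagonal and cost nothing, so only the second and fourth observations contribute to the average.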

`ModelParameters`: Parameters used for training the `ClassificationLinear` model, specified as a structure.

Access fields of `ModelParameters` using dot notation. For example, access the relative tolerance on the linear coefficients and the bias term by using `Mdl.ModelParameters.BetaTolerance`.

Data Types: `struct`

`PredictorNames`: Predictor names in order of their appearance in the predictor data, specified as a cell array of character vectors. The length of `PredictorNames` is equal to the number of variables in the training data `X` or `Tbl` used as predictor variables.

Data Types: `cell`

`ExpandedPredictorNames`: Expanded predictor names, specified as a cell array of character vectors.

If the model uses encoding for categorical variables, then `ExpandedPredictorNames` includes the names that describe the expanded variables. Otherwise, `ExpandedPredictorNames` is the same as `PredictorNames`.

Data Types: `cell`

`Prior`: Prior class probabilities, specified as a numeric vector. `Prior` has as many elements as classes in `ClassNames`, and the order of the elements corresponds to the elements of `ClassNames`.

Data Types: `double`

`ResponseName`: Response variable name, specified as a character vector.

Data Types: `char`

`ScoreTransform`: Score transformation function to apply to predicted scores, specified as a function name or function handle.

For linear classification models and before transformation, the predicted classification score for the observation x (row vector) is f(x) = xβ + b, where β and b correspond to `Mdl.Beta` and `Mdl.Bias`, respectively.

To change the score transformation function to, for example, `function`, use dot notation.

• For a built-in function, enter this code and replace `function` with a value in the table.

`Mdl.ScoreTransform = 'function';`

| Value | Description |
| --- | --- |
| `"doublelogit"` | $1/(1+e^{-2x})$ |
| `"invlogit"` | $\log(x/(1-x))$ |
| `"ismax"` | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0 |
| `"logit"` | $1/(1+e^{-x})$ |
| `"none"` or `"identity"` | $x$ (no transformation) |
| `"sign"` | $-1$ for $x<0$; $0$ for $x=0$; $1$ for $x>0$ |
| `"symmetric"` | $2x-1$ |
| `"symmetricismax"` | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to $-1$ |
| `"symmetriclogit"` | $2/(1+e^{-x})-1$ |

• For a MATLAB® function, or a function that you define, enter its function handle.

`Mdl.ScoreTransform = @function;`

`function` must accept a matrix of the original scores for each class, and then return a matrix of the same size representing the transformed scores for each class.

Data Types: `char` | `function_handle`
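
The relationship between the raw score and a transformation can be sketched without a trained model. The coefficients, bias, and observations below are hypothetical, and the two-column score layout (negative-class score -f(x) first, positive-class score f(x) second) is an assumption made for illustration. The handle `logitFcn` is a user-defined stand-in for the logistic transformation, not a toolbox function.

```matlab
beta = [1; -1];          % Hypothetical coefficients (plays the role of Mdl.Beta)
b = 0.5;                 % Hypothetical bias (plays the role of Mdl.Bias)
x = [2 1; 0 1; 1 1.5];   % Hypothetical observations, one per row

f = x*beta + b;          % Raw scores f(x) = x*beta + b
scores = [-f f];         % Assumed layout: one column per class

% User-defined transformation: takes a score matrix, returns a matrix of the same size
logitFcn = @(s) 1./(1 + exp(-s));
transformed = logitFcn(scores)

% For a trained model, the same handle could be assigned with dot notation:
% Mdl.ScoreTransform = logitFcn;
```

Because the two raw scores in each row are negatives of each other, the logistic transformation maps every row to a pair of values that sum to 1.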

## Object Functions

| Function | Description |
| --- | --- |
| `edge` | Classification edge for linear classification models |
| `incrementalLearner` | Convert linear model for binary classification to incremental learner |
| `lime` | Local interpretable model-agnostic explanations (LIME) |
| `loss` | Classification loss for linear classification models |
| `margin` | Classification margins for linear classification models |
| `partialDependence` | Compute partial dependence |
| `plotPartialDependence` | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots |
| `predict` | Predict labels for linear classification models |
| `shapley` | Shapley values |
| `selectModels` | Choose subset of regularized, binary linear classification models |
| `update` | Update model parameters for code generation |

## Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects.

## Examples

Train a binary, linear classification model using support vector machines, dual SGD, and ridge regularization.

`load nlpdata`

`X` is a sparse matrix of predictor data, and `Y` is a categorical vector of class labels. There are more than two classes in the data.

Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.

`Ystats = Y == 'stats';`

Train a binary, linear classification model that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. Train the model using the entire data set. Determine how well the optimization algorithm fit the model to the data by extracting a fit summary.

```
rng(1); % For reproducibility
[Mdl,FitInfo] = fitclinear(X,Ystats)
```

```
Mdl = 
  ClassificationLinear
      ResponseName: 'Y'
        ClassNames: [0 1]
    ScoreTransform: 'none'
              Beta: [34023x1 double]
              Bias: -1.0059
            Lambda: 3.1674e-05
           Learner: 'svm'

  Properties, Methods
```

```
FitInfo = struct with fields:
                    Lambda: 3.1674e-05
                 Objective: 5.3783e-04
                 PassLimit: 10
                 NumPasses: 10
                BatchLimit: []
             NumIterations: 238561
              GradientNorm: NaN
         GradientTolerance: 0
      RelativeChangeInBeta: 0.0562
             BetaTolerance: 1.0000e-04
             DeltaGradient: 1.4582
    DeltaGradientTolerance: 1
           TerminationCode: 0
         TerminationStatus: {'Iteration limit exceeded.'}
                     Alpha: [31572x1 double]
                   History: []
                   FitTime: 0.1159
                    Solver: {'dual'}
```

`Mdl` is a `ClassificationLinear` model. You can pass `Mdl` and the training data to `loss` to inspect the in-sample classification error, or pass `Mdl` and new data to `loss` to estimate the out-of-sample error. You can also pass `Mdl` and new predictor data to `predict` to predict class labels for new observations.

`FitInfo` is a structure array containing, among other things, the termination status (`TerminationStatus`) and how long the solver took to fit the model to the data (`FitTime`). It is good practice to use `FitInfo` to determine whether optimization-termination measurements are satisfactory. Here, the solver stopped when it reached the pass limit (`NumPasses` equals `PassLimit`), and `DeltaGradient` (1.4582) is still above `DeltaGradientTolerance` (1). Because training time is small, you can retrain the model with more passes through the data, which can improve measures like `DeltaGradient`.
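
Retraining with more passes could look like the following minimal sketch. It assumes the `'PassLimit'` name-value argument of `fitclinear`; the value 50 is arbitrary, and the names `MdlMore` and `FitInfoMore` are hypothetical. No output is shown because the resulting measurements depend on the run.

```matlab
load nlpdata
Ystats = Y == 'stats';   % Logical labels for the Statistics and Machine Learning Toolbox pages

rng(1); % For reproducibility
% Allow up to 50 passes through the data instead of the value 10 reported in FitInfo
[MdlMore,FitInfoMore] = fitclinear(X,Ystats,'PassLimit',50);

% Inspect the termination measurements of the new fit
FitInfoMore.TerminationStatus
FitInfoMore.DeltaGradient
```

Comparing `FitInfoMore.DeltaGradient` against `FitInfoMore.DeltaGradientTolerance` indicates whether the additional passes were enough.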

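Estimate the out-of-sample classification error of a linear classification model by using a holdout sample. Load the NLP data set.

```
load nlpdata
n = size(X,1); % Number of observations
```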

Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.

`Ystats = Y == 'stats';`

Hold out 5% of the data.

```
rng(1); % For reproducibility
cvp = cvpartition(n,'Holdout',0.05)
```

```
cvp = 
Hold-out cross validation partition
   NumObservations: 31572
       NumTestSets: 1
         TrainSize: 29994
          TestSize: 1578
```

`cvp` is a `CVPartition` object that defines the random partition of the n observations into training and test sets.

Train a binary, linear classification model using the training set that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. For faster training time, orient the predictor data matrix so that the observations are in columns.

```
idxTrain = training(cvp); % Extract training set indices
X = X';
Mdl = fitclinear(X(:,idxTrain),Ystats(idxTrain),'ObservationsIn','columns');
```

Predict labels and estimate the classification error for the holdout sample.

```
idxTest = test(cvp); % Extract test set indices
labels = predict(Mdl,X(:,idxTest),'ObservationsIn','columns');
L = loss(Mdl,X(:,idxTest),Ystats(idxTest),'ObservationsIn','columns')
```

```
L = 7.1753e-04
```

`Mdl` misclassifies fewer than 1% of the out-of-sample observations.

## Version History

Introduced in R2016a