
To construct a classification output layer with cross entropy loss for *k* mutually exclusive classes, use `classificationLayer`. If you want to use a different loss function for your classification problems, then you can define a custom classification output layer using this example as a guide.

This example shows how to define and create a custom weighted classification output layer with weighted cross entropy loss. Use a weighted classification layer for classification problems with an imbalanced distribution of classes. For an example showing how to use a weighted classification layer in a network, see Speech Command Recognition Using Deep Learning.

To define a custom classification output layer, you can use the template provided in this example, which takes you through the following steps:

1. Name the layer – Give the layer a name so it can be used in MATLAB®.
2. Declare the layer properties – Specify the properties of the layer.
3. Create a constructor function – Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then the software initializes the properties with `''` at creation.
4. Create a forward loss function – Specify the loss between the predictions and the training targets.
5. Create a backward loss function – Specify the derivative of the loss with respect to the predictions.

A weighted classification layer computes the weighted cross entropy loss for classification problems. Weighted cross entropy is an error measure between two probability distributions. For prediction scores *Y* and training targets *T*, the weighted cross entropy loss between *Y* and *T* is given by

$$L=-\frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{K}w_{i}T_{ni}\log(Y_{ni}),$$

where *N* is the number of observations, *K* is the number of classes, and *w* is a vector of weights for each class.
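As a sanity check on this formula, the following NumPy sketch (illustrative only, not part of any toolbox) computes *L* for a small *K*-by-*N* example:

```python
import numpy as np

def weighted_cross_entropy(Y, T, w):
    # Y: K-by-N prediction scores (each column sums to 1)
    # T: K-by-N one-hot training targets
    # w: length-K vector of class weights
    N = Y.shape[1]
    return -np.sum(w[:, None] * T * np.log(Y)) / N

# Two observations (N = 2), three classes (K = 3)
Y = np.array([[0.7, 0.2],
              [0.2, 0.5],
              [0.1, 0.3]])
T = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
w = np.array([0.1, 0.7, 0.2])

loss = weighted_cross_entropy(Y, T, w)
```

Only the two target entries contribute, giving L = −(0.1 log 0.7 + 0.7 log 0.5)/2 ≈ 0.2604.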

Copy the classification output layer template into a new file in MATLAB. This template outlines the structure of a classification output layer and includes the functions that define the layer behavior.

```
classdef myClassificationLayer < nnet.layer.ClassificationLayer

    properties
        % (Optional) Layer properties.

        % Layer properties go here.
    end

    methods
        function layer = myClassificationLayer()
            % (Optional) Create a myClassificationLayer.

            % Layer constructor function goes here.
        end

        function loss = forwardLoss(layer, Y, T)
            % Return the loss between the predictions Y and the
            % training targets T.
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         loss  - Loss between Y and T

            % Layer forward loss function goes here.
        end

        function dLdY = backwardLoss(layer, Y, T)
            % Backward propagate the derivative of the loss function.
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         dLdY  - Derivative of the loss with respect to the predictions Y

            % Layer backward loss function goes here.
        end
    end
end
```

First, give the layer a name. In the first line of the class file, replace the existing name `myClassificationLayer` with `weightedClassificationLayer`.

```
classdef weightedClassificationLayer < nnet.layer.ClassificationLayer
    ...
end
```

Next, rename the `myClassificationLayer` constructor function (the first function in the `methods` section) so that it has the same name as the layer.

```
methods
    function layer = weightedClassificationLayer()
        ...
    end
    ...
end
```

Save the layer class file in a new file named `weightedClassificationLayer.m`. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.

Declare the layer properties in the `properties` section.

By default, custom output layers have the following properties:

- `Name` – Layer name, specified as a character vector or a string scalar. To include a layer in a layer graph, you must specify a nonempty unique layer name. If you train a series network with the layer and `Name` is set to `''`, then the software automatically assigns a name to the layer at training time.
- `Description` – One-line description of the layer, specified as a character vector or a string scalar. This description appears when the layer is displayed in a `Layer` array. If you do not specify a layer description, then the software displays `"Classification Output"` or `"Regression Output"`.
- `Type` – Type of the layer, specified as a character vector or a string scalar. The value of `Type` appears when the layer is displayed in a `Layer` array. If you do not specify a layer type, then the software displays the layer class name.

Custom classification layers also have the following property:

- `Classes` – Classes of the output layer, specified as a categorical vector, string array, cell array of character vectors, or `'auto'`. If `Classes` is `'auto'`, then the software automatically sets the classes at training time. If you specify the string array or cell array of character vectors `str`, then the software sets the classes of the output layer to `categorical(str,str)`. The default value is `'auto'`.

Custom regression layers also have the following property:

- `ResponseNames` – Names of the responses, specified as a cell array of character vectors or a string array. At training time, the software automatically sets the response names according to the training data. The default is `{}`.

If the layer has no other properties, then you can omit the `properties` section.

In this example, the layer requires an additional property to save the class weights. Specify the property `ClassWeights` in the `properties` section.

```
properties
    % Vector of weights corresponding to the classes in the training
    % data
    ClassWeights
end
```

Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.

Specify the input argument `classWeights` to assign to the `ClassWeights` property. Also specify an optional input argument `name` to assign to the `Name` property at creation. Add a comment to the top of the function that explains the syntaxes of the function.

```
function layer = weightedClassificationLayer(classWeights, name)
    % layer = weightedClassificationLayer(classWeights) creates a
    % weighted cross entropy loss layer. classWeights is a row
    % vector of weights corresponding to the classes in the order
    % that they appear in the training data.
    %
    % layer = weightedClassificationLayer(classWeights, name)
    % additionally specifies the layer name.

    ...
end
```

Replace the comment `% Layer constructor function goes here` with code that initializes the layer properties.

Give the layer a one-line description by setting the `Description` property of the layer. Set the `Name` property to the optional input argument `name`.

```
function layer = weightedClassificationLayer(classWeights, name)
    % layer = weightedClassificationLayer(classWeights) creates a
    % weighted cross entropy loss layer. classWeights is a row
    % vector of weights corresponding to the classes in the order
    % that they appear in the training data.
    %
    % layer = weightedClassificationLayer(classWeights, name)
    % additionally specifies the layer name.

    % Set class weights
    layer.ClassWeights = classWeights;

    % Set layer name
    if nargin == 2
        layer.Name = name;
    end

    % Set layer description
    layer.Description = 'Weighted cross entropy';
end
```

Create a function named `forwardLoss` that returns the weighted cross entropy loss between the predictions made by the network and the training targets. The syntax for `forwardLoss` is `loss = forwardLoss(layer, Y, T)`, where `Y` is the output of the previous layer and `T` represents the training targets.

For classification problems, the dimensions of `T` depend on the type of problem.

| Classification Task | Input Size | Observation Dimension |
| --- | --- | --- |
| 2-D image classification | 1-by-1-by-*K*-by-*N*, where *K* is the number of classes and *N* is the number of observations | 4 |
| 3-D image classification | 1-by-1-by-1-by-*K*-by-*N*, where *K* is the number of classes and *N* is the number of observations | 5 |
| Sequence-to-label classification | *K*-by-*N*, where *K* is the number of classes and *N* is the number of observations | 2 |
| Sequence-to-sequence classification | *K*-by-*N*-by-*S*, where *K* is the number of classes, *N* is the number of observations, and *S* is the sequence length | 2 |

The size of `Y` depends on the output of the previous layer. To ensure that `Y` is the same size as `T`, you must include a layer that outputs the correct size before the output layer. For example, to ensure that `Y` is a 4-D array of prediction scores for *K* classes, you can include a fully connected layer of size *K* followed by a softmax layer before the output layer.
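The shape requirement can be illustrated outside MATLAB. This NumPy sketch (dimensions chosen arbitrarily for illustration) mimics a fully connected layer of size *K* followed by a softmax, producing a 1-by-1-by-*K*-by-*N* array whose entries sum to 1 over the class dimension:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, F = 3, 5, 8                      # classes, observations, input features

X = rng.standard_normal((F, N))        # activations entering the fully connected layer
Wfc = rng.standard_normal((K, F))      # fully connected weights (output size K)
b = rng.standard_normal((K, 1))        # biases

Z = Wfc @ X + b                        # fully connected output, K-by-N
Z = Z - Z.max(axis=0, keepdims=True)   # shift for numerical stability
Y = np.exp(Z) / np.exp(Z).sum(axis=0, keepdims=True)   # softmax over classes

Y4 = Y.reshape(1, 1, K, N)             # 1-by-1-by-K-by-N, as the 2-D image output layer expects
```

Each column of `Y` is a valid score vector, so `Y4` matches the size and normalization the output layer expects of its input.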


The inputs `Y` and `T` correspond to *Y* and *T* in the equation, respectively. The output `loss` corresponds to *L*. Add a comment to the top of the function that explains the syntaxes of the function.

```
function loss = forwardLoss(layer, Y, T)
    % loss = forwardLoss(layer, Y, T) returns the weighted cross
    % entropy loss between the predictions Y and the training
    % targets T.

    N = size(Y,4);
    Y = squeeze(Y);
    T = squeeze(T);
    W = layer.ClassWeights;

    loss = -sum(W*(T.*log(Y)))/N;
end
```
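The last line vectorizes the double sum in the loss definition: with `W` a 1-by-*K* row vector and `T.*log(Y)` a *K*-by-*N* matrix, the product `W*(T.*log(Y))` yields one weighted sum per observation. A NumPy check of this equivalence (random data, assuming *K*-by-*N* inputs after `squeeze`):

```python
import numpy as np

rng = np.random.default_rng(1)
K, N = 3, 4
Y = rng.random((K, N)) + 0.1
Y /= Y.sum(axis=0)                       # columns are valid score vectors
T = np.eye(K)[:, rng.integers(0, K, N)]  # one-hot targets, K-by-N
W = np.array([[0.1, 0.7, 0.2]])          # 1-by-K row vector, like the MATLAB W

# Vectorized form, mirroring -sum(W*(T.*log(Y)))/N
loss_vec = -np.sum(W @ (T * np.log(Y))) / N

# Explicit double sum from the loss definition
loss_sum = -sum(W[0, i] * T[i, n] * np.log(Y[i, n])
                for n in range(N) for i in range(K)) / N
```

Both expressions produce the same scalar loss.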

Create the backward loss function.

Create a function named `backwardLoss` that returns the derivatives of the weighted cross entropy loss with respect to the predictions `Y`. The syntax for `backwardLoss` is `dLdY = backwardLoss(layer, Y, T)`, where `Y` is the output of the previous layer and `T` represents the training targets.

The dimensions of `Y` and `T` are the same as the inputs in `forwardLoss`.

The derivative of the weighted cross entropy loss with respect to the predictions *Y* is given by

$$\frac{\partial L}{\partial Y_{ni}}=-\frac{1}{N}\frac{w_{i}T_{ni}}{Y_{ni}},$$

where *N* is the number of observations and *w* is a vector of weights for each class. Add a comment to the top of the function that explains the syntaxes of the function.

```
function dLdY = backwardLoss(layer, Y, T)
    % dLdY = backwardLoss(layer, Y, T) returns the derivatives of
    % the weighted cross entropy loss with respect to the
    % predictions Y.

    [~,~,K,N] = size(Y);
    Y = squeeze(Y);
    T = squeeze(T);
    W = layer.ClassWeights;

    dLdY = -(W'.*T./Y)/N;
    dLdY = reshape(dLdY,[1 1 K N]);
end
```
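The analytic derivative can be validated against a central finite difference, similar in spirit to the numeric gradient tests that `checkLayer` runs. This NumPy sketch (an independent illustration, not MATLAB code) compares the two:

```python
import numpy as np

rng = np.random.default_rng(2)
K, N = 3, 4
Y = rng.random((K, N)) + 0.1
Y /= Y.sum(axis=0)                       # columns are valid score vectors
T = np.eye(K)[:, rng.integers(0, K, N)]  # one-hot targets
w = np.array([0.1, 0.7, 0.2])            # class weights

def loss(Y):
    return -np.sum(w[:, None] * T * np.log(Y)) / N

# Analytic derivative: dL/dY_ni = -(1/N) * w_i * T_ni / Y_ni
dLdY = -(w[:, None] * T / Y) / N

# Central finite difference at every entry
eps = 1e-6
fd = np.zeros_like(Y)
for i in range(K):
    for n in range(N):
        Yp = Y.copy(); Yp[i, n] += eps
        Ym = Y.copy(); Ym[i, n] -= eps
        fd[i, n] = (loss(Yp) - loss(Ym)) / (2 * eps)
```

The finite-difference matrix `fd` agrees with the analytic `dLdY` to within the discretization error of the perturbation.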

View the completed classification output layer class file.

```
classdef weightedClassificationLayer < nnet.layer.ClassificationLayer

    properties
        % Vector of weights corresponding to the classes in the training
        % data
        ClassWeights
    end

    methods
        function layer = weightedClassificationLayer(classWeights, name)
            % layer = weightedClassificationLayer(classWeights) creates a
            % weighted cross entropy loss layer. classWeights is a row
            % vector of weights corresponding to the classes in the order
            % that they appear in the training data.
            %
            % layer = weightedClassificationLayer(classWeights, name)
            % additionally specifies the layer name.

            % Set class weights
            layer.ClassWeights = classWeights;

            % Set layer name
            if nargin == 2
                layer.Name = name;
            end

            % Set layer description
            layer.Description = 'Weighted cross entropy';
        end

        function loss = forwardLoss(layer, Y, T)
            % loss = forwardLoss(layer, Y, T) returns the weighted cross
            % entropy loss between the predictions Y and the training
            % targets T.

            N = size(Y,4);
            Y = squeeze(Y);
            T = squeeze(T);
            W = layer.ClassWeights;

            loss = -sum(W*(T.*log(Y)))/N;
        end

        function dLdY = backwardLoss(layer, Y, T)
            % dLdY = backwardLoss(layer, Y, T) returns the derivatives of
            % the weighted cross entropy loss with respect to the
            % predictions Y.

            [~,~,K,N] = size(Y);
            Y = squeeze(Y);
            T = squeeze(T);
            W = layer.ClassWeights;

            dLdY = -(W'.*T./Y)/N;
            dLdY = reshape(dLdY,[1 1 K N]);
        end
    end
end
```

For GPU compatibility, the layer functions must support inputs and return outputs of type `gpuArray`. Any other functions the layer uses must do the same. Many MATLAB built-in functions support `gpuArray` input arguments. If you call any of these functions with at least one `gpuArray` input, then the function executes on the GPU and returns a `gpuArray` output. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a CUDA® enabled NVIDIA® GPU with compute capability 3.0 or higher. For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).

The MATLAB functions used in `forwardLoss` and `backwardLoss` in `weightedClassificationLayer` all support `gpuArray` inputs, so the layer is GPU compatible.

Check the validity of the custom classification output layer `weightedClassificationLayer`.

Define a custom weighted classification layer. To create this layer, save the file `weightedClassificationLayer.m` in the current folder.

Create an instance of the layer. Specify the class weights as a vector with three elements corresponding to three classes.

```
classWeights = [0.1 0.7 0.2];
layer = weightedClassificationLayer(classWeights);
```

Check that the layer is valid using `checkLayer`. Set the valid input size to the typical size of a single observation input to the layer. The layer expects a 1-by-1-by-*K*-by-*N* array input, where *K* is the number of classes and *N* is the number of observations in the mini-batch.

```
numClasses = numel(classWeights);
validInputSize = [1 1 numClasses];
checkLayer(layer,validInputSize,'ObservationDimension',4);
```

```
Skipping GPU tests. No compatible GPU device found.

Running nnet.checklayer.OutputLayerTestCase
.......... ...
Done nnet.checklayer.OutputLayerTestCase
__________

Test Summary:
	 13 Passed, 0 Failed, 0 Incomplete, 4 Skipped.
	 Time elapsed: 0.34878 seconds.
```

The test summary reports the number of passed, failed, incomplete, and skipped tests.

`assembleNetwork` | `checkLayer` | `classificationLayer`