Define Custom Deep Learning Output Layers

Tip

This topic explains how to define custom deep learning output layers for your problems. For a list of built-in layers in Deep Learning Toolbox™, see List of Deep Learning Layers.

To learn how to define custom intermediate layers, see Define Custom Deep Learning Intermediate Layers.

If Deep Learning Toolbox does not provide the output layer that you require for your task, then you can define your own custom layer using this topic as a guide. After defining the custom layer, you can automatically check that the layer is valid, GPU compatible, and outputs correctly defined gradients.

Output Layer Architecture

At the end of a forward pass at training time, an output layer takes the predictions (network outputs) Y of the previous layer and calculates the loss L between these predictions and the training targets. The output layer computes the derivatives of the loss L with respect to the predictions Y and outputs (backward propagates) results to the previous layer.

The following figure shows the flow of data through a convolutional neural network and an output layer.

Output Layer Templates

To define a custom output layer, use one of these class definition templates. The templates outline the structure of an output layer class definition, including the optional properties, the constructor, and the forward and backward loss functions.

Classification Output Layer Template

This template outlines the structure of a classification output layer with a loss function. For an example showing how to define a classification output layer and specify a loss function, see Define Custom Classification Output Layer.

classdef myClassificationLayer < nnet.layer.ClassificationLayer
        
    properties
        % (Optional) Layer properties.

        % Layer properties go here.
    end
 
    methods
        function layer = myClassificationLayer()           
            % (Optional) Create a myClassificationLayer.

            % Layer constructor function goes here.
        end

        function loss = forwardLoss(layer, Y, T)
            % Return the loss between the predictions Y and the training 
            % targets T.
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         loss  - Loss between Y and T

            % Layer forward loss function goes here.
        end
        
        function dLdY = backwardLoss(layer, Y, T)
            % (Optional) Backward propagate the derivative of the loss 
            % function.
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         dLdY  - Derivative of the loss with respect to the 
            %                 predictions Y

            % Layer backward loss function goes here.
        end
    end
end
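
For instance, the following is a minimal sketch of a completed classification layer that computes cross-entropy loss (the class name and constructor argument are illustrative, not part of the template):

classdef exampleCrossEntropyLayer < nnet.layer.ClassificationLayer
    methods
        function layer = exampleCrossEntropyLayer(name)
            % Set the layer name and give a one-line description.
            layer.Name = name;
            layer.Description = 'Example cross-entropy loss';
        end

        function loss = forwardLoss(layer, Y, T)
            % Cross-entropy loss, summed over classes and averaged over
            % the N observations (dimension 4 for 2-D image input).
            N = size(Y,4);
            loss = -sum(T .* log(Y), 'all') / N;
        end
    end
end

Because this forwardLoss uses only functions that support dlarray objects (sum, log, and element-wise multiplication), the layer does not need a backwardLoss method; the software determines the backward loss function automatically.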

Regression Output Layer Template

This template outlines the structure of a regression output layer with a loss function. For an example showing how to define a regression output layer and specify a loss function, see Define Custom Regression Output Layer.

classdef myRegressionLayer < nnet.layer.RegressionLayer
        
    properties
        % (Optional) Layer properties.

        % Layer properties go here.
    end
 
    methods
        function layer = myRegressionLayer()           
            % (Optional) Create a myRegressionLayer.

            % Layer constructor function goes here.
        end

        function loss = forwardLoss(layer, Y, T)
            % Return the loss between the predictions Y and the training
            % targets T.
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         loss  - Loss between Y and T

            % Layer forward loss function goes here.
        end
        
        function dLdY = backwardLoss(layer, Y, T)
            % (Optional) Backward propagate the derivative of the loss 
            % function.
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         dLdY  - Derivative of the loss with respect to the 
            %                 predictions Y        

            % Layer backward loss function goes here.
        end
    end
end
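
For instance, the following is a minimal sketch of a completed regression layer that computes the mean absolute error (MAE); the class name and constructor argument are illustrative:

classdef exampleMAELayer < nnet.layer.RegressionLayer
    methods
        function layer = exampleMAELayer(name)
            % Set the layer name and give a one-line description.
            layer.Name = name;
            layer.Description = 'Example mean absolute error';
        end

        function loss = forwardLoss(layer, Y, T)
            % Mean absolute error over the R responses (dimension 3 for
            % 2-D image regression), averaged over the N observations.
            R = size(Y,3);
            meanAbsoluteError = sum(abs(Y-T),3)/R;
            N = size(Y,4);
            loss = sum(meanAbsoluteError)/N;
        end
    end
end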

Output Layer Properties

Declare the layer properties in the properties section of the class definition.

By default, custom output layers have the following properties:

  • Name – Layer name, specified as a character vector or a string scalar. For Layer array input, the trainNetwork, assembleNetwork, layerGraph, and dlnetwork functions automatically assign names to layers with Name set to ''.

  • Description – One-line description of the layer, specified as a character vector or a string scalar. This description appears when the layer is displayed in a Layer array. If you do not specify a layer description, then the software displays "Classification Output" or "Regression Output".

  • Type – Type of the layer, specified as a character vector or a string scalar. The value of Type appears when the layer is displayed in a Layer array. If you do not specify a layer type, then the software displays the layer class name.

Custom classification layers also have the following property:

  • Classes – Classes of the output layer, specified as a categorical vector, string array, cell array of character vectors, or 'auto'. If Classes is 'auto', then the software automatically sets the classes at training time. If you specify the string array or cell array of character vectors str, then the software sets the classes of the output layer to categorical(str,str).

Custom regression layers also have the following property:

  • ResponseNames – Names of the responses, specified as a cell array of character vectors or a string array. At training time, the software automatically sets the response names according to the training data. The default is {}.

If the layer has no other properties, then you can omit the properties section.
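
As a sketch, a constructor might set these properties directly (the argument name is illustrative):

function layer = myClassificationLayer(name)
    % Set the layer name and a one-line description, and let the
    % software determine the classes at training time.
    layer.Name = name;
    layer.Description = 'Custom classification output';
    layer.Classes = 'auto';
end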

Forward Loss Function

The output layer computes the loss L between predictions and targets using the forward loss function and computes the derivatives of the loss with respect to the predictions using the backward loss function.

The syntax for forwardLoss is loss = forwardLoss(layer,Y,T). The input Y corresponds to the predictions made by the network. These predictions are the output of the previous layer. The input T corresponds to the training targets. The output loss is the loss between Y and T according to the specified loss function. The output loss must be scalar.
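
For example, a forwardLoss function for weighted cross-entropy might look like the following sketch, assuming the layer defines a ClassWeights property (a K-element vector of class weights, introduced here for illustration only):

function loss = forwardLoss(layer, Y, T)
    % Weighted cross-entropy loss for 1-by-1-by-K-by-N predictions,
    % averaged over the N observations.
    N = size(Y,4);
    W = reshape(layer.ClassWeights, 1, 1, []);
    loss = -sum(W .* T .* log(Y), 'all') / N;
end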

Backward Loss Function

If the layer forward loss function supports dlarray objects, then the software automatically determines the backward loss function. For a list of functions that support dlarray objects, see List of Functions with dlarray Support. Alternatively, to define a custom backward loss function, create a function named backwardLoss. For an example showing how to define a custom backward loss function, see Specify Custom Output Layer Backward Loss Function.

The syntax for backwardLoss is dLdY = backwardLoss(layer,Y,T). The input Y contains the predictions made by the network and T contains the training targets. The output dLdY is the derivative of the loss with respect to the predictions Y. The output dLdY must be the same size as the layer input Y.
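
For example, for the unweighted cross-entropy loss loss = -sum(T.*log(Y),'all')/N shown earlier, the corresponding backward loss is the following sketch:

function dLdY = backwardLoss(layer, Y, T)
    % Derivative of the cross-entropy loss with respect to the
    % predictions Y. The output is the same size as Y.
    N = size(Y,4);
    dLdY = -(T ./ Y) / N;
end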

For classification problems, the dimensions of T depend on the type of problem.

Classification Task | Input Size | Observation Dimension
2-D image classification | 1-by-1-by-K-by-N, where K is the number of classes and N is the number of observations | 4
3-D image classification | 1-by-1-by-1-by-K-by-N, where K is the number of classes and N is the number of observations | 5
Sequence-to-label classification | K-by-N, where K is the number of classes and N is the number of observations | 2
Sequence-to-sequence classification | K-by-N-by-S, where K is the number of classes, N is the number of observations, and S is the sequence length | 2

The size of Y depends on the output of the previous layer. To ensure that Y is the same size as T, you must include a layer that outputs the correct size before the output layer. For example, to ensure that Y is a 4-D array of prediction scores for K classes, you can include a fully connected layer of size K followed by a softmax layer before the output layer.
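
For example, a sketch of such a layer arrangement for K = 10 classes, using the cross-entropy layer defined earlier (the input size and layer name are illustrative):

K = 10;
layers = [
    imageInputLayer([28 28 1])
    fullyConnectedLayer(K)   % outputs K scores per observation
    softmaxLayer             % normalizes scores to probabilities
    exampleCrossEntropyLayer('output')];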

For regression problems, the dimensions of T also depend on the type of problem.

Regression Task | Input Size | Observation Dimension
2-D image regression | 1-by-1-by-R-by-N, where R is the number of responses and N is the number of observations | 4
2-D image-to-image regression | h-by-w-by-c-by-N, where h, w, and c are the height, width, and number of channels of the output, respectively, and N is the number of observations | 4
3-D image regression | 1-by-1-by-1-by-R-by-N, where R is the number of responses and N is the number of observations | 5
3-D image-to-image regression | h-by-w-by-d-by-c-by-N, where h, w, d, and c are the height, width, depth, and number of channels of the output, respectively, and N is the number of observations | 5
Sequence-to-one regression | R-by-N, where R is the number of responses and N is the number of observations | 2
Sequence-to-sequence regression | R-by-N-by-S, where R is the number of responses, N is the number of observations, and S is the sequence length | 2

For example, for an image regression network with one response and mini-batches of size 50, T is a 4-D array of size 1-by-1-by-1-by-50.

The size of Y depends on the output of the previous layer. To ensure that Y is the same size as T, you must include a layer that outputs the correct size before the output layer. For example, for image regression with R responses, to ensure that Y is a 4-D array of the correct size, you can include a fully connected layer of size R before the output layer.
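
For example, a sketch for image regression with R = 1 response, using the MAE layer defined earlier (the input size and layer name are illustrative):

R = 1;
layers = [
    imageInputLayer([28 28 1])
    fullyConnectedLayer(R)   % outputs R responses per observation
    exampleMAELayer('output')];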

The forwardLoss and backwardLoss functions have the following output arguments.

Function | Output Argument | Description
forwardLoss | loss | Calculated loss between the predictions Y and the training targets T
backwardLoss | dLdY | Derivative of the loss with respect to the predictions Y

The backwardLoss function must output dLdY with the size that the previous layer expects; that is, dLdY must be the same size as Y.

GPU Compatibility

If the layer forward functions fully support dlarray objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type gpuArray (Parallel Computing Toolbox).

Many MATLAB® built-in functions support gpuArray (Parallel Computing Toolbox) and dlarray input arguments. For a list of functions that support dlarray objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Support by Release (Parallel Computing Toolbox). For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).

Check Validity of Layer

If you create a custom deep learning layer, then you can use the checkLayer function to check that the layer is valid. The function checks layers for validity, GPU compatibility, correctly defined gradients, and code generation compatibility. To check that a layer is valid, run the following command:

checkLayer(layer,validInputSize)

where layer is an instance of the layer and validInputSize is a vector or cell array specifying the valid input sizes to the layer. To check with multiple observations, use the ObservationDimension option. To check for code generation compatibility, set the CheckCodegenCompatibility option to 1 (true). For large input sizes, the gradient checks take longer to run. To speed up the tests, specify a smaller valid input size.
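
For example, the following sketch checks the cross-entropy layer defined earlier using 1-by-1-by-10 prediction scores and multiple observations along dimension 4 (the sizes are illustrative):

layer = exampleCrossEntropyLayer('output');
validInputSize = [1 1 10];
checkLayer(layer,validInputSize,'ObservationDimension',4)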

For more information, see Check Custom Layer Validity.
