## Define Custom Deep Learning Layer for Code Generation

If Deep Learning Toolbox™ does not provide the layer you require for your classification or regression problem, then you can define your own custom layer using this example as a guide. For a list of built-in layers, see List of Deep Learning Layers.

To define a custom deep learning layer, you can use the template provided in this example, which takes you through the following steps:

1. Name the layer – Give the layer a name so that you can use it in MATLAB®.

2. Declare the layer properties – Specify the properties of the layer and which parameters are learned during training.

3. Create a constructor function (optional) – Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then at creation, the software initializes the `Name`, `Description`, and `Type` properties with `[]` and sets the number of layer inputs and outputs to 1.

4. Create forward functions – Specify how data passes forward through the layer (forward propagation) at prediction time and at training time.

5. Create a backward function (optional) – Specify the derivatives of the loss with respect to the input data and the learnable parameters (backward propagation). If you do not specify a backward function, then the forward functions must support `dlarray` objects.

To create a custom layer that supports code generation:

• The layer must specify the pragma `%#codegen` in the layer definition.

• The inputs of `predict` must be:

  • Consistent in dimension. Each input must have the same number of dimensions.

  • Consistent in batch size. Each input must have the same batch size.

• The outputs of `predict` must be consistent in dimension and batch size with the layer inputs.

• Nonscalar properties must have type single, double, or character array.

• Scalar properties must have type numeric, logical, or string.

Code generation supports intermediate layers with 2-D image input only.

This example shows how to create a PReLU layer [1], which is a layer with a learnable parameter, and use it in a convolutional neural network. A PReLU layer performs a threshold operation in which, for each channel, any input value less than zero is multiplied by a scalar learned at training time. These scalars are the scaling coefficients ${\alpha }_{i}$, one per channel of the input. Together they form the learnable parameter of the layer.

This figure from [1] compares the ReLU and PReLU layer functions.

### Layer with Learnable Parameters Template

Copy the layer with learnable parameters template into a new file in MATLAB. This template outlines the structure of a layer with learnable parameters and includes the functions that define the layer behavior.

```
classdef myLayer < nnet.layer.Layer % & nnet.layer.Formattable (Optional)

    properties
        % (Optional) Layer properties.

        % Layer properties go here.
    end

    properties (Learnable)
        % (Optional) Layer learnable parameters.

        % Layer learnable parameters go here.
    end

    methods
        function layer = myLayer()
            % (Optional) Create a myLayer.
            % This function must have the same name as the class.

            % Layer constructor function goes here.
        end

        function [Z1, …, Zm] = predict(layer, X1, …, Xn)
            % Forward input data through the layer at prediction time and
            % output the result.
            %
            % Inputs:
            %         layer       - Layer to forward propagate through
            %         X1, ..., Xn - Input data
            % Outputs:
            %         Z1, ..., Zm - Outputs of layer forward function

            % Layer forward function for prediction goes here.
        end

        function [Z1, …, Zm, memory] = forward(layer, X1, …, Xn)
            % (Optional) Forward input data through the layer at training
            % time and output the result and a memory value.
            %
            % Inputs:
            %         layer       - Layer to forward propagate through
            %         X1, ..., Xn - Input data
            % Outputs:
            %         Z1, ..., Zm - Outputs of layer forward function
            %         memory      - Memory value for custom backward propagation

            % Layer forward function for training goes here.
        end

        function [dLdX1, …, dLdXn, dLdW1, …, dLdWk] = ...
                backward(layer, X1, …, Xn, Z1, …, Zm, dLdZ1, …, dLdZm, memory)
            % (Optional) Backward propagate the derivative of the loss
            % function through the layer.
            %
            % Inputs:
            %         layer             - Layer to backward propagate through
            %         X1, ..., Xn       - Input data
            %         Z1, ..., Zm       - Outputs of layer forward function
            %         dLdZ1, ..., dLdZm - Gradients propagated from the next layers
            %         memory            - Memory value from forward function
            % Outputs:
            %         dLdX1, ..., dLdXn - Derivatives of the loss with respect to the
            %                             inputs
            %         dLdW1, ..., dLdWk - Derivatives of the loss with respect to each
            %                             learnable parameter

            % Layer backward function goes here.
        end
    end
end
```

### Name Layer

First, give the layer a name. In the first line of the class file, replace the existing name `myLayer` with `codegenPreluLayer` and add a comment describing the layer.

```
classdef codegenPreluLayer < nnet.layer.Layer
    % Example custom PReLU layer with codegen support.

    ...
end
```

Next, rename the `myLayer` constructor function (the first function in the `methods` section) so that it has the same name as the layer.

```
    methods
        function layer = codegenPreluLayer()
            ...
        end

        ...
    end
```

#### Save Layer

Save the layer class file in a new file named `codegenPreluLayer.m`. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.

### Specify Code Generation Pragma

Add the `%#codegen` directive (or pragma) to your layer definition to indicate that you intend to generate code for this layer. Adding this directive instructs the MATLAB Code Analyzer to help you diagnose and fix violations that result in errors during code generation.

```
classdef codegenPreluLayer < nnet.layer.Layer
    % Example custom PReLU layer with codegen support.

    %#codegen

    ...
end
```

### Declare Properties and Learnable Parameters

Declare the layer properties in the `properties` section and declare learnable parameters by listing them in the `properties (Learnable)` section.

By default, custom intermediate layers have these properties.

| Property | Description |
| --- | --- |
| `Name` | Layer name, specified as a character vector or a string scalar. To include a layer in a layer graph, you must specify a nonempty, unique layer name. If you train a series network with the layer and `Name` is set to `''`, then the software automatically assigns a name to the layer at training time. |
| `Description` | One-line description of the layer, specified as a character vector or a string scalar. This description appears when the layer is displayed in a `Layer` array. If you do not specify a layer description, then the software displays the layer class name. |
| `Type` | Type of the layer, specified as a character vector or a string scalar. The value of `Type` appears when the layer is displayed in a `Layer` array. If you do not specify a layer type, then the software displays the layer class name. |
| `NumInputs` | Number of inputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets `NumInputs` to the number of names in `InputNames`. The default value is 1. |
| `InputNames` | Input names of the layer, specified as a cell array of character vectors. If you do not specify this value and `NumInputs` is greater than 1, then the software automatically sets `InputNames` to `{'in1',...,'inN'}`, where `N` is equal to `NumInputs`. The default value is `{'in'}`. |
| `NumOutputs` | Number of outputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets `NumOutputs` to the number of names in `OutputNames`. The default value is 1. |
| `OutputNames` | Output names of the layer, specified as a cell array of character vectors. If you do not specify this value and `NumOutputs` is greater than 1, then the software automatically sets `OutputNames` to `{'out1',...,'outM'}`, where `M` is equal to `NumOutputs`. The default value is `{'out'}`. |

If the layer has no other properties, then you can omit the `properties` section.

Tip

If you are creating a layer with multiple inputs, then you must set either the `NumInputs` or `InputNames` properties in the layer constructor. If you are creating a layer with multiple outputs, then you must set either the `NumOutputs` or `OutputNames` properties in the layer constructor. For an example, see Define Custom Deep Learning Layer with Multiple Inputs.

To support code generation:

• Nonscalar properties must have type single, double, or character array.

• Scalar properties must be numeric or have type logical or string.

A PReLU layer does not require any additional properties, so you can remove the `properties` section.

A PReLU layer has only one learnable parameter, the scaling coefficient α. Declare this learnable parameter in the ```properties (Learnable)``` section and call the parameter `Alpha`.

```
    properties (Learnable)
        % Layer learnable parameters

        % Scaling coefficient
        Alpha
    end
```

### Create Constructor Function

Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.

The PReLU layer constructor function requires two input arguments: the number of channels of the expected input data and the layer name. The number of channels specifies the size of the learnable parameter `Alpha`. Specify two input arguments named `numChannels` and `name` in the `codegenPreluLayer` function. Add a comment to the top of the function that explains the syntax of the function.

```
        function layer = codegenPreluLayer(numChannels, name)
            % layer = codegenPreluLayer(numChannels, name) creates a PReLU
            % layer with numChannels channels and specifies the layer name.

            ...
        end
```

Code generation does not support `arguments` blocks.

#### Initialize Layer Properties

Initialize the layer properties, including learnable parameters, in the constructor function. Replace the comment ```% Layer constructor function goes here``` with code that initializes the layer properties.

Set the `Name` property to the input argument `name`.

```
            % Set layer name.
            layer.Name = name;
```

Give the layer a one-line description by setting the `Description` property of the layer. Set the description to describe the type of layer and its size.

```
            % Set layer description.
            layer.Description = "PReLU with " + numChannels + " channels";
```

For a PReLU layer, when the input values are negative, the layer multiplies each channel of the input by the corresponding channel of `Alpha`. Initialize the learnable parameter `Alpha` as a random vector of size 1-by-1-by-`numChannels`. With the third dimension specified as size `numChannels`, the layer can use element-wise multiplication of the input in the forward function. `Alpha` is a property of the layer object, so you must assign the vector to `layer.Alpha`.

```
            % Initialize scaling coefficient.
            layer.Alpha = rand([1 1 numChannels]);
```

View the completed constructor function.

```
        function layer = codegenPreluLayer(numChannels, name)
            % layer = codegenPreluLayer(numChannels, name) creates a PReLU
            % layer for 2-D image input with numChannels channels and specifies
            % the layer name.

            % Set layer name.
            layer.Name = name;

            % Set layer description.
            layer.Description = "PReLU with " + numChannels + " channels";

            % Initialize scaling coefficient.
            layer.Alpha = rand([1 1 numChannels]);
        end
```

With this constructor function, the command `codegenPreluLayer(3,'prelu')` creates a PReLU layer with three channels and the name `'prelu'`.
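As a quick sanity check of the constructor, the following sketch creates the layer and inspects the properties it sets. It assumes that Deep Learning Toolbox is installed and that `codegenPreluLayer.m`, containing the constructor above, is saved on the MATLAB path.

```matlab
% Sketch only: assumes codegenPreluLayer.m is on the MATLAB path.
layer = codegenPreluLayer(3,'prelu');

% Inspect the properties initialized by the constructor.
disp(layer.Name)
disp(layer.Description)

% Alpha holds one coefficient per channel, arranged along dimension 3.
disp(size(layer.Alpha))
```

Because `Alpha` is initialized with `rand`, its values differ from run to run; only its size is fixed by `numChannels`.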

### Create Forward Functions

Create the layer forward functions to use at prediction time and training time.

Create a function named `predict` that propagates the data forward through the layer at prediction time and outputs the result.

The syntax for `predict` is ```[Z1,…,Zm] = predict(layer,X1,…,Xn)```, where `X1,…,Xn` are the `n` layer inputs and `Z1,…,Zm` are the `m` layer outputs. The values `n` and `m` must correspond to the `NumInputs` and `NumOutputs` properties of the layer.

Tip

If the number of inputs to `predict` can vary, then use `varargin` instead of `X1,…,Xn`. In this case, `varargin` is a cell array of the inputs, where `varargin{i}` corresponds to `Xi`. If the number of outputs can vary, then use `varargout` instead of `Z1,…,Zm`. In this case, `varargout` is a cell array of the outputs, where `varargout{j}` corresponds to `Zj`.

Because a PReLU layer has only one input and one output, the syntax for `predict` for a PReLU layer is ```Z = predict(layer,X)```.

Code generation supports custom intermediate layers with 2-D image input only. The inputs are h-by-w-by-c-by-N arrays, where h, w, and c correspond to the height, width, and number of channels of the images, respectively, and N is the number of observations. The observation dimension is 4.

For code generation support, all the layer inputs must have the same number of dimensions and batch size.

By default, the layer uses `predict` as the forward function at training time. To use a different forward function at training time, or to retain a value required for a custom backward function, you must also create a function named `forward`. The software does not generate code for the `forward` function, but the function must still be compatible with code generation.

The `forward` function propagates the data forward through the layer at training time and also outputs a memory value.

The syntax for `forward` is ```[Z1,…,Zm,memory] = forward(layer,X1,…,Xn)```, where `X1,…,Xn` are the `n` layer inputs, `Z1,…,Zm` are the `m` layer outputs, and `memory` is the memory of the layer.

Tip

If the number of inputs to `forward` can vary, then use `varargin` instead of `X1,…,Xn`. In this case, `varargin` is a cell array of the inputs, where `varargin{i}` corresponds to `Xi`. If the number of outputs can vary, then use `varargout` instead of `Z1,…,Zm`. In this case, `varargout` is a cell array of the outputs, where `varargout{j}` corresponds to `Zj` for `j` = 1,…,`NumOutputs` and `varargout{NumOutputs + 1}` corresponds to `memory`.

The PReLU operation is given by

$$f\left({x}_{i}\right)=\begin{cases}{x}_{i} & \text{if } {x}_{i}>0\\ {\alpha }_{i}{x}_{i} & \text{if } {x}_{i}\le 0\end{cases}$$

where ${x}_{i}$ is the input of the nonlinear activation f on channel i, and ${\alpha }_{i}$ is the coefficient controlling the slope of the negative part. The subscript i in ${\alpha }_{i}$ indicates that the nonlinear activation can vary on different channels.
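To make the rule concrete, here is a small worked example on a single channel: positive inputs pass through unchanged, and the remaining inputs are scaled by the coefficient. The value of `alpha` and the input matrix are illustrative assumptions; in the layer, the coefficient is learned during training.

```matlab
% Hypothetical single-channel example; alpha is fixed here, not learned.
alpha = 0.25;
X = [-2 1; 3 -4];

% Start from the identity, then scale the nonpositive entries by alpha.
Z = X;
nonpositive = X <= 0;
Z(nonpositive) = alpha * X(nonpositive);

disp(Z)
```

With these values, the entries -2 and -4 become -0.5 and -1, while 1 and 3 are unchanged.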

Implement this operation in `predict`. In `predict`, the input `X` corresponds to x in the equation. The output `Z` corresponds to $f\left({x}_{i}\right)$.

Add a comment to the top of the function that explains the syntaxes of the function.

Tip

If you preallocate arrays using functions such as `zeros`, then you must ensure that the data types of these arrays are consistent with the layer function inputs. To create an array of zeros of the same data type as another array, use the `'like'` option of `zeros`. For example, to initialize an array of zeros of size `sz` with the same data type as the array `X`, use `Z = zeros(sz,'like',X)`.
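A minimal illustration of this tip follows. The array `X` stands in for a layer input, and its size is an arbitrary assumption.

```matlab
% X stands in for a layer input of type single.
X = ones(2,3,'single');
sz = size(X);

% Preallocate an output that matches the data type of X.
Z = zeros(sz,'like',X);

disp(class(Z))
```

Here `Z` has class `single`, matching `X`, whereas `zeros(sz)` alone would create a `double` array.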

Implementing the `backward` function is optional when the forward functions fully support `dlarray` input. For code generation support, the `predict` function must also support numeric input.

One way to calculate the output of the PReLU operation is to use the following code.

`Z = max(X,0) + layer.Alpha .* min(0,X);`

Because code generation does not support implicit expansion via the `.*` operation, you can use the `bsxfun` function instead.

`Z = max(X,0) + bsxfun(@times, layer.Alpha, min(0,X));`

However, `bsxfun` does not support `dlarray` input. To implement a `predict` function that supports both code generation and `dlarray` input, use an `if` statement with the `isdlarray` function to select the appropriate code for the type of input.

```
        function Z = predict(layer, X)
            % Z = predict(layer, X) forwards the input data X through the
            % layer and outputs the result Z.

            if isdlarray(X)
                Z = max(X,0) + layer.Alpha .* min(0,X);
            else
                Z = max(X,0) + bsxfun(@times, layer.Alpha, min(0,X));
            end
        end
```
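The two branches are intended to compute the same result. The sketch below compares them on a plain numeric array outside the layer, so it needs only base MATLAB (R2016b or later for implicit expansion). The array sizes and the variable `alpha`, which stands in for `layer.Alpha`, are illustrative assumptions.

```matlab
% Hypothetical h-by-w-by-c-by-N input with 3 channels and 2 observations.
X = randn(4,4,3,2,'single');

% Stand-in for layer.Alpha: one coefficient per channel.
alpha = rand(1,1,3,'single');

% Branch used for dlarray input (implicit expansion).
Zimplicit = max(X,0) + alpha .* min(0,X);

% Branch used for numeric input and code generation.
Zbsxfun = max(X,0) + bsxfun(@times, alpha, min(0,X));

% Compare the two results element by element.
disp(isequal(Zimplicit, Zbsxfun))
```

Both expressions expand `alpha` along the first, second, and fourth dimensions, so each channel of `X` is scaled by its own coefficient.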

Because the `predict` function fully supports `dlarray` objects, defining the `backward` function is optional. For a list of functions that support `dlarray` objects, see List of Functions with dlarray Support.

### Completed Layer

View the completed layer class file.

```
classdef codegenPreluLayer < nnet.layer.Layer
    % Example custom PReLU layer with codegen support.

    %#codegen

    properties (Learnable)
        % Layer learnable parameters

        % Scaling coefficient
        Alpha
    end

    methods
        function layer = codegenPreluLayer(numChannels, name)
            % layer = codegenPreluLayer(numChannels, name) creates a PReLU
            % layer for 2-D image input with numChannels channels and specifies
            % the layer name.

            % Set layer name.
            layer.Name = name;

            % Set layer description.
            layer.Description = "PReLU with " + numChannels + " channels";

            % Initialize scaling coefficient.
            layer.Alpha = rand([1 1 numChannels]);
        end

        function Z = predict(layer, X)
            % Z = predict(layer, X) forwards the input data X through the
            % layer and outputs the result Z.

            if isdlarray(X)
                Z = max(X,0) + layer.Alpha .* min(0,X);
            else
                Z = max(X,0) + bsxfun(@times, layer.Alpha, min(0,X));
            end
        end
    end
end
```

### Check Layer for Code Generation Compatibility

Check the code generation compatibility of the custom layer `codegenPreluLayer`.

Define a custom PReLU layer with code generation support. To create this layer, save the file `codegenPreluLayer.m` in the current folder.

Create an instance of the layer and check its validity using `checkLayer`. Specify the valid input size as the size of a single observation of typical input to the layer. The layer expects 4-D array inputs, where the first three dimensions correspond to the height, width, and number of channels of the previous layer output, and the fourth dimension corresponds to the observations.

Specify the typical size of the input of an observation and set the `'ObservationDimension'` option to 4. To check for code generation compatibility, set the `'CheckCodegenCompatibility'` option to `true`.

```
layer = codegenPreluLayer(20,'prelu');
validInputSize = [24 24 20];
checkLayer(layer,validInputSize,'ObservationDimension',4,'CheckCodegenCompatibility',true)
```

```
Skipping GPU tests. No compatible GPU device found.

Running nnet.checklayer.TestLayerWithoutBackward
.......... ........
Done nnet.checklayer.TestLayerWithoutBackward
__________

Test Summary:
	 18 Passed, 0 Failed, 0 Incomplete, 4 Skipped.
	 Time elapsed: 0.77122 seconds.
```

The function does not detect any issues with the layer.
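Once the layer passes these checks, it can be placed in a layer array like any built-in layer. The following is a minimal sketch, assuming Deep Learning Toolbox is installed and `codegenPreluLayer.m` is on the path. The input size, filter settings, and number of classes are illustrative assumptions rather than values from this example.

```matlab
% Sketch only: sizes and class count are illustrative assumptions.
numFilters = 20;

layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,numFilters)
    batchNormalizationLayer
    codegenPreluLayer(numFilters,'prelu')   % channels match the convolution output
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

disp(layers)
```

The number of channels passed to `codegenPreluLayer` should match the number of filters in the preceding convolution layer, because `Alpha` holds one coefficient per channel.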

## References

[1] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." In 2015 IEEE International Conference on Computer Vision (ICCV), 1026–34. Santiago, Chile: IEEE, 2015. https://doi.org/10.1109/ICCV.2015.123.