Layer

Network layer for deep learning

Description

Layers that define the architecture of neural networks for deep learning.

Creation

For a list of deep learning layers in MATLAB®, see List of Deep Learning Layers. To specify the architecture of a neural network with all layers connected sequentially, create an array of layers directly. To specify the architecture of a network where layers can have multiple inputs or outputs, use a LayerGraph object.

Alternatively, you can import layers from Caffe, Keras, and ONNX using importCaffeLayers, importKerasLayers, and importONNXLayers respectively.

To learn how to create your own custom layers, see Define Custom Deep Learning Layers.

Object Functions

trainNetworkTrain neural network for deep learning

Examples

collapse all

Define a convolutional neural network architecture for classification with one convolutional layer, a ReLU layer, and a fully connected layer.

layers = [ ...
    imageInputLayer([28 28 3])
    convolution2dLayer([5 5],10)
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer]
layers = 
  6x1 Layer array with layers:

     1   ''   Image Input             28x28x3 images with 'zerocenter' normalization
     2   ''   Convolution             10 5x5 convolutions with stride [1  1] and padding [0  0  0  0]
     3   ''   ReLU                    ReLU
     4   ''   Fully Connected         10 fully connected layer
     5   ''   Softmax                 softmax
     6   ''   Classification Output   crossentropyex

layers is a Layer object.

Alternatively, you can create the layers individually and then concatenate them.

input = imageInputLayer([28 28 3]);
conv = convolution2dLayer([5 5],10);
relu = reluLayer;
fc = fullyConnectedLayer(10);
sm = softmaxLayer;
co = classificationLayer;

layers = [ ...
    input
    conv
    relu
    fc
    sm
    co]
layers = 
  6x1 Layer array with layers:

     1   ''   Image Input             28x28x3 images with 'zerocenter' normalization
     2   ''   Convolution             10 5x5 convolutions with stride [1  1] and padding [0  0  0  0]
     3   ''   ReLU                    ReLU
     4   ''   Fully Connected         10 fully connected layer
     5   ''   Softmax                 softmax
     6   ''   Classification Output   crossentropyex

Define a convolutional neural network architecture for classification with one convolutional layer, a ReLU layer, and a fully connected layer.

layers = [ ...
    imageInputLayer([28 28 3])
    convolution2dLayer([5 5],10)
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

Display the image input layer by selecting the first layer.

layers(1)
ans = 
  ImageInputLayer with properties:

                Name: ''
           InputSize: [28 28 3]

   Hyperparameters
    DataAugmentation: 'none'
       Normalization: 'zerocenter'
        AverageImage: []

View the input size of the image input layer.

layers(1).InputSize
ans = 1×3

    28    28     3

Display the stride for the convolutional layer.

layers(2).Stride
ans = 1×2

     1     1

Access the bias learn rate factor for the fully connected layer.

layers(4).BiasLearnRateFactor
ans = 1

Create a simple directed acyclic graph (DAG) network for deep learning. Train the network to classify images of digits. The simple network in this example consists of:

  • A main branch with layers connected sequentially.

  • A shortcut connection containing a single 1-by-1 convolutional layer. Shortcut connections enable the parameter gradients to flow more easily from the output layer to the earlier layers of the network.

Create the main branch of the network as a layer array. The addition layer sums multiple inputs element-wise. Specify the number of inputs for the addition layer to sum. All layers must have names and all names must be unique.

layers = [
    imageInputLayer([28 28 1],'Name','input')
    
    convolution2dLayer(5,16,'Padding','same','Name','conv_1')
    batchNormalizationLayer('Name','BN_1')
    reluLayer('Name','relu_1')
    
    convolution2dLayer(3,32,'Padding','same','Stride',2,'Name','conv_2')
    batchNormalizationLayer('Name','BN_2')
    reluLayer('Name','relu_2')
    convolution2dLayer(3,32,'Padding','same','Name','conv_3')
    batchNormalizationLayer('Name','BN_3')
    reluLayer('Name','relu_3')
    
    additionLayer(2,'Name','add')
    
    averagePooling2dLayer(2,'Stride',2,'Name','avpool')
    fullyConnectedLayer(10,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classOutput')];

Create a layer graph from the layer array. layerGraph connects all the layers in layers sequentially. Plot the layer graph.

lgraph = layerGraph(layers);
figure
plot(lgraph)

Create the 1-by-1 convolutional layer and add it to the layer graph. Specify the number of convolutional filters and the stride so that the activation size matches the activation size of the 'relu_3' layer. This arrangement enables the addition layer to add the outputs of the 'skipConv' and 'relu_3' layers. To check that the layer is in the graph, plot the layer graph.

skipConv = convolution2dLayer(1,32,'Stride',2,'Name','skipConv');
lgraph = addLayers(lgraph,skipConv);
figure
plot(lgraph)

Create the shortcut connection from the 'relu_1' layer to the 'add' layer. Because you specified two as the number of inputs to the addition layer when you created it, the layer has two inputs named 'in1' and 'in2'. The 'relu_3' layer is already connected to the 'in1' input. Connect the 'relu_1' layer to the 'skipConv' layer and the 'skipConv' layer to the 'in2' input of the 'add' layer. The addition layer now sums the outputs of the 'relu_3' and 'skipConv' layers. To check that the layers are connected correctly, plot the layer graph.

lgraph = connectLayers(lgraph,'relu_1','skipConv');
lgraph = connectLayers(lgraph,'skipConv','add/in2');
figure
plot(lgraph);

Load the training and validation data, which consists of 28-by-28 grayscale images of digits.

[XTrain,YTrain] = digitTrain4DArrayData;
[XValidation,YValidation] = digitTest4DArrayData;

Specify training options and train the network. trainNetwork validates the network using the validation data every ValidationFrequency iterations.

options = trainingOptions('sgdm', ...
    'MaxEpochs',8, ...
    'Shuffle','every-epoch', ...
    'ValidationData',{XValidation,YValidation}, ...
    'ValidationFrequency',30, ...
    'Verbose',false, ...
    'Plots','training-progress');
net = trainNetwork(XTrain,YTrain,lgraph,options);

Display the properties of the trained network. The network is a DAGNetwork object.

net
net = 
  DAGNetwork with properties:

         Layers: [16×1 nnet.cnn.layer.Layer]
    Connections: [16×2 table]

Classify the validation images and calculate the accuracy. The network is very accurate.

YPredicted = classify(net,XValidation);
accuracy = mean(YPredicted == YValidation)
accuracy = 0.9968

Introduced in R2016a