Create a Faster R-CNN object detection network



lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes,network) returns a Faster R-CNN network as a layerGraph (Deep Learning Toolbox) object. A Faster R-CNN network is a convolutional neural network based object detector. The detector predicts the coordinates of bounding boxes, objectness scores, and classification scores for a set of anchor boxes. To train the created network, use the trainFasterRCNNObjectDetector function. For more information, see Getting Started with R-CNN, Fast R-CNN, and Faster R-CNN.

lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes,network,featureLayer) returns the object detection network based on the specified featureLayer of the network. Use this syntax when you specify the network as a SeriesNetwork (Deep Learning Toolbox), DAGNetwork (Deep Learning Toolbox), or layerGraph (Deep Learning Toolbox) object.

lgraph = fasterRCNNLayers(___,Name,Value) returns the object detection network with optional input properties specified by one or more name-value pair arguments.

Using this function requires Deep Learning Toolbox™.



Specify the image size.

inputImageSize = [224 224 3];

Specify the number of object classes to detect.

numClasses = 1;

Use a pretrained ResNet-50 network as the base network for the Faster R-CNN network. You must download the resnet50 (Deep Learning Toolbox) support package.

network = 'resnet50';

Specify the network layer to use for feature extraction. You can use the analyzeNetwork (Deep Learning Toolbox) function to see all the layer names in a network.

featureLayer = 'activation_40_relu';

Specify the anchor boxes. You can also use the estimateAnchorBoxes function to estimate anchor boxes from your training data.

anchorBoxes = [64,64; 128,128; 192,192];

Create the Faster R-CNN object detection network.

lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes, ...
                          network,featureLayer)

lgraph = 
  LayerGraph with properties:

         Layers: [188x1 nnet.cnn.layer.Layer]
    Connections: [205x2 table]
     InputNames: {'input_1'}
    OutputNames: {1x4 cell}

Visualize the network using the network analyzer.
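
A minimal call, assuming lgraph from the previous step is still in the workspace, might look like this:

```matlab
% Assumes lgraph was created by the fasterRCNNLayers call above.
% Opens the network analyzer on the layer graph.
analyzeNetwork(lgraph)
```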


Input Arguments


Network input image size, specified as a 3-element vector in the format [height, width, depth]. depth is the number of image channels. Set depth to 3 for RGB images, to 1 for grayscale images, or to the number of channels for multispectral and hyperspectral images.

Number of classes for the network to classify, specified as an integer greater than 0.

Anchor boxes, specified as an M-by-2 matrix of M anchor boxes in the format [height, width]. Anchor boxes are determined based on the scale and aspect ratio of objects in the training data set. For example, if an object is localized by a square window, then you can set the size of the anchor boxes to [64 64;128 128].
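
When objects vary in shape, one common approach is to generate anchor boxes from a few scales and aspect ratios. The sketch below shows one way to build such an M-by-2 matrix in [height, width] format. The values of baseSize, scales, and aspectRatios are hypothetical and chosen only for illustration; they are not recommendations from this page.

```matlab
% Hypothetical base size, scales, and aspect ratios (height/width).
baseSize = 64;
scales = [1 2 3];
aspectRatios = [0.5 1 2];

% Build every scale/ratio combination as column vectors.
[s,r] = meshgrid(scales,aspectRatios);
area = (baseSize*s(:)).^2;

% height = sqrt(area*ratio) and width = sqrt(area/ratio) keep the
% box area roughly constant for each scale (up to rounding).
heights = round(sqrt(area.*r(:)));
widths = round(sqrt(area./r(:)));

% M-by-2 matrix in [height, width] format.
anchorBoxes = [heights widths];
```

The resulting matrix can be passed as the anchorBoxes argument. When labeled training data is available, estimateAnchorBoxes is likely a better starting point than hand-picked values.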

Pretrained classification network, specified as a SeriesNetwork (Deep Learning Toolbox), DAGNetwork (Deep Learning Toolbox), or layerGraph (Deep Learning Toolbox) object, or by the name of a pretrained network, such as 'resnet50'.

When you specify the network as a SeriesNetwork (Deep Learning Toolbox) object, a DAGNetwork (Deep Learning Toolbox) object, or by name, the function transforms the network into a Faster R-CNN network. It does so by adding a region proposal network (RPN), an ROI max pooling layer, and new classification and regression layers to support object detection.

Feature extraction layer, specified as a character vector or a string scalar. Use one of the deeper layers in the network you specify. You can use the analyzeNetwork (Deep Learning Toolbox) function to view the names of the layers in the input network.


You can specify any network layer except the fully connected layer as the feature layer.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'ROIMaxPoolingLayer','auto'
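
As a hedged illustration of the name-value syntax, the sketch below repeats the inputs from the example on this page and asks the function to insert the ROI max pooling layer after the feature extraction layer. It assumes the resnet50 support package is installed; the variable name lgraphInsert is only illustrative.

```matlab
% Same inputs as the example on this page.
inputImageSize = [224 224 3];
numClasses = 1;
anchorBoxes = [64,64; 128,128; 192,192];
network = 'resnet50';
featureLayer = 'activation_40_relu';

% Explicitly insert an ROI max pooling layer after the feature
% extraction layer instead of letting 'auto' choose.
lgraphInsert = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes, ...
    network,featureLayer,'ROIMaxPoolingLayer','insert');
```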

ROI max pooling layer, specified as 'auto', 'insert', or 'replace'. Use this argument to choose whether a roiMaxPooling2dLayer replaces the pooling layer after the feature extraction layer or is inserted directly after the feature extraction layer.

If you select 'auto', the function:

  • Inserts a new ROI max pooling layer after the feature extraction layer when the layer next to the feature extraction layer is not a max pooling layer.

  • Replaces the current pooling layer after the feature extraction layer with an ROI max pooling layer when the layer next to the feature extraction layer is a max pooling layer.

ROI max pooling layer output size, specified as 'auto' or a 2-element vector of positive integers. When you set the value to 'auto', the function determines the output size based on the ROIMaxPoolingLayer property. It uses the output size of either the feature extraction layer or the pooling layer that follows the feature extraction layer.

Output Arguments


Object detection network, returned as a layerGraph (Deep Learning Toolbox) object. The imageInputLayer of the output network uses the same normalization values as the imageInputLayer of the base network.

Introduced in R2019b