

Run forward pass on Mask R-CNN network



outputFeatures = forward(detector,dlX) calculates features of the image dlX from the output layers of the Mask R-CNN object detector.

[outputFeatures,state] = forward(detector,dlX) also returns the state information of the network. Use the state to update the network parameters.


This function requires the Computer Vision Toolbox™ Model for Mask R-CNN Instance Segmentation. You can install the Computer Vision Toolbox Model for Mask R-CNN Instance Segmentation from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. This function also requires Deep Learning Toolbox™.



Load a pretrained Mask R-CNN object detector.

detector = maskrcnn("resnet50-coco");

Read an image to use for training, and convert the image to a formatted dlarray object.

I = imread("visionteam.jpg");
dlX = dlarray(single(I),"SSCB"); 

Calculate features of the training image.

outputFeatures = forward(detector,dlX);
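To see what the call returns, a short loop can print the size of each element of outputFeatures. This is a minimal sketch that continues from the variables above. The sizes it prints depend on the input image and on the detector configuration, so no expected output is shown.

```matlab
% Continue from the example above: outputFeatures is a 1-by-6 cell array.
for k = 1:numel(outputFeatures)
    sz = size(outputFeatures{k});                 % size vector of element k
    fprintf("Output %d: %s\n",k,mat2str(sz));     % for example "Output 1: [...]"
end
```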

Input Arguments


detector: Mask R-CNN object detector, specified as a maskrcnn object.

dlX: Training data, specified as a formatted dlarray (Deep Learning Toolbox) object containing real, nonsparse data. The dimension labels of the data must be "SSCB".
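As an illustration of the "SSCB" requirement, the following sketch builds a batch of two images. The second image file, the 512-by-512 target size, and the variable names are assumptions chosen only for this illustration. Any preprocessing that a real training pipeline would apply beyond resizing is not shown.

```matlab
% Hypothetical batch of two RGB images, resized to a common size so that
% they can be stacked along the fourth (batch) dimension.
targetSize = [512 512];   % assumed size, chosen only for illustration
I1 = imresize(imread("visionteam.jpg"),targetSize);
I2 = imresize(imread("peppers.png"),targetSize);

% Stack into an h-by-w-by-3-by-2 single array and label the dimensions
% as spatial, spatial, channel, batch.
batch = cat(4,single(I1),single(I2));
dlX = dlarray(batch,"SSCB");

disp(dims(dlX))   % dimension labels of the formatted dlarray
```

For a single image, the batch dimension has size 1, as in the example above that passes one image to dlarray with the same "SSCB" labels.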

Output Arguments


outputFeatures: Output features, returned as a 1-by-6 cell array. Each element contains activations from an output layer of the network, as described in the table. In the table, numClasses is the number of classes, numAnchors is the number of anchor boxes, B is the number of images in the batch, and numProposals is the number of proposals from the region proposal layer.

| Network Output | Format |
| --- | --- |
| Region proposal network classification output after the softmax operation | h-by-w-by-numAnchors-by-B array. The feature map has spatial size h-by-w. |
| Region proposal network regression output | h-by-w-by-(4×numAnchors)-by-B array. The feature map has spatial size h-by-w. |
| Region proposals | 5-by-numProposals matrix. Each column contains one box proposal in the format [xStart, yStart, xEnd, yEnd, batchIdx]. |
| Detection network classification output after the softmax operation | 1-by-1-by-(numClasses+1)-by-numProposals array. |
| Detection network regression output | 1-by-1-by-(4×numClasses)-by-numProposals array. |
| Mask segmentation output | hmask-by-wmask-by-numClasses-by-numProposals array. The mask segmentation output has spatial size hmask-by-wmask. |
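The region proposals can be unpacked into box coordinates and batch indices. The sketch below assumes that the elements of outputFeatures follow the row order of the table, so that the third element holds the proposals. That ordering is an assumption, not something this page states, so check the sizes of the elements before relying on it. The variable names are illustrative.

```matlab
% Assumption: the cell elements follow the row order of the table, so the
% third element holds the 5-by-numProposals region proposals.
proposals = outputFeatures{3};
if isa(proposals,"dlarray")
    proposals = extractdata(proposals);   % convert to a plain numeric array
end

boxes    = proposals(1:4,:);   % [xStart; yStart; xEnd; yEnd], one column per proposal
batchIdx = proposals(5,:);     % index of the image each proposal belongs to

% Count the proposals that belong to each image in the batch.
ids = unique(batchIdx);
for k = 1:numel(ids)
    fprintf("Image %d: %d proposals\n",ids(k),nnz(batchIdx == ids(k)));
end
```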

state: Updated network state, returned as a table with three columns:

  • Layer – Layer name, returned as a string scalar.

  • Parameter – Parameter name, returned as a string scalar.

  • Value – Value of parameter, returned as a numeric array object.

The network state contains information remembered by the network between iterations.
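A minimal sketch of requesting and inspecting the state follows, assuming detector and dlX have been created as in the example on this page. It only reads the table and does not show how the state is applied during training.

```matlab
% Request the network state as a second output.
[outputFeatures,state] = forward(detector,dlX);

% List the column names of the state table and preview its first rows.
disp(state.Properties.VariableNames)
disp(head(state))
```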

Version History

Introduced in R2021b