This example shows how to fine-tune a pretrained SqueezeNet network to classify a new collection of images. This process is called transfer learning and is usually much faster and easier than training a new network, because you can apply learned features to a new task using a smaller number of training images. To prepare a network for transfer learning interactively, use Deep Network Designer.
Extract Data
In the workspace, extract the MathWorks Merch data set. This is a small data set containing 75 images of MathWorks merchandise, belonging to five different classes (cap, cube, playing cards, screwdriver, and torch).
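At the MATLAB command line, this extraction step can be sketched as follows (assuming the data set ships as MerchData.zip on the MATLAB path, as in the shipped examples):

```matlab
% Extract the MathWorks Merch data set into the current folder.
% Assumes MerchData.zip is available on the MATLAB path.
unzip('MerchData.zip');
```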
Open SqueezeNet in Deep Network Designer
Open Deep Network Designer with SqueezeNet.
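One way to do this programmatically is to pass the pretrained network to the app (this requires the Deep Learning Toolbox Model for SqueezeNet Network support package):

```matlab
% Open Deep Network Designer with a pretrained SqueezeNet network.
deepNetworkDesigner(squeezenet);
```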
Deep Network Designer displays a zoomed-out view of the whole network in the Designer pane.
Explore the network plot. To zoom in with the mouse, use Ctrl+scroll wheel. To pan, use the arrow keys, or hold down the scroll wheel and drag the mouse. Select a layer to view its properties. Deselect all layers to view the network summary in the Properties pane.
Import Data
To load the data into Deep Network Designer, on the Data tab, click Import Data > Import Image Data. The Import Image Data dialog box opens.
In the Data source list, select Folder. Click Browse and select the extracted MerchData folder.
Divide the data into 70% training data and 30% validation data.
Specify augmentation operations to perform on the training images. For this example, apply a random reflection in the x-axis, a random rotation from the range [-90,90] degrees, and a random rescaling from the range [1,2]. Data augmentation helps prevent the network from overfitting and memorizing the exact details of the training images.
Click Import to import the data into Deep Network Designer.
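The import and augmentation steps above correspond roughly to this programmatic sketch (variable names are illustrative, not generated by the app):

```matlab
% Create an image datastore from the extracted folder;
% subfolder names become the class labels.
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

% Divide the data into 70% training and 30% validation data.
[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7,'randomized');

% Augmentation: random x-reflection, random rotation in [-90,90] degrees,
% and random scaling in [1,2].
augmenter = imageDataAugmenter( ...
    'RandXReflection',true, ...
    'RandRotation',[-90 90], ...
    'RandScale',[1 2]);

% Resize the images to the SqueezeNet input size and apply the augmentation.
augimdsTrain = augmentedImageDatastore([227 227],imdsTrain, ...
    'DataAugmentation',augmenter);
augimdsValidation = augmentedImageDatastore([227 227],imdsValidation);
```

Only the training data is augmented; the validation data is resized without augmentation so that validation accuracy reflects unmodified images.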
Visualize Data
Using Deep Network Designer, you can visually inspect the distribution of the training and validation data in the Data pane. You can also view random observations and their labels as a simple check before training. You can see that, in this example, there are five classes in the data set.
Edit Network for Transfer Learning
The convolutional layers of the network extract image features that the last learnable layer and the final classification layer use to classify the input image. These two layers, 'conv10' and 'ClassificationLayer_predictions' in SqueezeNet, contain information on how to combine the features that the network extracts into class probabilities, a loss value, and predicted labels. To retrain a pretrained network to classify new images, replace these two layers with new layers adapted to the new data set.
In most networks, the last layer with learnable weights is a fully connected layer. In some networks, such as SqueezeNet, the last learnable layer is the final convolutional layer instead. In this case, replace the convolutional layer with a new convolutional layer with the number of filters equal to the number of classes.
In the Designer pane, drag a new convolution2dLayer onto the canvas. To match the original convolutional layer, set FilterSize to 1,1. Change NumFilters to the number of classes in the new data, in this example, 5.
Change the learning rates so that learning is faster in the new layer than in the transferred layers by setting WeightLearnRateFactor and BiasLearnRateFactor to 10. Delete the last 2-D convolutional layer and connect your new layer instead.
Replace the output layer. Scroll to the end of the Layer Library and drag a new classificationLayer onto the canvas. Delete the original output layer and connect your new layer instead.
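The same two replacements can be made programmatically. A minimal sketch, using the layer names from the pretrained SqueezeNet (the new layer names here are arbitrary):

```matlab
% Get the layer graph of the pretrained network.
lgraph = layerGraph(squeezenet);

% New 1-by-1 convolutional layer with 5 filters (one per class) and
% boosted learning rates so the new layer learns faster.
newConvLayer = convolution2dLayer([1 1],5, ...
    'WeightLearnRateFactor',10, ...
    'BiasLearnRateFactor',10, ...
    'Name','new_conv');
lgraph = replaceLayer(lgraph,'conv10',newConvLayer);

% New classification output layer; the class names are inferred
% from the training data at training time.
newClassLayer = classificationLayer('Name','new_classoutput');
lgraph = replaceLayer(lgraph,'ClassificationLayer_predictions',newClassLayer);
```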
Check Network
To make sure your edited network is ready for training, click Analyze, and ensure the Deep Learning Network Analyzer reports zero errors.
Train Network
Specify training options. Select the Training tab and click Training Options.
Set the initial learn rate to a small value to slow down learning in the transferred layers.
Specify the validation frequency so that the accuracy on the validation data is calculated once every epoch.
Specify a small number of epochs. An epoch is a full training cycle on the entire training data set. For transfer learning, you do not need to train for as many epochs.
Specify the mini-batch size, that is, how many images to use in each iteration. To ensure the whole data set is used during each epoch, set the mini-batch size to evenly divide the number of training samples.
For this example, set InitialLearnRate to 0.0001, ValidationFrequency to 5, and MaxEpochs to 8. As there are 55 training observations, set MiniBatchSize to 11.
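For reference, the equivalent training options at the command line might look like this (assuming the augmented datastores from the data import step; the validation-data argument is an assumption, since the app sets it automatically):

```matlab
% Training options matching the values chosen in the dialog.
% With 55 training images and MiniBatchSize 11, each epoch is
% 5 iterations, so ValidationFrequency 5 validates once per epoch.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',0.0001, ...
    'ValidationFrequency',5, ...
    'MaxEpochs',8, ...
    'MiniBatchSize',11, ...
    'ValidationData',augimdsValidation, ...
    'Plots','training-progress');
```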
To train the network with the specified training options, click Close and then click Train.
Deep Network Designer allows you to visualize and monitor training progress. You can then edit the training options and retrain the network, if required.
Export Results and Generate MATLAB Code
To export the network architecture with the trained weights, on the Training tab, select Export > Export Trained Network and Results. Deep Network Designer exports the trained network as the variable trainedNetwork_1 and the training info as the variable trainInfoStruct_1.
trainInfoStruct_1 = struct with fields:
TrainingLoss: [1×40 double]
TrainingAccuracy: [1×40 double]
ValidationLoss: [3.3420 NaN NaN NaN 2.1187 NaN NaN NaN NaN 1.4291 NaN NaN NaN NaN 0.8527 NaN NaN NaN NaN 0.5849 NaN NaN NaN NaN 0.4678 NaN NaN NaN NaN 0.3967 NaN NaN NaN NaN 0.3875 NaN NaN NaN NaN 0.3749]
ValidationAccuracy: [20 NaN NaN NaN 30 NaN NaN NaN NaN 55.0000 NaN NaN NaN NaN 65 NaN NaN NaN NaN 85 NaN NaN NaN NaN 95 NaN NaN NaN NaN 95 NaN NaN NaN NaN 95 NaN NaN NaN NaN 95]
BaseLearnRate: [1×40 double]
FinalValidationLoss: 0.3749
FinalValidationAccuracy: 95
You can also generate MATLAB code, which recreates the network and the training options used. On the Training tab, select Export > Generate Code for Training. Examine the MATLAB code to learn how to programmatically prepare the data for training, create the network architecture, and train the network.
Classify New Image
Load a new image to classify using the trained network.
Deep Network Designer resizes the images during training to match the network input size. To view the network input size, go to the Designer pane and select the imageInputLayer (first layer). This network has an input size of 227-by-227.
Resize the test image to match the network input size.
Classify the test image using the trained network.
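The three steps above can be sketched as follows, using the exported trainedNetwork_1 variable. The test image filename is a placeholder; substitute your own image:

```matlab
% Load a new image to classify. 'MerchDataTest.jpg' is a placeholder
% filename, not a file provided by the data set.
I = imread('MerchDataTest.jpg');

% Resize the test image to the network input size
% (227-by-227 for SqueezeNet).
inputSize = trainedNetwork_1.Layers(1).InputSize;
I = imresize(I,inputSize(1:2));

% Classify the test image and display the predicted label.
YPred = classify(trainedNetwork_1,I);
imshow(I)
title(string(YPred))
```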