Augment Images for Deep Learning Workflows

This example shows how you can perform common kinds of randomized image augmentation such as geometric transformations, cropping, and adding noise.

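Image Processing Toolbox functions enable you to implement common styles of image augmentation. This example demonstrates five common types of transformations:

  • Random image warping transformations

  • Cropping transformations

  • Color transformations

  • Synthetic noise

  • Synthetic blur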

The example then shows how to apply augmentation to image data in datastores using a combination of multiple types of transformations.

You can use augmented training data to train a network. For an example of training a network using augmented images, see Prepare Datastore for Image-to-Image Regression.

Read and display a sample image. To compare the effect of the different types of image augmentation, each transformation uses the same input image.

imOriginal = imresize(imread("kobi.png"),0.25);
imshow(imOriginal) 

Random Image Warping Transformations

The randomAffine2d (Image Processing Toolbox) function creates a randomized 2-D affine transformation from a combination of rotation, translation, scale (resizing), reflection, and shear. You can specify which transformations to include and the range of transformation parameters. If you specify the range as a 2-element numeric vector, then randomAffine2d selects the value of a parameter from a uniform probability distribution over the specified interval. For more control of the range of parameter values, you can specify the range using a function handle.

Control the spatial bounds and resolution of the warped image created by imwarp (Image Processing Toolbox) by using the affineOutputView (Image Processing Toolbox) function.
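
As a minimal sketch of combining several components in one call, the code below requests rotation, scale, horizontal translation, and horizontal reflection together. The parameter ranges are illustrative assumptions, not recommended values, and the image is reloaded only so that the snippet runs on its own.

```matlab
% Sketch: one randomized affine transformation built from several components.
imOriginal = imresize(imread("kobi.png"),0.25);   % same sample image as above

tform = randomAffine2d( ...
    Rotation=[-20 20], ...        % rotation angle range, in degrees (assumed)
    Scale=[0.9 1.1], ...          % isotropic scale factor range (assumed)
    XTranslation=[-10 10], ...    % horizontal shift range, in pixels (assumed)
    XReflection=true);            % left-right flip with 50% probability

% Keep the output the same size as the input and centered on the warped content
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
```

Because all components are drawn in a single call, each run produces one transformation matrix, so the image is interpolated only once.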

Rotation

Create a randomized rotation transformation that rotates the input image by an angle selected randomly from the range [-45, 45] degrees.

tform = randomAffine2d(Rotation=[-45 45]); 
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);  
imshow(imAugmented)

Translation

Create a translation transformation that shifts the input image horizontally and vertically by a distance selected randomly from the range [-50, 50] pixels.

tform = randomAffine2d(XTranslation=[-50 50],YTranslation=[-50 50]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)

Scale

Create a scale transformation that resizes the input image using a scale factor selected randomly from the range [1.2, 1.5]. This transformation resizes the image by the same factor in the horizontal and vertical directions.

tform = randomAffine2d(Scale=[1.2,1.5]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)

Reflection

Create a reflection transformation that flips the input image with 50% probability in each dimension.

tform = randomAffine2d(XReflection=true,YReflection=true);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)

Shear

Create a horizontal shear transformation with the shear angle selected randomly from the range [-30, 30] degrees.

tform = randomAffine2d(XShear=[-30 30]); 
outputView = affineOutputView(size(imOriginal),tform); 
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented) 

Control Range of Transformation Parameters Using Custom Selection Function

In the preceding transformations, the range of transformation parameters was specified by two-element numeric vectors. For more control of the range of the transformation parameters, specify a function handle instead of a numeric vector. The function handle takes no input arguments and yields a valid value for each parameter.

For example, this code selects a rotation angle at random from the discrete set of angles 0, 90, 180, and 270 degrees.

angles = 0:90:270; 
tform = randomAffine2d(Rotation=@() angles(randi(4))); 
outputView = affineOutputView(size(imOriginal),tform); 
imAugmented = imwarp(imOriginal,tform,OutputView=outputView); 
imshow(imAugmented) 
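
The same mechanism works for other parameters. As a sketch, the hypothetical selection function selectScale below draws scale factors from a normal distribution centered at 1 and clips them to [0.8, 1.2], so that most augmented images stay close to the original size. The distribution and limits are assumptions chosen for illustration.

```matlab
% Sketch: non-uniform selection of the scale factor through a function handle.
imOriginal = imresize(imread("kobi.png"),0.25);   % same sample image as above

% Hypothetical selection function: values cluster near 1, clipped to [0.8, 1.2]
selectScale = @() min(max(1 + 0.1*randn,0.8),1.2);

tform = randomAffine2d(Scale=selectScale);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
```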

Control Fill Value

When you warp an image using a geometric transformation, pixels in the output image can map to a location outside the bounds of the input image. In that case, imwarp assigns a fill value to those pixels in the output image. By default, imwarp selects black as the fill value. You can change the fill value by specifying the FillValues name-value argument.

Create a random rotation transformation, then apply the transformation and specify a gray fill value.

tform = randomAffine2d(Rotation=[-45 45]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView, ...
    FillValues=[128 128 128]);
imshow(imAugmented)

Cropping Transformations

To create output images of a desired size, use the randomWindow2d (Image Processing Toolbox) and centerCropWindow2d (Image Processing Toolbox) functions. Be careful to select a window that includes the desired content in the image.

Specify the desired size of the cropped region as a 2-element vector of the form [height, width].

targetSize = [200,100];

Crop the image to the target size from the center of the image.

win = centerCropWindow2d(size(imOriginal),targetSize); 
imCenterCrop = imcrop(imOriginal,win); 
imshow(imCenterCrop)

Crop the image to the target size from a random location in the image.

win = randomWindow2d(size(imOriginal),targetSize);
imRandomCrop = imcrop(imOriginal,win); 
imshow(imRandomCrop)
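
One simple precaution when choosing a window is to make sure that the requested size fits inside the image. The sketch below assumes that the crop functions expect a target size no larger than the input and clamps a hypothetical requested size (requestedSize is an illustrative name and value) before creating the window.

```matlab
% Sketch: clamp the requested crop size to the image size before cropping.
imOriginal = imresize(imread("kobi.png"),0.25);   % same sample image as above
inputSize = size(imOriginal);

requestedSize = [500 100];                        % hypothetical [height, width]
targetSize = min(requestedSize,inputSize(1:2));   % never exceed the image

win = randomWindow2d(inputSize,targetSize);
imRandomCrop = imcrop(imOriginal,win);
imshow(imRandomCrop)
```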

Color Transformations

You can randomly adjust the hue, saturation, brightness, and contrast of a color image by using the jitterColorHSV (Image Processing Toolbox) function. You can specify which color transformations are included and the range of transformation parameters.

You can randomly adjust the brightness and contrast of grayscale images by using basic math operations.

Hue Jitter

Hue specifies the shade of color, or a color's position on a color wheel. As hue varies from 0 to 1, colors vary from red through yellow, green, cyan, blue, purple, magenta, and back to red. Hue jitter shifts the apparent shade of colors in an image.

Adjust the hue of the input image by a small positive offset selected randomly from the range [0.05, 0.15]. Colors that were red now appear more orange or yellow, colors that were orange appear yellow or green, and so on.

imJittered = jitterColorHSV(imOriginal,Hue=[0.05 0.15]);
montage({imOriginal,imJittered})

Saturation Jitter

Saturation is the purity of color. As saturation varies from 0 to 1, hues vary from gray (indicating a mixture of all colors) to a single pure color. Saturation jitter shifts how dull or vibrant colors are.

Adjust the saturation of the input image by an offset selected randomly from the range [-0.4, -0.1]. The colors in the output image appear more muted, as expected when the saturation decreases.

imJittered = jitterColorHSV(imOriginal,Saturation=[-0.4 -0.1]); 
montage({imOriginal,imJittered})

Brightness Jitter

Brightness is the amount of hue. As brightness varies from 0 to 1, colors go from black to white. Brightness jitter shifts the darkness and lightness of an input image.

Adjust the brightness of the input image by an offset selected randomly from the range [-0.3, -0.1]. The image appears darker, as expected when the brightness decreases.

imJittered = jitterColorHSV(imOriginal,Brightness=[-0.3 -0.1]); 
montage({imOriginal,imJittered})

Contrast Jitter

Contrast jitter randomly adjusts the difference between the darkest and brightest regions in an input image.

Adjust the contrast of the input image by a scale factor selected randomly from the range [1.2, 1.4]. The contrast increases, such that shadows become darker and highlights become brighter.

imJittered = jitterColorHSV(imOriginal,Contrast=[1.2 1.4]);
montage({imOriginal,imJittered})
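
The four kinds of jitter can also be requested in a single call. The sketch below uses small symmetric ranges for each parameter. The ranges are illustrative assumptions and would normally be tuned to the data set.

```matlab
% Sketch: jitter hue, saturation, brightness, and contrast in one call.
imOriginal = imresize(imread("kobi.png"),0.25);   % same sample image as above

imJittered = jitterColorHSV(imOriginal, ...
    Hue=[-0.05 0.05], ...          % hue offset range (assumed)
    Saturation=[-0.2 0.2], ...     % saturation offset range (assumed)
    Brightness=[-0.1 0.1], ...     % brightness offset range (assumed)
    Contrast=[0.9 1.1]);           % contrast scale factor range (assumed)
montage({imOriginal,imJittered})
```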

Brightness and Contrast Jitter of Grayscale Images

You can apply randomized brightness and contrast jitter to grayscale images by using basic math operations.

Convert the sample image to grayscale. Specify a random contrast scale factor in the range [0.8, 1] and a random brightness offset in the range [-0.15, 0.15]. Multiply the image by the contrast scale factor, then add the brightness offset.

imGray = im2gray(im2double(imOriginal)); 
contrastFactor = 1-0.2*rand; 
brightnessOffset = 0.3*(rand-0.5); 
imJittered = imGray.*contrastFactor + brightnessOffset;
imJittered = im2uint8(imJittered);
montage({imGray,imJittered})
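
The scaled and shifted values can fall slightly outside the range [0, 1]. Converting with im2uint8 saturates such values, but if you keep the image in floating point, for example to pass it on as type double or single, you may want to clamp it yourself. This is a minimal sketch of that variation. The variable name imJitteredDouble is introduced here for illustration.

```matlab
% Sketch: grayscale brightness and contrast jitter with explicit clamping.
imOriginal = imresize(imread("kobi.png"),0.25);   % same sample image as above
imGray = im2gray(im2double(imOriginal));

contrastFactor = 1 - 0.2*rand;          % scale factor in the range [0.8, 1]
brightnessOffset = 0.3*(rand - 0.5);    % offset in the range [-0.15, 0.15]
imJitteredDouble = imGray.*contrastFactor + brightnessOffset;

% Clamp so that downstream floating-point steps see values in [0, 1]
imJitteredDouble = min(max(imJitteredDouble,0),1);
montage({imGray,imJitteredDouble})
```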

Randomized Color-to-Grayscale

One type of color augmentation randomly drops the color information from an RGB image while preserving the number of channels expected by the network. This code shows a "random grayscale" transformation in which an RGB image is randomly converted with 80% probability to a three channel output image where R == G == B.

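desiredProbability = 0.8;
if rand <= desiredProbability
    % Replicate the grayscale image across three channels so that R == G == B
    imJittered = repmat(rgb2gray(imOriginal),[1 1 3]);
else
    % Otherwise keep the original RGB image
    imJittered = imOriginal;
end
imshow(imJittered)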

Other Image Processing Operations

Use the transform function to apply any combination of Image Processing Toolbox functions to input images. Adding noise and blur are two common image processing operations used in deep learning applications.

Synthetic Noise

To apply synthetic noise to an input image, use the imnoise (Image Processing Toolbox) function. You can specify which noise model to use, such as Gaussian, Poisson, salt and pepper, and multiplicative noise. You can also specify the strength of the noise.

imSaltAndPepperNoise = imnoise(imOriginal,"salt & pepper",0.1);
imGaussianNoise = imnoise(imOriginal,"gaussian");
montage({imSaltAndPepperNoise,imGaussianNoise}) 
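
To randomize the strength of the noise as well, you can draw the noise parameters before calling imnoise. The sketch below does this for the four noise models mentioned above. The parameter ranges are assumptions chosen for illustration.

```matlab
% Sketch: randomized noise strength for several imnoise noise models.
imOriginal = imresize(imread("kobi.png"),0.25);   % same sample image as above

density = 0.02 + 0.08*rand;      % salt-and-pepper noise density (assumed range)
variance = 0.005 + 0.015*rand;   % noise variance (assumed range)

imSaltAndPepperNoise = imnoise(imOriginal,"salt & pepper",density);
imGaussianNoise = imnoise(imOriginal,"gaussian",0,variance);   % zero mean
imSpeckleNoise = imnoise(imOriginal,"speckle",variance);       % multiplicative
imPoissonNoise = imnoise(imOriginal,"poisson");                % derived from the data
montage({imSaltAndPepperNoise,imGaussianNoise,imSpeckleNoise,imPoissonNoise})
```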

Synthetic Blur

To apply randomized Gaussian blur to an image, use the imgaussfilt (Image Processing Toolbox) function. You can specify the amount of smoothing.

sigma = 1+5*rand; 
imBlurred = imgaussfilt(imOriginal,sigma); 
imshow(imBlurred)

Apply Augmentation to Image Data in Datastores

In practical deep learning problems, the image augmentation pipeline typically combines multiple operations. Datastores are a convenient way to read and augment collections of images.

This section of the example shows how to define data augmentation pipelines that augment datastores in the context of training image classification and image regression problems.

First, create an imageDatastore that contains unprocessed images. The image datastore in this example contains digit images with labels.

digitDatasetPath = fullfile(matlabroot,"toolbox","nnet", ...
    "nndemos","nndatasets","DigitDataset");
imds = imageDatastore(digitDatasetPath, ...
    IncludeSubfolders=true,LabelSource="foldernames");
imds.ReadSize = 6;
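
Before writing an augmentation function, it can help to look at what a single read from the datastore returns, because that is the data a transformation function receives. The sketch below repeats the datastore setup so that it runs on its own. With ReadSize set to 6, each read should return a cell array of six images together with an info structure whose Label field holds the matching labels.

```matlab
% Sketch: inspect one batch read from the image datastore.
digitDatasetPath = fullfile(matlabroot,"toolbox","nnet", ...
    "nndemos","nndatasets","DigitDataset");
imds = imageDatastore(digitDatasetPath, ...
    IncludeSubfolders=true,LabelSource="foldernames");
imds.ReadSize = 6;

[data,info] = read(imds);   % data: cell array of images, info: structure
disp(size(data))            % dimensions of the cell array of images
disp(class(info.Label))     % data type of the labels
reset(imds)                 % return to the start of the datastore
```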

Image Classification

In image classification, the classifier should learn that a randomly altered version of an image still represents the same image class. To augment data for image classification, it is sufficient to augment the input images while leaving the corresponding categorical labels unchanged.

Augment images in the pristine image datastore with random Gaussian blur, salt and pepper noise, and randomized scale and rotation. These operations are defined in the helper function classificationAugmentationPipeline at the end of this example. Apply data augmentation to the training data by using the transform function.

dsTrain = transform(imds,@classificationAugmentationPipeline, ...
    IncludeInfo=true);

Visualize a sample of the output coming from the augmented pipeline.

dataPreview = preview(dsTrain);
montage(dataPreview(:,1))
title("Augmented Images for Image Classification")

Image Regression

Image augmentation for image-to-image regression is more complicated because you must apply identical geometric transformations to the input and response images. Associate pairs of input and response images by using the combine function. Transform one or both images in each pair by using the transform function.

Combine two identical copies of the image datastore imds. When data is read from the combined datastore, image data is returned in a two-column cell array, where the first column represents network input images and the second column contains network responses.

dsCombined = combine(imds,imds);
montage(preview(dsCombined)',Size=[6 2])
title("Combined Input and Response Pairs Before Augmentation")

Augment each pair of training images with a series of image processing operations:

  • Resize the input and response image to 32-by-32 pixels.

  • Add salt and pepper noise to the input image only.

  • Create a transformation that has randomized scale and rotation.

  • Apply the same transformation to the input and response image.

These operations are defined in the helper function imageRegressionAugmentationPipeline at the end of this example. Apply data augmentation to the training data by using the transform function.

dsTrain = transform(dsCombined,@imageRegressionAugmentationPipeline);
montage(preview(dsTrain)',Size=[6 2])
title("Combined Input and Response Pairs After Augmentation")

For a complete example that includes training and evaluating an image-to-image regression network, see Prepare Datastore for Image-to-Image Regression.

Supporting Functions

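The classificationAugmentationPipeline helper function augments images for classification. dataIn is a cell array of images read from the datastore, and info is the accompanying structure that contains the labels. dataOut is a two-column cell array, where the first column contains the augmented network input images and the second column contains the corresponding categorical labels.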

function [dataOut,info] = classificationAugmentationPipeline(dataIn,info)

dataOut = cell([size(dataIn,1),2]);

for idx = 1:size(dataIn,1)
    temp = dataIn{idx};
    
    % Add randomized Gaussian blur
    temp = imgaussfilt(temp,1.5*rand);
    
    % Add salt and pepper noise
    temp = imnoise(temp,"salt & pepper");
    
    % Add randomized rotation and scale
    tform = randomAffine2d(Scale=[0.95,1.05],Rotation=[-30 30]);
    outputView = affineOutputView(size(temp),tform);
    temp = imwarp(temp,tform,OutputView=outputView);
    
    % Form a two-element cell array with the input image and expected response
    dataOut(idx,:) = {temp,info.Label(idx)};
end

end

The imageRegressionAugmentationPipeline helper function augments images for image-to-image regression. dataIn and dataOut are two-column cell arrays, where the first column contains the network input images and the second column contains the network response images.

function dataOut = imageRegressionAugmentationPipeline(dataIn)

dataOut = cell([size(dataIn,1),2]);
for idx = 1:size(dataIn,1)
    
    % Resize images to 32-by-32 pixels and convert to data type single
    inputImage = im2single(imresize(dataIn{idx,1},[32 32]));
    targetImage = im2single(imresize(dataIn{idx,2},[32 32]));
    
    % Add salt and pepper noise
    inputImage = imnoise(inputImage,"salt & pepper");
    
    % Add randomized rotation and scale
    tform = randomAffine2d(Scale=[0.9,1.1],Rotation=[-30 30]);
    outputView = affineOutputView(size(inputImage),tform);
    
    % Use imwarp with the same tform and outputView to augment both images
    % the same way
    inputImage = imwarp(inputImage,tform,OutputView=outputView);
    targetImage = imwarp(targetImage,tform,OutputView=outputView);
    
    dataOut(idx,:) = {inputImage,targetImage};
end

end
