Computer Vision

Extend deep learning workflows with computer vision applications

Apply deep learning to computer vision applications by using Deep Learning Toolbox™ together with Computer Vision Toolbox™.

Apps

Image Labeler	Label images for computer vision applications
Video Labeler	Label video for computer vision applications

Functions

Datastores for Training Data

`boxLabelDatastore`	Datastore for bounding box label data
`pixelLabelDatastore`	Datastore for pixel label data
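
As a minimal sketch of how a box label datastore holds training labels, the snippet below builds one from a small in-memory table. The class name `vehicle` and the box coordinates are made-up illustration values, not data from any shipped example, and the boxes are assumed to be in [x y width height] format.

```matlab
% Hypothetical labels for two images: one cell per image, one row per box.
% Each box is [x y width height] in pixels (illustrative values only).
vehicle = {[10 20 50 40]; [30 30 60 45; 100 80 40 40]};
labelTable = table(vehicle);

% Create the datastore; the table variable name becomes the class label.
blds = boxLabelDatastore(labelTable);

% Read the labels for the first image: boxes in cell 1, labels in cell 2.
data = read(blds);
boxes = data{1};
labels = data{2};
```

In a training workflow, a datastore like this is typically paired with an image datastore by using `combine`, so that each read returns an image together with its boxes and labels.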

ViT (Vision Transformer)

`visionTransformer`	Pretrained vision transformer (ViT) neural network (Since R2023b)
`patchEmbeddingLayer`	Patch embedding layer (Since R2023b)

Semantic Segmentation

`unet`	Create U-Net convolutional neural network for semantic segmentation (Since R2024a)
`unet3d`	Create 3-D U-Net convolutional neural network for semantic segmentation of volumetric images (Since R2024a)
`deeplabv3plus`	Create DeepLab v3+ convolutional neural network for semantic image segmentation (Since R2024a)
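
The following sketch shows one way to create an untrained U-Net with `unet`. The input size, class count, and encoder depth are assumptions chosen for illustration, and the code assumes R2024a or later with both toolboxes installed.

```matlab
% Assumed input size (height, width, channels) and number of classes.
imageSize = [256 256 3];
numClasses = 2;

% Create an untrained U-Net; EncoderDepth sets the number of downsampling stages.
net = unet(imageSize, numClasses, EncoderDepth=3);

% Print a summary of the network (initialization state, learnables, inputs).
summary(net)
```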

Object Detection

`rtmdetObjectDetector`	Detect objects using RTMDet object detector (Since R2024b)
`yolov4ObjectDetector`	Detect objects using YOLO v4 object detector (Since R2022a)
`yolov2ObjectDetector`	Detect objects using YOLO v2 object detector
`yolov3ObjectDetector`	Detect objects using YOLO v3 object detector (Since R2021a)
`ssdObjectDetector`	Detect objects using SSD deep learning detector
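
As a hedged illustration of the detector objects listed above, the sketch below loads a pretrained YOLO v4 detector and runs it on a sample image. It assumes that the Computer Vision Toolbox Model for YOLO v4 Object Detection support package is installed and that the sample image `highway.png` is on the MATLAB path; substitute your own image if it is not.

```matlab
% Load a pretrained tiny YOLO v4 detector trained on the COCO data set
% (requires the YOLO v4 support package).
detector = yolov4ObjectDetector("tiny-yolov4-coco");

% Read a test image (assumed to be available on the MATLAB path).
I = imread("highway.png");

% Detect objects; bboxes is M-by-4 in [x y width height] format.
[bboxes, scores, labels] = detect(detector, I);

% Overlay the detections and display the result.
annotated = insertObjectAnnotation(I, "rectangle", bboxes, labels);
imshow(annotated)
```

The other detector objects in this group expose the same `detect` function, so the calling pattern carries over once the detector is created.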

Instance Segmentation

`solov2`	Segment objects using SOLOv2 instance segmentation network (Since R2023b)
`maskrcnn`	Detect objects using Mask R-CNN instance segmentation (Since R2021b)

Pose Estimation

`posemaskrcnn`	Predict object pose using Pose Mask R-CNN pose estimation (Since R2024a)

Object Tracking and Re-Identification

`reidentificationNetwork`	Re-identification deep learning network for re-identifying and tracking objects (Since R2024a)

Automated Visual Inspection

`yoloxObjectDetector`	Detect objects using YOLOX object detector (Since R2023b)
`efficientADAnomalyDetector`	Detect anomalies using EfficientAD network (Since R2024b)
`patchCoreAnomalyDetector`	Detect anomalies using PatchCore network (Since R2023a)
`fcddAnomalyDetector`	Detect anomalies using fully convolutional data description (FCDD) network for anomaly detection (Since R2022b)
`fastFlowAnomalyDetector`	Detect anomalies using FastFlow network (Since R2023a)

Text Detection and Recognition

`detectTextCRAFT`	Detect text in images by using the CRAFT deep learning model (Since R2022a)
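
A minimal sketch of text detection with `detectTextCRAFT` follows. It assumes that the Computer Vision Toolbox Model for Text Detection support package is installed and that the sample image `handicapSign.jpg` is on the MATLAB path; any RGB image containing text can be used instead.

```matlab
% Read an image that contains text (assumed sample image).
I = imread("handicapSign.jpg");

% Detect text regions; each row of bboxes is [x y width height].
bboxes = detectTextCRAFT(I);

% Draw the detected regions on the image and display the result.
Iout = insertShape(I, "rectangle", bboxes, LineWidth=4);
imshow(Iout)
```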

Topics

Image Classification

Train Vision Transformer Network for Image Classification
This example shows how to fine-tune a pretrained vision transformer (ViT) neural network to perform classification on a new collection of images.

Object Detection and Instance Segmentation

Get Started with Object Detection Using Deep Learning (Computer Vision Toolbox)
Perform object detection using deep learning neural networks such as YOLOX, YOLO v4, and SSD.
Get Started with Instance Segmentation Using Deep Learning (Computer Vision Toolbox)
Segment objects using an instance segmentation model such as SOLOv2 or Mask R-CNN.
Choose an Object Detector (Computer Vision Toolbox)
Compare object detection deep learning models, such as YOLOX, YOLO v4, RTMDet, and SSD.
Augment Bounding Boxes for Object Detection
This example shows how to perform common kinds of image and bounding box augmentation as part of object detection workflows.
Import Pretrained ONNX YOLO v2 Object Detector
This example shows how to import a pretrained ONNX™ (Open Neural Network Exchange) you only look once (YOLO) v2 [1] object detection network and use the network to detect objects.
Export YOLO v2 Object Detector to ONNX
This example shows how to export a YOLO v2 object detection network to ONNX™ (Open Neural Network Exchange) model format.
Deploy Object Detection Model as Microservice (MATLAB Compiler SDK)
Use a microservice to detect objects in images.

Automated Visual Inspection

Getting Started with Anomaly Detection Using Deep Learning (Computer Vision Toolbox)
Anomaly detection using deep learning is an increasingly popular approach to automating visual inspection tasks.
Detect Image Anomalies Using Explainable FCDD Network (Computer Vision Toolbox)
Use an anomaly detector to distinguish between normal pills and pills with anomalous chips or contamination.
Classify Defects on Wafer Maps Using Deep Learning (Computer Vision Toolbox)
Classify manufacturing defects on wafer maps using a simple convolutional neural network (CNN).
Detect Image Anomalies Using Pretrained ResNet-18 Feature Embeddings (Computer Vision Toolbox)
Train a similarity-based anomaly detector using one-class learning of feature embeddings extracted from a pretrained ResNet-18 convolutional neural network.
Localize Industrial Defects Using PatchCore Anomaly Detector (Computer Vision Toolbox)
Perform localization of anomalous defects in printed circuit boards (PCBs) using anomaly heat maps generated with the PatchCore anomaly detector.

Semantic Segmentation

Get Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox)
Segment objects by class using deep learning networks such as U-Net and DeepLab v3+.
Augment Pixel Labels for Semantic Segmentation
This example shows how to perform common kinds of image and pixel label augmentation as part of semantic segmentation workflows.
Semantic Segmentation Using Dilated Convolutions
This example shows how to train a semantic segmentation network using dilated convolutions.
Semantic Segmentation of Multispectral Images Using Deep Learning (Computer Vision Toolbox)
This example shows how to perform semantic segmentation of a multispectral image with seven channels using U-Net.
Explore Semantic Segmentation Network Using Grad-CAM
This example shows how to explore the predictions of a pretrained semantic segmentation network using Grad-CAM.
Generate Adversarial Examples for Semantic Segmentation (Computer Vision Toolbox)
Generate adversarial examples for a semantic segmentation network using the basic iterative method (BIM).
Prune and Quantize Semantic Segmentation Network
Reduce the memory footprint of a semantic segmentation network and speed up inference by compressing the network using pruning and quantization.

Video Classification

Activity Recognition from Video and Optical Flow Data Using Deep Learning
This example first shows how to perform activity recognition using a pretrained Inflated 3-D (I3D) two-stream convolutional neural network based video classifier and then shows how to use transfer learning to train such a video classifier using RGB and optical flow data from videos [1].
Gesture Recognition using Videos and Deep Learning
Perform gesture recognition using a pretrained SlowFast video classifier.

Featured Examples

Identify Defects in Air Compressors Using Spectrogram Images

Detect and localize defects in acoustic recordings of air compressors using Mel spectrogram images and an EfficientAD anomaly detector.

(Computer Vision Toolbox)

Since R2025a

Detect Small Objects Using Tiled Training of YOLOX Network

Detect small objects in full-resolution images using tiled training of a you only look once version X (YOLOX) deep learning network.

(Computer Vision Toolbox)

Since R2024b

Automatically Label Ground Truth Using Segment Anything Model

Produce pixel labels for semantic segmentation using the Segment Anything Model (SAM) in the Image Labeler app. SAM is an automatic segmentation technique that you can use to segment and label object regions with a few clicks, or to segment the entire image automatically and then create labels for selected regions. In this example, you interactively label pixels for semantic segmentation in two ways.

(Computer Vision Toolbox)

Since R2024b

Detect Defects Using Tiled Training of EfficientAD Anomaly Detector

Detect and localize defects on anomalous chewing gum images by training an EfficientAD anomaly detection network on tiled normal images.

(Computer Vision Toolbox)

Since R2024b

Localize Industrial Defects Using PatchCore Anomaly Detector

Perform localization of anomalous defects in printed circuit boards (PCBs) using anomaly heat maps generated with the PatchCore anomaly detector.

(Computer Vision Toolbox)

Detect Defects on Printed Circuit Boards Using YOLOX Network

Detect, localize, and classify defects in printed circuit boards (PCBs) using a you only look once version X (YOLOX) deep learning network.

(Computer Vision Toolbox)

Perform 6-DoF Pose Estimation for Bin Picking Using Deep Learning

Perform six degrees-of-freedom (6-DoF) pose estimation by estimating the 3-D position and orientation of machine parts in a bin using RGB-D images and a deep learning network.

Reidentify People Throughout a Video Sequence Using ReID Network

Track people throughout a video sequence using re-identification with a residual network.

Perform Instance Segmentation Using SOLOv2

Segment object instances of randomly rotated machine parts in a bin using a deep learning SOLOv2 network.

(Computer Vision Toolbox)

Object Detection Using YOLO v2 Deep Learning

Train a you only look once (YOLO) v2 object detector.

Object Detection Using SSD Deep Learning

Train a Single Shot Detector (SSD).

Object Detection Using YOLO v4 Deep Learning

Detect objects in images using a you only look once version 4 (YOLO v4) deep learning network.

Perform Instance Segmentation Using Mask R-CNN

Segment individual instances of people and cars using a multiclass mask region-based convolutional neural network (R-CNN).

Semantic Segmentation Using Deep Learning

Segment an image using a semantic segmentation network.

Generate Image from Segmentation Map Using Deep Learning

Generate a synthetic image of a scene from a semantic segmentation map.

Estimate Body Pose Using Deep Learning

Estimate the body pose of one or more people using the OpenPose algorithm.

Activity Recognition from Video and Optical Flow Data Using Deep Learning

Perform activity recognition using a pretrained Inflated 3-D (I3D) two-stream convolutional neural network based video classifier, then use transfer learning to train such a video classifier using RGB and optical flow data from videos [1].

Gesture Recognition using Videos and Deep Learning

Perform gesture recognition using a pretrained SlowFast video classifier.
