Semantic Segmentation in Point Clouds Using Deep Learning
Segmentation is a fundamental step in processing 3-D point clouds. Point cloud semantic segmentation, or classification, is the process of associating each point in a point cloud with a semantic label, such as car, truck, ground, or vegetation.
Segmentation clusters points with similar characteristics into homogeneous regions. These regions correspond to specific structures or objects in a point cloud scene.
These are the major approaches to point cloud semantic segmentation:
Classify each point or a point cluster based on individual features by using feature extraction and neighborhood selection.
Extract point statistics and spatial information from the point cloud to classify the points using statistical and contextual modeling.
Applications for segmentation include urban planning, oceanography, forestry, autonomous driving, robotics, and navigation.
Deep Learning-Based Segmentation
Deep learning is an efficient approach to point cloud semantic segmentation in which you train a network to segment and classify points by extracting features from the input data. To get started on deep learning with point clouds, see Deep Learning with Point Clouds.
You cannot apply standard convolutional neural networks (CNNs) used for image segmentation to raw point clouds due to the unordered, sparse, and unstructured nature of point cloud data. In most cases, you must transform raw point clouds before feeding them as an input to a segmentation network.
These are the categories of deep learning segmentation methods, based on how you format the input to the network:
Multiview-based methods — Reduce 3-D point clouds to 2-D projected images and process them directly using 2-D CNNs. After classification you must postprocess the results to restore the 3-D structure.
Voxel-based methods — Input point cloud data as voxels and use the standard 3-D CNNs. This addresses the unordered and unstructured nature of raw point cloud data.
Point-based methods — Directly apply the deep learning network on individual points.
Out of these, PointNet is the most popular point-based deep learning framework. This network uses multi-layer perceptrons (MLP) to extract local features from individual points, then aggregates all local features into global features using maxpooling. PointNet concatenates the aggregated global and local features into combined point features, and then extracts new features from the combined point features by using MLPs. The network predicts semantic labels based on the new features.
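The per-point MLP and symmetric max-pooling at the core of this design can be illustrated with plain matrix operations. This is a conceptual sketch with random, illustrative weights, not the actual PointNet implementation:

```matlab
% Conceptual sketch of PointNet-style feature aggregation (illustrative only).
numPoints = 1024;
pts = rand(numPoints,3);                 % point cloud as an N-by-3 matrix
W1 = randn(3,64); b1 = randn(1,64);      % shared per-point "MLP" weights
localFeat = max(pts*W1 + b1,0);          % per-point local features (ReLU), N-by-64
globalFeat = max(localFeat,[],1);        % order-invariant max-pool, 1-by-64
combined = [localFeat repmat(globalFeat,numPoints,1)];  % N-by-128 combined features
```

Because max-pooling is symmetric, the global feature does not depend on the order of the points, which is what lets the network consume unordered point sets.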
PointNet++ improves on the basic PointNet model by additionally capturing local features hierarchically. For more information on PointNet++, see Get Started with PointNet++.
Create Training Data for Semantic Segmentation
Lidar Toolbox™ provides functions to import and read raw point cloud data from several file formats. For more information, see I/O.
The toolbox enables you to divide this data into training and test data sets, and store
them as datastore objects. For example, you can store point cloud files by using the
fileDatastore object. For more information on datastore objects, see Datastores for Deep Learning (Deep Learning Toolbox).
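For example, a fileDatastore that reads PCD files with the pcread function might look like this sketch; the folder path is a placeholder:

```matlab
% Sketch: create a datastore over point cloud files; the path is a placeholder.
pcds = fileDatastore("pointCloudData","ReadFcn",@pcread,"FileExtensions",".pcd");
ptCloud = read(pcds);   % reads the first point cloud in the datastore
```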
The Import Point Cloud Data For Deep Learning example shows you how to import a large point cloud data set, and then configure and load a datastore.
You need a large, labeled data set to train a deep learning network. If you have an unlabeled data set, you can use the Lidar Labeler app to label your training data. For information on how to use the app, see Get Started with the Lidar Labeler.
You can preprocess your data before training the network. Lidar Toolbox provides functions to perform various preprocessing tasks.
Denoise, downsample, and filter point clouds.
Convert unorganized point cloud data into an organized format.
Divide aerial point cloud data into blocks to perform block-by-block processing.
For more information on preprocessing, see Lidar Processing Applications (Deep Learning Toolbox).
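A typical preprocessing sequence might look like the following sketch; the file name and grid step are placeholders:

```matlab
% Sketch of common preprocessing steps (file name and parameters are placeholders).
ptCloud = pcread("lidarScan.pcd");                  % read a raw point cloud
ptCloud = pcdenoise(ptCloud);                       % remove outlier points
ptCloud = pcdownsample(ptCloud,"gridAverage",0.1);  % downsample on a 0.1 m grid
```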
Data augmentation adds variety to the existing training data. The robustness of a network to data transformations increases when you train it on a data set with a lot of variety.
Augmentation techniques reduce overfitting problems and enable the network to better learn and infer features.
For more information on how to perform data augmentation on point clouds, see Data Augmentations for Lidar Object Detection Using Deep Learning.
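For instance, one common augmentation is a random rotation about the vertical axis. This is a minimal sketch using pctransform; the input here is random placeholder data, and rigidtform3d requires R2022b or later:

```matlab
% Sketch: augment a point cloud with a random rotation about the z-axis.
ptCloud = pointCloud(rand(1000,3));   % placeholder point cloud
theta = 360*rand;                     % random yaw angle, in degrees
R = [cosd(theta) -sind(theta) 0; sind(theta) cosd(theta) 0; 0 0 1];
tform = rigidtform3d(R,[0 0 0]);      % rigid transform: rotation, no translation
augmented = pctransform(ptCloud,tform);
```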
Create Semantic Segmentation Network
Define your network based on the network input and the layers.
Lidar Toolbox provides functions to create segmentation networks.
You can create a custom network layer-by-layer programmatically. For a list of supported
layers and how to create them, see the List of Deep Learning Layers (Deep Learning Toolbox). To visualize the network
architecture, use the analyzeNetwork (Deep Learning Toolbox) function.
You can also design a custom network interactively by using the Deep Network Designer (Deep Learning Toolbox).
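For example, you might assemble and inspect a small layer array as in this sketch; the input size and number of classes are illustrative, and this is not a complete segmentation network:

```matlab
% Illustrative layer array (not a full segmentation network).
layers = [
    imageInputLayer([64 1024 5])              % e.g., a 2-D projected point cloud
    convolution2dLayer(3,32,"Padding","same")
    reluLayer
    convolution2dLayer(1,4)                   % 4 output classes (illustrative)
    softmaxLayer];
analyzeNetwork(layers)                        % visualize the architecture
```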
Segment Point Clouds and Evaluate Results
Segment Point Cloud
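For projection-based networks such as SqueezeSegV2, inference typically runs the semanticseg function on the projected image. In this sketch, net is an assumed pretrained network and projectToImage is a hypothetical helper for the 2-D spherical projection; neither is defined here:

```matlab
% Sketch only: net and projectToImage are assumptions, not shipped functions.
ptCloud = pcread("scan.pcd");    % placeholder file name
I = projectToImage(ptCloud);     % hypothetical 2-D spherical projection
labels = semanticseg(I,net);     % per-pixel labels correspond to points
```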
Evaluate Segmentation Results
Evaluate the segmentation results by using the evaluateSemanticSegmentation function.
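If you prefer to compute metrics directly, point-level accuracy and per-class intersection-over-union (IoU) can be sketched with basic operations on label arrays; the labels here are random placeholders:

```matlab
% Sketch: accuracy and per-class IoU from predicted and ground-truth labels.
classes = ["ground" "building" "vegetation"];
truth = categorical(randi(3,1000,1),1:3,classes);   % placeholder ground truth
pred  = categorical(randi(3,1000,1),1:3,classes);   % placeholder predictions
accuracy = mean(pred == truth);                     % fraction of correct points
iou = zeros(1,numel(classes));
for c = 1:numel(classes)
    inter  = sum(pred == classes(c) & truth == classes(c));
    uni    = sum(pred == classes(c) | truth == classes(c));
    iou(c) = inter/uni;
end
```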
Pretrained Segmentation Models in Lidar Toolbox
You can use these pretrained semantic segmentation models to perform point cloud segmentation and classification.
|Pretrained Model|Description|Load Pretrained Model|Example|
|---|---|---|---|
|PointNet++|PointNet++ is a hierarchical neural network that captures local geometric features to improve the basic PointNet model. For more information, see Get Started with PointNet++.|Load the pretrained model trained on the DALES data set.|Aerial Lidar Semantic Segmentation Using PointNet++ Deep Learning|
|SqueezeSegV2|SqueezeSegV2 is a CNN for performing end-to-end semantic segmentation of an organized lidar point cloud. Training this network requires 2-D spherical projected images as inputs.|Download the pretrained model trained on the PandaSet data set from Hesai and Scale: https://www.mathworks.com/supportfiles/lidar/data/trainedPointSegNet.mat|Lidar Point Cloud Semantic Segmentation Using SqueezeSegV2 Deep Learning Network|
|PointSeg|PointSeg is a CNN for performing end-to-end semantic segmentation of road objects based on an organized lidar point cloud. Training this network requires 2-D spherical projected images as inputs.|Download the pretrained model trained on a highway scene data set from an Ouster OS1 sensor: https://www.mathworks.com/supportfiles/lidar/data/trainedPointSegNet.mat|Lidar Point Cloud Semantic Segmentation Using PointSeg Deep Learning Network|
To learn how to generate CUDA® code for a segmentation workflow, see these examples.
This functionality requires a Deep Learning Toolbox™ license.