# Lidar Point Cloud Semantic Segmentation Using SqueezeSegV2 Deep Learning Network

This example shows how to train a SqueezeSegV2 semantic segmentation network on 3-D organized lidar point cloud data.

SqueezeSegV2 [1] is a convolutional neural network (CNN) for performing end-to-end semantic segmentation of an organized lidar point cloud. The training procedure shown in this example requires 2-D spherical projected images as inputs to the deep learning network.

This example uses the PandaSet data set from Hesai and Scale [2]. PandaSet contains 4800 unorganized lidar point cloud scans of various city scenes captured using the Pandar 64 sensor. The data set provides semantic segmentation labels for 42 different classes, including car, road, and pedestrian.

This example uses a subset of PandaSet that contains 2560 preprocessed organized point clouds. Each point cloud is specified as a 64-by-1856 matrix. The corresponding ground truth contains the semantic segmentation labels for 12 classes. The point clouds are stored in PCD format, and the ground truth data is stored in PNG format. The size of the data set is 5.2 GB. Execute this code to download the data set.

```
url = "https://ssd.mathworks.com/supportfiles/lidar/data/Pandaset_LidarData.tar.gz";
outputFolder = fullfile(tempdir,"Pandaset");
lidarDataTarFile = fullfile(outputFolder,"Pandaset_LidarData.tar.gz");
if ~exist(lidarDataTarFile,"file")
    mkdir(outputFolder);
    disp("Downloading Pandaset Lidar driving data (5.2 GB)...");
    websave(lidarDataTarFile,url);
    untar(lidarDataTarFile,outputFolder);
end
% Check if tar.gz file is downloaded, but not uncompressed.
if (~exist(fullfile(outputFolder,"Lidar"),"file"))...
        &&(~exist(fullfile(outputFolder,"semanticLabels"),"file"))
    untar(lidarDataTarFile,outputFolder);
end
lidarData = fullfile(outputFolder,"Lidar");
labelsFolder = fullfile(outputFolder,"semanticLabels");
```

Depending on your Internet connection, the download process can take some time. The code suspends MATLAB® execution until the download process is complete. Alternatively, you can download the data set to your local disk using your web browser, and then extract the `Pandaset_LidarData` folder. The `Pandaset_LidarData` folder contains the `Lidar`, `Cuboids`, and `semanticLabels` folders, which hold the point clouds, cuboid labels, and semantic labels, respectively. To use the file you downloaded from the web, change the `outputFolder` variable in the code to the location of the downloaded file.
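If you extracted the data set manually, the path variables can be set along these lines. This is a minimal sketch: the location under `tempdir` is a hypothetical placeholder for wherever you extracted the `Pandaset_LidarData` folder.

```matlab
% Hypothetical location of a manually extracted copy of the data set.
% Replace it with the folder that contains Lidar and semanticLabels.
outputFolder = fullfile(tempdir,"Pandaset_LidarData");

% The rest of the example reads from these two subfolders.
lidarData = fullfile(outputFolder,"Lidar");
labelsFolder = fullfile(outputFolder,"semanticLabels");

% Warn early if the expected subfolders are missing.
if ~isfolder(lidarData) || ~isfolder(labelsFolder)
    warning("Expected Lidar and semanticLabels folders under %s.",outputFolder);
end
```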

The training procedure for this example is for organized point clouds. For an example showing how to convert unorganized to organized point clouds, see Unorganized to Organized Conversion of Point Clouds Using Spherical Projection (Lidar Toolbox).

Download the pretrained network to avoid having to wait for training to complete. If you want to train the network, set the `doTraining` variable to `true`.

```
doTraining = false;
pretrainedNetURL = ...
    "https://ssd.mathworks.com/supportfiles/lidar/data/trainedSqueezeSegV2PandasetNet.zip";
if ~doTraining
    downloadPretrainedSqueezeSegV2Net(outputFolder,pretrainedNetURL);
end
```
```
Downloading pretrained model (5 MB)...
```

### Prepare Data for Training

#### Load Lidar Point Clouds and Class Labels

Use the `helperTransformOrganizedPointCloudToTrainingData` supporting function, attached to this example, to generate training data from the lidar point clouds. The function uses point cloud data to create five-channel input images. Each training image is specified as a 64-by-1856-by-5 array:

• The height of each image is 64 pixels.

• The width of each image is 1856 pixels.

• Each image has five channels. The five channels specify the 3-D coordinates of the point cloud, intensity, and range: $r=\sqrt{x^{2}+y^{2}+z^{2}}$.
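The channel layout described above can be sketched on synthetic data. This is not the code of `helperTransformOrganizedPointCloudToTrainingData`, which is attached to the example and not shown here. The random coordinates and intensities are placeholders, and the channel order (x, y, z, intensity, range) follows the description in this section.

```matlab
% Synthetic stand-in for an organized 64-by-1856 point cloud.
rng(0);
xyz = 50*rand(64,1856,3,"single") - 25;     % x, y, z coordinates
intensity = 255*rand(64,1856,"single");     % per-point intensity

% Range channel: r = sqrt(x^2 + y^2 + z^2), computed per pixel.
r = sqrt(sum(xyz.^2,3));

% Stack the channels: 1-3 location, 4 intensity, 5 range.
im = cat(3,xyz,intensity,r);
disp(size(im))
```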

A visual representation of the training data follows.

Generate the five-channel training images.

```
imagesFolder = fullfile(outputFolder,"images");
helperTransformOrganizedPointCloudToTrainingData(lidarData,imagesFolder);
```
```
Preprocessing data 100% complete
```

The five-channel images are saved as MAT files.

Processing can take some time. The code suspends MATLAB® execution until processing is complete.

#### Create `imageDatastore` and `pixelLabelDatastore`

Create an `imageDatastore` object to load the five-channel 2-D spherical images. Because the images are stored as MAT files, specify the `helperImageMatReader` supporting function, a custom MAT file reader attached to this example, as the read function.

```
imds = imageDatastore(imagesFolder, ...
    "FileExtensions",".mat", ...
    "ReadFcn",@helperImageMatReader);
```
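The attached `helperImageMatReader` file is not reproduced in this example. As a rough sketch of how a custom MAT file reader plugs into `imageDatastore`, the following code writes a small array to a temporary MAT file and reads it back. The variable name `im` and the reader `exampleMatReader` are hypothetical, and the real helper may store and load its data differently.

```matlab
% Write a small five-channel array to a temporary MAT file.
tmpFile = [tempname '.mat'];
im = rand(4,6,5,"single");
save(tmpFile,"im");

% Hypothetical reader: load the variable named im and return it.
exampleMatReader = @(filename) getfield(load(filename,'im'),'im');

% Use the reader as the ReadFcn of an imageDatastore.
imdsExample = imageDatastore(tmpFile, ...
    "FileExtensions",".mat", ...
    "ReadFcn",exampleMatReader);
data = read(imdsExample);
disp(size(data))

delete(tmpFile);
```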

Create a pixel label datastore using `pixelLabelDatastore` (Computer Vision Toolbox) to store pixel-wise labels from the pixel label images. The object maps each pixel label to a class name. In this example, the vegetation, ground, road, road markings, sidewalk, cars, trucks, other vehicles, pedestrian, road barrier, signs, and buildings are the objects of interest; all other pixels are the background. Specify these classes and assign a unique label ID to each class.

```
classNames = ["unlabelled"
              "Vegetation"
              "Ground"
              "Road"
              "RoadMarkings"
              "SideWalk"
              "Car"
              "Truck"
              "OtherVehicle"
              "Pedestrian"
              "RoadBarriers"
              "Signs"
              "Buildings"];
numClasses = numel(classNames);
% Specify label IDs from 1 to the number of classes.
labelIDs = 1 : numClasses;
pxds = pixelLabelDatastore(labelsFolder,classNames,labelIDs);
```

Load and display one of the labeled images by overlaying it on the corresponding intensity image using the `helperDisplayLidarOverlaidImage` function, defined in the Supporting Functions section of this example.

```
% Point cloud (channels 1, 2, and 3 are for location, channel 4 is for intensity, and channel 5 is for range).
I = read(imds);
labelMap = read(pxds);
figure;
helperDisplayLidarOverlaidImage(I,labelMap{1,1},classNames);
title("Ground Truth");
```

#### Prepare Training, Validation, and Test Sets

Use the `helperPartitionLidarSegmentationDataset` supporting function, attached to this example, to split the data into training, validation, and test sets. The `trainingDataPercentage` argument specifies the fraction of the data to use for training. The function divides the rest of the data in a 2:1 ratio into validation and test data. The default value of `trainingDataPercentage` is `0.7`; this example uses `0.75`.

```
[imdsTrain,imdsVal,imdsTest,pxdsTrain,pxdsVal,pxdsTest] = ...
    helperPartitionLidarSegmentationDataset(imds,pxds,"trainingDataPercentage",0.75);
```
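To make the split sizes concrete, the following sketch computes index sets for 2560 files with a training fraction of 0.75 and a 2:1 validation-to-test ratio. It is only an illustration of the arithmetic. The attached `helperPartitionLidarSegmentationDataset` function is not shown here, so the shuffling and rounding choices below are assumptions.

```matlab
rng(0);                                 % reproducible shuffle
numFiles = 2560;
trainingDataPercentage = 0.75;

% Shuffle the file indices, then carve out the three sets.
shuffled = randperm(numFiles);
numTrain = round(trainingDataPercentage*numFiles);
numVal = round((numFiles - numTrain)*2/3);   % 2:1 validation:test

trainIdx = shuffled(1:numTrain);
valIdx = shuffled(numTrain+1:numTrain+numVal);
testIdx = shuffled(numTrain+numVal+1:end);

fprintf("train %d, validation %d, test %d\n", ...
    numel(trainIdx),numel(valIdx),numel(testIdx));
% prints: train 1920, validation 427, test 213
```

Index sets like these could then be passed to the `subset` function of a datastore to create the partitions.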

Use the `combine` function to combine the pixel label and image datastores for the training and validation data.

```
trainingData = combine(imdsTrain,pxdsTrain);
validationData = combine(imdsVal,pxdsVal);
```

#### Data Augmentation

Data augmentation is used to improve network accuracy by randomly transforming the original data during training. By using data augmentation, you can add more variety to the training data without actually having to increase the number of labeled training samples.

Augment the training data by using the `transform` function with custom preprocessing operations specified by the `helperAugmentData` function, defined in the Supporting Functions section of this example. This function randomly flips the multichannel 2-D image and associated labels in the horizontal direction. Apply data augmentation to only the training data set.

`augmentedTrainingData = transform(trainingData,@(x) helperAugmentData(x));`
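The key requirement of this augmentation is that the image and its labels are flipped together so that each pixel keeps its label. The sketch below shows this on a tiny synthetic example using plain indexing. The array sizes and class names are illustrative, and unlike `helperAugmentData`, which applies the reflection randomly, this sketch always flips.

```matlab
% Tiny five-channel image; the first channel marks the column position.
I = zeros(2,4,5);
I(:,:,1) = [1 2 3 4; 5 6 7 8];

% Matching pixel labels (illustrative class names).
labels = categorical([1 1 2 3; 1 2 2 3],1:3,["Road" "Car" "Sign"]);

% Flip both along the horizontal (column) dimension.
flippedI = I(:,end:-1:1,:);
flippedLabels = labels(:,end:-1:1);

disp(flippedI(:,:,1))
disp(flippedLabels)
```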

### Define Network Architecture

Create a standard SqueezeSegV2 [1] network by using the `squeezesegv2Layers` (Lidar Toolbox) function. In the SqueezeSegV2 network, the encoder subnetwork consists of FireModules interspersed with max-pooling layers. This arrangement successively decreases the resolution of the input image. In addition, the SqueezeSegV2 network uses the focal loss function to mitigate the effect of the imbalanced class distribution on network accuracy. For more details on how to use the focal loss function in semantic segmentation, see `focalLossLayer` (Computer Vision Toolbox).
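As a rough numerical illustration of why focal loss helps with class imbalance, the sketch below compares cross-entropy with the commonly used focal loss form, $-\alpha(1-p)^{\gamma}\log(p)$, where $p$ is the predicted probability of the true class. The values `alpha = 0.25` and `gamma = 2` are illustrative assumptions and are not taken from this example.

```matlab
alpha = 0.25;          % balancing factor (assumed value)
gamma = 2;             % focusing parameter (assumed value)

% Predicted probability of the true class: easy, medium, hard pixel.
p = [0.9 0.5 0.1];

crossEntropy = -log(p);
focalLoss = -alpha.*(1 - p).^gamma.*log(p);

% Fraction of the cross-entropy loss that each pixel keeps.
ratio = focalLoss./crossEntropy;
disp(ratio)
```

The well-classified pixel (`p = 0.9`) keeps a much smaller fraction of its loss than the poorly classified one (`p = 0.1`), so frequent, easy classes contribute less to the total loss.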

Execute this code to create a layer graph that can be used to train the network.

```
inputSize = [64 1856 5];
lgraph = squeezesegv2Layers(inputSize, ...
    numClasses,"NumEncoderModules",4,"NumContextAggregationModules",2);
```

Use the `analyzeNetwork` function to display an interactive visualization of the network architecture.

`analyzeNetwork(lgraph);`

### Specify Training Options

Use the Adam optimization algorithm to train the network. Use the `trainingOptions` function to specify the hyperparameters.

```
maxEpochs = 30;
initialLearningRate = 1e-3;
miniBatchSize = 8;
l2reg = 2e-4;

options = trainingOptions("adam", ...
    "InitialLearnRate",initialLearningRate, ...
    "L2Regularization",l2reg, ...
    "MaxEpochs",maxEpochs, ...
    "MiniBatchSize",miniBatchSize, ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",10, ...
    "ValidationData",validationData, ...
    "Plots","training-progress", ...
    "VerboseFrequency",20);
```

Note: Reduce the `miniBatchSize` value to control memory usage when training.
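With a piecewise schedule, the learning rate is multiplied by the drop factor each time the drop period elapses. The following sketch works out the per-epoch learning rate implied by the options above. It only illustrates the arithmetic and does not call the training functions.

```matlab
initialLearningRate = 1e-3;
dropFactor = 0.1;
dropPeriod = 10;
maxEpochs = 30;

% Learning rate in effect during each epoch.
epochs = 1:maxEpochs;
learnRate = initialLearningRate.*dropFactor.^floor((epochs - 1)./dropPeriod);

% Epochs 1-10 use 1e-3, epochs 11-20 use 1e-4, epochs 21-30 use 1e-5.
disp(learnRate([1 10 11 20 21 30]))
```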

### Train Network

You can train the network yourself by setting the `doTraining` variable to `true`. If `doTraining` is `false`, the code loads the pretrained network instead. If you train the network, you can use a CPU or a GPU. Using a GPU requires Parallel Computing Toolbox™ and a supported GPU device. For information on supported devices, see GPU Support by Release (Parallel Computing Toolbox).

```
if doTraining
    % Train on the augmented training data created in the Data Augmentation section.
    [net,info] = trainNetwork(augmentedTrainingData,lgraph,options);
else
    load(fullfile(outputFolder,"trainedSqueezeSegV2PandasetNet.mat"),"net");
end
```

### Predict Results on Test Point Cloud

Use the trained network to predict results on a test point cloud and display the segmentation result. First, read a five-channel input image and predict the labels using the trained network.

Display the figure with the segmentation as an overlay.

```
I = read(imdsTest);
predictedResult = semanticseg(I,net);
figure;
helperDisplayLidarOverlaidImage(I,predictedResult,classNames);
title("Semantic Segmentation Result");
```

Use the `helperDisplayLabelOverlaidPointCloud` function, defined in the Supporting Functions section of this example, to display the segmentation result on the point cloud.

```
figure;
helperDisplayLabelOverlaidPointCloud(I,predictedResult);
view([39.2 90.0 60]);
title("Semantic Segmentation Result on Point Cloud");
```

### Evaluate Network

Use the `evaluateSemanticSegmentation` (Computer Vision Toolbox) function to compute the semantic segmentation metrics from the test set results.

```
outputLocation = fullfile(tempdir,"output");
if ~exist(outputLocation,"dir")
    mkdir(outputLocation);
end
pxdsResults = semanticseg(imdsTest,net, ...
    "MiniBatchSize",4, ...
    "WriteLocation",outputLocation, ...
    "Verbose",false);

metrics = evaluateSemanticSegmentation(pxdsResults,pxdsTest,"Verbose",false);
```

You can measure the amount of overlap per class using the intersection-over-union (IoU) metric.
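For a single class, IoU is the number of correctly labeled pixels divided by the number of pixels that either the ground truth or the prediction assigns to that class: TP / (TP + FP + FN). The sketch below computes per-class IoU and accuracy from a small confusion matrix. The counts are made up for illustration and do not come from this data set.

```matlab
% Toy confusion matrix: rows are true classes, columns are predictions.
C = [50  5  0
      4 30  6
      1  2 20];

tp = diag(C);               % correctly labeled pixels per class
fp = sum(C,1)' - tp;        % predicted as the class but belonging elsewhere
fn = sum(C,2) - tp;         % belonging to the class but predicted elsewhere

iou = tp./(tp + fp + fn);   % intersection over union per class
accuracy = tp./(tp + fn);   % fraction of each class labeled correctly

disp(table(accuracy,iou))
```

IoU is never higher than accuracy for a class because false positives also count against it.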

The `evaluateSemanticSegmentation` (Computer Vision Toolbox) function returns metrics for the entire data set, for individual classes, and for each test image. To see the metrics at the data set level, use the `metrics.DataSetMetrics` property.

`metrics.DataSetMetrics`
```
ans=1×5 table
    GlobalAccuracy    MeanAccuracy    MeanIoU    WeightedIoU    MeanBFScore
    ______________    ____________    _______    ___________    ___________

       0.89724          0.61685       0.54431      0.81806        0.74537
```

The data set metrics provide a high-level overview of network performance. To see the impact each class has on the overall performance, inspect the metrics for each class using the `metrics.ClassMetrics` property.

`metrics.ClassMetrics`
```
ans=13×3 table
                    Accuracy      IoU      MeanBFScore
                    ________    _______    ___________

    unlabelled          0.94     0.9005      0.99911
    Vegetation       0.77873    0.64819      0.95466
    Ground           0.69019    0.59089      0.60657
    Road             0.94045    0.83663      0.99084
    RoadMarkings     0.37802    0.34149      0.77073
    SideWalk          0.7874    0.65668      0.93687
    Car               0.9334    0.81065      0.95448
    Truck            0.30352    0.27401      0.37273
    OtherVehicle     0.64397    0.58108      0.47253
    Pedestrian       0.26214    0.20896      0.45918
    RoadBarriers     0.23955    0.21971      0.19433
    Signs            0.17276    0.15613      0.44275
    Buildings        0.94891    0.85117      0.96929
```

Although the overall network performance is good, the class metrics for some classes like `RoadMarkings` and `Truck` indicate that more training data is required for better performance.
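To see which classes pull the average down, you can sort the per-class IoU values. The sketch below copies the IoU column from the table above into a small table instead of using the `metrics` object, so it runs on its own. It also shows that the reported `MeanIoU` is the unweighted mean of the per-class values.

```matlab
% IoU values copied from the class metrics table above.
className = ["unlabelled";"Vegetation";"Ground";"Road";"RoadMarkings"; ...
    "SideWalk";"Car";"Truck";"OtherVehicle";"Pedestrian"; ...
    "RoadBarriers";"Signs";"Buildings"];
iou = [0.9005;0.64819;0.59089;0.83663;0.34149;0.65668;0.81065; ...
    0.27401;0.58108;0.20896;0.21971;0.15613;0.85117];

T = table(className,iou,'VariableNames',["Class" "IoU"]);

% Sort ascending so that the weakest classes come first.
sorted = sortrows(T,"IoU");
disp(sorted(1:5,:))

% The unweighted mean matches the data set level MeanIoU (0.54431).
meanIoU = mean(iou);
fprintf("Mean IoU: %.5f\n",meanIoU);
```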

### Supporting Functions

#### Function to Augment Data

The `helperAugmentData` function randomly flips the spherical image and associated labels in the horizontal direction.

```
function out = helperAugmentData(inp)
% Apply random horizontal flipping.
out = cell(size(inp));

% Randomly flip the five-channel image and pixel labels horizontally.
I = inp{1};
sz = size(I);
tform = randomAffine2d("XReflection",true);
rout = affineOutputView(sz,tform,"BoundsStyle","centerOutput");

out{1} = imwarp(I,tform,"OutputView",rout);
out{2} = imwarp(inp{2},tform,"OutputView",rout);
end
```

#### Function to Display Lidar Segmentation Map Overlaid on 2-D Spherical Image

The `helperDisplayLidarOverlaidImage` function overlays the semantic segmentation map over the intensity channel of the 2-D spherical image. The function also resizes the overlaid image for better visualization.

```
function helperDisplayLidarOverlaidImage(lidarImage,labelMap,classNames)
% helperDisplayLidarOverlaidImage(lidarImage, labelMap, classNames)
% displays the overlaid image. lidarImage is a five-channel lidar input.
% labelMap contains pixel labels and classNames is an array of label
% names.

% Read the intensity channel from the lidar image.
intensityChannel = uint8(lidarImage(:,:,4));

% Load the lidar color map.
cmap = helperPandasetColorMap;

% Overlay the labels over the intensity image.
B = labeloverlay(intensityChannel,labelMap,"Colormap",cmap,"Transparency",0.4);

% Resize for better visualization.
B = imresize(B,"Scale",[3 1],"method","nearest");
imshow(B);

helperPixelLabelColorbar(cmap,classNames);
end
```

#### Function to Display Lidar Segmentation Map Overlaid on 3-D Point Cloud

The `helperDisplayLabelOverlaidPointCloud` function overlays the segmentation result over a 3-D organized point cloud.

```
function helperDisplayLabelOverlaidPointCloud(I,predictedResult)
% helperDisplayLabelOverlaidPointCloud(I, predictedResult)
% displays the overlaid pointCloud object. I is the five-channel organized
% input image. predictedResult contains pixel labels.

ptCloud = pointCloud(I(:,:,1:3),"Intensity",I(:,:,4));
cmap = helperPandasetColorMap;
B = ...
    labeloverlay(uint8(ptCloud.Intensity),predictedResult,"Colormap",cmap,"Transparency",0.4);
pc = pointCloud(ptCloud.Location,"Color",B);
figure;
ax = pcshow(pc);
set(ax,"XLim",[-70 70],"YLim",[-70 70]);
zoom(ax,3.5);
end
```

#### Function to Define Lidar Colormap

The `helperPandasetColorMap` function defines the colormap used by the lidar data set.

```
function cmap = helperPandasetColorMap

cmap = [[30 30 30];      % Unlabeled
        [0 255 0];       % Vegetation
        [255 150 255];   % Ground
        [255 0 255];     % Road
        [255 0 0];       % Road Markings
        [90 30 150];     % Sidewalk
        [245 150 100];   % Car
        [250 80 100];    % Truck
        [150 60 30];     % Other Vehicle
        [255 255 0];     % Pedestrian
        [0 200 255];     % Road Barriers
        [170 100 150];   % Signs
        [30 30 255]];    % Building

cmap = cmap./255;
end
```

#### Function to Display Pixel Label Colorbar

The `helperPixelLabelColorbar` function adds a colorbar to the current axis. The colorbar is formatted to display the class names with the color.

```
function helperPixelLabelColorbar(cmap,classNames)

colormap(gca,cmap);

% Add a colorbar to the current figure.
c = colorbar("peer",gca);

% Use class names for tick marks.
c.TickLabels = classNames;
% Count the classes with numel so that row and column arrays both work.
numClasses = numel(classNames);

% Center tick labels.
c.Ticks = 1/(numClasses*2):1/numClasses:1;

% Remove tick marks.
c.TickLength = 0;
end
```
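The tick expression places one tick at the center of each color band on a colorbar that spans 0 to 1. A small worked case with four classes, chosen only for illustration, makes the spacing easy to check.

```matlab
numClasses = 4;                       % illustrative class count

% Each band has width 1/numClasses; the first center is half a band in.
ticks = 1/(numClasses*2):1/numClasses:1;
disp(ticks)
% displays 0.1250 0.3750 0.6250 0.8750
```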

#### Function to Download Pretrained Model

The `downloadPretrainedSqueezeSegV2Net` function downloads the pretrained model.

```
function downloadPretrainedSqueezeSegV2Net(outputFolder,pretrainedNetURL)
preTrainedMATFile = fullfile(outputFolder,"trainedSqueezeSegV2PandasetNet.mat");
preTrainedZipFile = fullfile(outputFolder,"trainedSqueezeSegV2PandasetNet.zip");

if ~exist(preTrainedMATFile,"file")
    if ~exist(preTrainedZipFile,"file")
        disp("Downloading pretrained model (5 MB)...");
        websave(preTrainedZipFile,pretrainedNetURL);
    end
    unzip(preTrainedZipFile,outputFolder);
end
end
```