Lane Detection Optimized with GPU Coder

This example uses:

This example shows how to develop a deep learning lane detection application that runs on NVIDIA® GPUs.

The pretrained lane detection network can detect and output lane marker boundaries from an image and is based on the AlexNet network. The last few layers of the AlexNet network are replaced by a smaller fully connected layer and regression output layer. The example generates a CUDA® executable that runs on a CUDA-enabled GPU on the host machine.

Third-Party Prerequisites

CUDA enabled NVIDIA GPU.
NVIDIA CUDA toolkit and driver.
NVIDIA cuDNN library.
Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-Party Hardware. For setting up the environment variables, see Setting Up the Prerequisite Products.

Verify GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

Get Pretrained Lane Detection Network

This example uses the trainedLaneNet MAT-file containing the pretrained lane detection network. This file is approximately 143 MB size. Download the file from the MathWorks® website.

laneNetFile = matlab.internal.examples.downloadSupportFile('gpucoder/cnn_models/lane_detection', ...
    'trainedLaneNet.mat');

This network takes an image as an input and outputs two lane boundaries that correspond to the left and right lanes of the ego vehicle. Each lane boundary is represented by the parabolic equation: $y = a x^{2} + b x + c$ , where y is the lateral offset and x is the longitudinal distance from the vehicle. The network outputs the three parameters a, b, and c per lane. The network architecture is similar to AlexNet except that the last few layers are replaced by a smaller fully connected layer and regression output layer.

load(laneNetFile);
disp(laneNet)

  SeriesNetwork with properties:

         Layers: [23×1 nnet.cnn.layer.Layer]
     InputNames: {'data'}
    OutputNames: {'output'}

Convert the SeriesNetwork objects to dlnetwork objects and save it to a different MAT-file.

dlLaneNet = dag2dlnetwork(laneNet);
dlLaneNetFile = 'trainedDlLaneNet.mat';
save(dlLaneNetFile,'dlLaneNet');

Download Test Video

To test the model, the example uses the a video file from the Caltech lanes dataset. The file is approximately 8 MB in size. Download the file from the MathWorks website.

videoFile = matlab.internal.examples.downloadSupportFile('gpucoder/media','caltech_cordova1.avi');

Main Entry-Point Function

The detectLanesInVideo.m file is the main entry-point function for code generation. The detectLanesInVideo function uses the VideoReader object to read frames from the input video, calls the predict method of the LaneNet network object, and draws the detected lanes on the input video. A vision.DeployableVideoPlayer (Computer Vision Toolbox) system object is used to display the lane detected video output.

type detectLanesInVideo.m

function detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds)
%   Copyright 2022-2024 The MathWorks, Inc.

%#codegen

%% Create Video Reader and Video Player Object 
videoFReader   = VideoReader(videoFile);
depVideoPlayer = vision.DeployableVideoPlayer(Name='Lane Detection on GPU');
videoHeight = videoFReader.Height;
videoWidth = videoFReader.Width;
%% Video Frame Processing Loop
while hasFrame(videoFReader)
    videoFrame = imresize(readFrame(videoFReader),[videoHeight videoWidth]);
    scaledFrame = imresize(videoFrame,[227 227]);

    [laneFound,ltPts,rtPts] = laneNetPredict(net,scaledFrame, ...
        laneCoeffMeans,laneCoeffsStds);
    lnaeFound = 1;
    if(laneFound)
        pts = [reshape(ltPts',1,[]);reshape(rtPts',1,[])];
        videoFrame = insertShape(videoFrame, 'Line', pts, 'LineWidth', 4);
    
    depVideoPlayer(videoFrame);
    end
end

LaneNet Predict Function

The laneNetPredict function computes the right and left lane positions in a single video frame. The laneNet network computes parameters a, b, and c that describe the parabolic equation for the left and right lane boundaries. From these parameters, compute the x and y coordinates corresponding to the lane positions. The coordinates must be mapped to image coordinates.

type laneNetPredict.m

function [laneFound,ltPts,rtPts] = laneNetPredict(net,frame,means,stds) 
%#codegen

% laneNetPredict Predict lane markers on the input image frame using the
% lane detection network

%   Copyright 2017-2024 The MathWorks, Inc.

dlFrame = dlarray(single(frame),'SSC');
persistent dllaneNet;
if isempty(dllaneNet)
    dllaneNet = coder.loadDeepLearningNetwork(net, 'dllaneNet');
end

dllanecoeffsNetworkOutput = predict(dllaneNet,dlFrame);
lanecoeffsNetworkOutput = extractdata(dllanecoeffsNetworkOutput);

% Recover original coeffs by reversing the normalization steps.
params = lanecoeffsNetworkOutput' .* stds + means;

% 'c' should be more than 0.5 for it to be a lane.
isRightLaneFound = abs(params(6)) > 0.5;
isLeftLaneFound =  abs(params(3)) > 0.5;

% From the networks output, compute left and right lane points in the image
% coordinates.
vehicleXPoints = 3:30;
ltPts = coder.nullcopy(zeros(28,2,'single'));
rtPts = coder.nullcopy(zeros(28,2,'single'));

if isRightLaneFound && isLeftLaneFound
    rtBoundary = params(4:6);
    rt_y = computeBoundaryModel(rtBoundary, vehicleXPoints);
    
    ltBoundary = params(1:3);
    lt_y = computeBoundaryModel(ltBoundary, vehicleXPoints);

    % Visualize lane boundaries of the ego vehicle.
    tform = get_tformToImage;

    % Map vehicle to image coordinates.
    ltPts =  tform.transformPointsInverse([vehicleXPoints', lt_y']);
    rtPts =  tform.transformPointsInverse([vehicleXPoints', rt_y']);
    laneFound = true;
else
    laneFound = false;
end
end

%% Helper Functions

% Compute boundary model.
function yWorld = computeBoundaryModel(model, xWorld)
yWorld = polyval(model, xWorld);
end

% Compute extrinsics.
function tform = get_tformToImage

%The camera coordinates are described by the caltech mono
% camera model.
yaw = 0;
pitch = 14; % Pitch of the camera in degrees
roll = 0;

translation = translationVector(yaw, pitch, roll);
rotation    = rotationMatrix(yaw, pitch, roll);

% Construct a camera matrix.
focalLength    = [309.4362, 344.2161];
principalPoint = [318.9034, 257.5352];
Skew = 0;

camMatrix = [rotation; translation] * intrinsicMatrix(focalLength, ...
    Skew, principalPoint);

% Turn camMatrix into 2-D homography.
tform2D = [camMatrix(1,:); camMatrix(2,:); camMatrix(4,:)]; % drop Z

tform = projective2d(tform2D);
tform = tform.invert();
end

% Translate to image co-ordinates.
function translation = translationVector(yaw, pitch, roll)
SensorLocation = [0 0];
Height = 2.1798;    % mounting height in meters from the ground
rotationMatrix = (...
    rotZ(yaw)*... % last rotation
    rotX(90-pitch)*...
    rotZ(roll)... % first rotation
    );


% Adjust for the SensorLocation by adding a translation.
sl = SensorLocation;

translationInWorldUnits = [sl(2), sl(1), Height];
translation = translationInWorldUnits*rotationMatrix;
end

% Rotation around X-axis.
function R = rotX(a)
a = deg2rad(a);
R = [...
    1   0        0;
    0   cos(a)  -sin(a);
    0   sin(a)   cos(a)];

end

% Rotation around Y-axis.
function R = rotY(a)
a = deg2rad(a);
R = [...
    cos(a)  0 sin(a);
    0       1 0;
    -sin(a) 0 cos(a)];

end

% Rotation around Z-axis.
function R = rotZ(a)
a = deg2rad(a);
R = [...
    cos(a) -sin(a) 0;
    sin(a)  cos(a) 0;
    0       0      1];
end

% Given the Yaw, Pitch, and Roll, determine the appropriate Euler angles
% and the sequence in which they are applied to align the camera's
% coordinate system with the vehicle coordinate system. The resulting
% matrix is a Rotation matrix that together with the Translation vector
% defines the extrinsic parameters of the camera.
function rotation = rotationMatrix(yaw, pitch, roll)
rotation = (...
    rotY(180)*...            % last rotation: point Z up
    rotZ(-90)*...            % X-Y swap
    rotZ(yaw)*...            % point the camera forward
    rotX(90-pitch)*...       % "un-pitch"
    rotZ(roll)...            % 1st rotation: "un-roll"
    );
end

% Intrinsic matrix computation.
function intrinsicMat = intrinsicMatrix(FocalLength, Skew, PrincipalPoint)
intrinsicMat = ...
    [FocalLength(1)  , 0                     , 0; ...
    Skew             , FocalLength(2)   , 0; ...
    PrincipalPoint(1), PrincipalPoint(2), 1];
end

Generate CUDA Executable

To generate a standalone CUDA executable for the detectLanesInVideo entry-point function, create a GPU code configuration object for 'exe' target and set the target language to C++. Use the coder.DeepLearningConfig function to create a CuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object.

cfg = coder.gpuConfig('exe');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.GenerateReport = true;
cfg.GenerateExampleMain = "GenerateCodeAndCompile";
cfg.TargetLang = 'C++';
inputs = {coder.Constant(videoFile),coder.Constant(dlLaneNetFile), ...
    coder.Constant(laneCoeffMeans),coder.Constant(laneCoeffsStds)};

Run the codegen command.

codegen -args inputs -config cfg detectLanesInVideo

Code generation successful: View report

Run the Executable

To run the executable, uncomment the following lines of code.

if ispc
    [status,cmdout] = system("detectLanesInVideo.exe");
else
    [status,cmdout] = system("./detectLanesInVideo");
end