Code Generation for a Sequence-to-Sequence LSTM Network

This example shows how to generate CUDA® code for a long short-term memory (LSTM) network. The example generates a MEX application that makes predictions at each step of an input timeseries. This example uses accelerometer sensor data from a smartphone carried on the body and makes predictions on the activity of the wearer. User movements are classified into one of five categories, namely dancing, running, sitting, standing, and walking. The example uses a pretrained LSTM network. For more information on training, see the Sequence Classification Using Deep Learning example from Deep Learning Toolbox™.


Prerequisites

  • CUDA-enabled NVIDIA® GPU with compute capability 3.5 or higher.

  • NVIDIA CUDA toolkit and driver.

  • NVIDIA cuDNN library.

  • Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products (GPU Coder). For setting up the environment variables, see Setting Up the Prerequisite Products (GPU Coder).

  • GPU Coder Interface for Deep Learning Libraries support package. To install this support package, use the Add-On Explorer.

Verify GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
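
The check can also return its results instead of only printing them. The following is a minimal sketch, assuming the returned structure exposes a deepcodegen field (as in recent GPU Coder releases), that stops early when the deep learning code generation check does not pass. The error identifier and message are illustrative, not part of the original example.

```matlab
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;

% Capture the check results in a structure instead of discarding them.
results = coder.checkGpuInstall(envCfg);

% Assumption: the structure has a deepcodegen field that is nonzero on
% success. A missing field is treated as a failure.
if ~isfield(results,'deepcodegen') || ~results.deepcodegen
    error('lstmExample:gpuEnv', ...
        'Deep learning code generation check did not pass.');
end
```

Failing early here avoids a longer and less specific error from the codegen command later on.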

The lstmnet_predict Entry-Point Function

A sequence-to-sequence LSTM network enables you to make different predictions for each individual time step of a data sequence. The lstmnet_predict.m entry-point function takes an input sequence and passes it to a trained LSTM network for prediction. Specifically, the function uses the LSTM network trained in the Sequence to Sequence Classification Using Deep Learning example. The function loads the network object from the lstmnet.mat file into a persistent variable and reuses the persistent object on subsequent prediction calls.

To display an interactive visualization of the network architecture and information about the network layers, use the analyzeNetwork function.

function out = lstmnet_predict(in) %#codegen

% Copyright 2019 The MathWorks, Inc. 

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('lstmnet.mat');
end

% pass in input   
out = predict(mynet,in); 

Generate CUDA MEX

To generate CUDA MEX for the lstmnet_predict.m entry-point function, create a GPU configuration object and specify the target to be MEX. Set the target language to C++. Create a deep learning configuration object that specifies the target library as cuDNN. Attach this deep learning configuration object to the GPU configuration object.

cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');

At compile time, GPU Coder™ must know the data types of all the inputs to the entry-point function. Specify the type and size of the input argument to the codegen command by using the coder.typeof function. For this example, the input is of double data type with a feature dimension value of three and a variable sequence length. Specifying the sequence length as variable-sized enables us to perform prediction on an input sequence of any length.

matrixInput = coder.typeof(double(0),[3 Inf],[false true]);
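
To confirm what this type specification encodes, you can inspect the properties of the returned type object. This is a small sketch, assuming MATLAB Coder is installed. It uses only the ClassName, SizeVector, and VariableDims properties of the type object.

```matlab
% Same type specification as in the example.
matrixInput = coder.typeof(double(0),[3 Inf],[false true]);

% Inspect the properties of the resulting type object.
disp(matrixInput.ClassName)     % class of the input
disp(matrixInput.SizeVector)    % upper bound for each dimension
disp(matrixInput.VariableDims)  % which dimensions are variable-size
```

The first dimension (features) is fixed at three, and the second dimension (sequence length) is unbounded and variable-size.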

Run the codegen command.

codegen -config cfg lstmnet_predict -args {matrixInput} -report
Code generation successful: To view the report, open('codegen/mex/lstmnet_predict/html/report.mldatx').

Run Generated MEX on Test Data

Load the HumanActivityValidate MAT-file. This MAT-file stores the variable XValidate that contains sample timeseries of sensor readings on which you can test the generated code. Call lstmnet_predict_mex on the first observation.

load HumanActivityValidate
YPred1 = lstmnet_predict_mex(XValidate{1});

YPred1 is a 5-by-53888 numeric matrix containing the probabilities of the five classes for each of the 53888 time steps. For each time step, find the predicted class by calculating the index of the maximum probability.

[~, maxIndex] = max(YPred1, [], 1);

Associate the indices of max probability to the corresponding label. Display the first ten labels. From the results, you can see that the network predicted the human to be sitting for the first ten time steps.

labels = categorical({'Dancing', 'Running', 'Sitting', 'Standing', 'Walking'});
predictedLabels = labels(maxIndex);
disp(predictedLabels(1:10))
  Columns 1 through 6

     Sitting      Sitting      Sitting      Sitting      Sitting      Sitting 

  Columns 7 through 10

     Sitting      Sitting      Sitting      Sitting 
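
The index-to-label mapping can be seen in isolation on a toy score matrix. The values in P below are made up for illustration and do not come from the network. Only the label order matches the example.

```matlab
% Toy 5-by-3 score matrix: one row per class, one column per time step.
P = [0.10 0.05 0.70
     0.10 0.05 0.10
     0.60 0.10 0.10
     0.10 0.70 0.05
     0.10 0.10 0.05];

% Index of the largest score in each column (that is, per time step).
[~, idx] = max(P, [], 1);

% Same label order as in the example.
labels = categorical({'Dancing', 'Running', 'Sitting', 'Standing', 'Walking'});
toyLabels = labels(idx);
disp(toyLabels)
```

Here idx is [3 4 1], so the three time steps map to Sitting, Standing, and Dancing.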

Compare Predictions with Test Data

Use a plot to compare the MEX output data with the test data.

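figure
plot(predictedLabels,'.-');
hold on
% Assumption: HumanActivityValidate also stores the ground-truth labels
% in a variable named YValidate.
plot(YValidate{1});
hold off

xlabel("Time Step")
ylabel("Activity")
title("Predicted Activities")
legend(["Predicted" "Test Data"])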

Call Generated MEX on an Observation with Different Sequence Length

Call lstmnet_predict_mex on the second observation, which has a different sequence length. In this example, XValidate{2} has a sequence length of 64480, whereas XValidate{1} has a sequence length of 53888. The generated code handles prediction correctly because we specified the sequence length dimension to be variable-size.

YPred2 = lstmnet_predict_mex(XValidate{2});
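
A quick sanity check for variable-size handling is to confirm that the MEX function returns one column of predictions per input time step. The sketch below assumes that lstmnet_predict_mex has already been generated for a single-observation input and that HumanActivityValidate is on the path.

```matlab
load HumanActivityValidate

for k = 1:2
    x = XValidate{k};
    y = lstmnet_predict_mex(x);

    % Expect one column of class scores per input time step.
    assert(size(y,2) == size(x,2));
    fprintf('Observation %d: %d time steps in, %d predictions out\n', ...
        k, size(x,2), size(y,2));
end
```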

Generate MEX That Takes in Multiple Observations

If you want to perform prediction on many observations at once, you can group the observations together in a cell array and pass the cell array for prediction. The cell array must be a column cell array, and each cell must contain one observation. Each observation must have the same feature dimension, but the sequence lengths may vary. In this example, XValidate contains five observations. To generate a MEX that can take XValidate as input, specify the input type to be a 5-by-1 cell array. Further, specify that each cell be of the same type as matrixInput, the type you specified for the single observation in the previous codegen command.

matrixInput = coder.typeof(double(0),[3 Inf],[false true]);
cellInput = coder.typeof({matrixInput}, [5 1]);

codegen -config cfg lstmnet_predict -args {cellInput} -report
Code generation successful: To view the report, open('codegen/mex/lstmnet_predict/html/report.mldatx').

YPred3 = lstmnet_predict_mex(XValidate);

The output is a 5-by-1 cell array of predictions for the five observations passed in.

disp(YPred3)
    [5×53888 single]
    [5×64480 single]
    [5×53696 single]
    [5×56416 single]
    [5×50688 single]
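
To turn a cell array of scores like this into labels, the per-time-step maximum can be applied to each cell in turn. The sketch below uses a small random stand-in (YPredDemo, a hypothetical name) so that it runs without the MEX function. Substitute YPred3 to process the real output.

```matlab
% Stand-in for YPred3: random scores with the same 5-row layout.
YPredDemo = {rand(5,4,'single'); rand(5,6,'single')};

% Same label order as in the example.
labels = categorical({'Dancing', 'Running', 'Sitting', 'Standing', 'Walking'});

% Map each observation's scores to one label per time step.
predicted = cell(size(YPredDemo));
for k = 1:numel(YPredDemo)
    [~, idx] = max(YPredDemo{k}, [], 1);
    predicted{k} = labels(idx);
end

% Number of predicted labels per observation.
disp(cellfun(@numel, predicted))
```

Each cell of predicted holds a row vector of labels whose length equals the sequence length of the corresponding observation.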