idNeuralNetwork

Multilayer neural network mapping function for nonlinear ARX models and Hammerstein-Wiener models (requires Statistics and Machine Learning Toolbox or Deep Learning Toolbox)

Since R2023b

Description

An idNeuralNetwork object creates a neural network function and is a nonlinear mapping object for estimating nonlinear ARX models and Hammerstein-Wiener models. This mapping object lets you create neural networks using the regression networks of Statistics and Machine Learning Toolbox™ and the deep and shallow networks of Deep Learning Toolbox™.

Mathematically, idNeuralNetwork is a function that maps m inputs X(t) = [x1(t), x2(t), …, xm(t)]^T to a single scalar output y(t) using the following relationship:

y(t) = y0 + X^T(t) P L + S(X^T(t) Q)

Here:

  • X(t) is an m-by-1 vector of inputs, or regressors.

  • y0 is the output offset, a scalar.

  • P and Q are m-by-p and m-by-q projection matrices, respectively.

  • L is a p-by-1 vector of weights.

  • S(.) represents a neural network object of one of the following types:

    • RegressionNeuralNetwork (Statistics and Machine Learning Toolbox) object — Network object created using fitrnet (Statistics and Machine Learning Toolbox)

    • dlnetwork (Deep Learning Toolbox) object — Deep learning network object

    • network (Deep Learning Toolbox) object — Shallow network object created using a command such as feedforwardnet (Deep Learning Toolbox)

    Additionally, you can use a cascade-correlation neural network, a network type that is implemented with deep networks and that lets you create a network without specifying the network sizes. A minimal numeric sketch of the overall mapping follows this list.
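
For illustration, the following is a minimal numeric sketch of the mapping with hypothetical dimensions and values; in practice, the software determines P, L, Q, and the network parameters during estimation.

% Numeric sketch of the idNeuralNetwork mapping (hypothetical values)
m = 3; p = 2; q = 2;               % regressor and projection dimensions
X = [1; -0.5; 2];                  % m-by-1 regressor vector X(t)
y0 = 0.1;                          % scalar output offset
P = randn(m,p); L = randn(p,1);    % linear projection matrix and weight vector
Q = randn(m,q);                    % nonlinear projection matrix
S = @(z) sum(tanh(z));             % stand-in for the network function S(.)
y = y0 + X.'*P*L + S(X.'*Q)        % scalar output y(t)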

See Examples for more information.

Use idNeuralNetwork as the output value, or, for multiple-output systems, one of the output values in the OutputFcn property of an idnlarx model or the InputNonlinearity and OutputNonlinearity properties of an idnlhw object. For example, specify idNeuralNetwork when you estimate an idnlarx model with the following command.

sys = nlarx(data,regressors,idNeuralNetwork)
When nlarx estimates the model, it essentially estimates the parameters of the idNeuralNetwork function.

You can use a similar approach when you specify input or output nonlinearities using the nlhw command. For example, specify idNeuralNetwork as both the input and output nonlinearity with the following command.

sys = nlhw(data,orders,idNeuralNetwork,idNeuralNetwork)

Creation

Description

Create Regression Network or Deep Learning Network

NW = idNeuralNetwork creates an idNeuralNetwork object NW that uses a single hidden layer of ten rectified linear unit (ReLU) activations.

The specific type of network that NW represents depends on the toolboxes you have access to.

  • If you have access to Statistics and Machine Learning Toolbox, then idNeuralNetwork uses fitrnet (Statistics and Machine Learning Toolbox) to create a RegressionNeuralNetwork (Statistics and Machine Learning Toolbox)-based map.

  • If Statistics and Machine Learning Toolbox is not available but you have access to Deep Learning Toolbox, then idNeuralNetwork uses dlnetwork (Deep Learning Toolbox) to create a deep learning network map.

For idnlhw models, the number of inputs to the network is 1. For idnlarx models, the number of inputs is unknown, as this number is determined during estimation. NW also uses a parallel linear function and an offset element.

For multiple-output nonlinear ARX or Hammerstein-Wiener models, create a separate idNeuralNetwork object for each output. Each element of the output function must represent a single-output network object.
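
For example, the following hedged sketch creates one single-output network per output of a two-output nonlinear ARX model and passes the networks as an array; data2 and orders are hypothetical placeholders for a two-output data set and its order specification.

NW1 = idNeuralNetwork(5,"tanh");        % network for the first output
NW2 = idNeuralNetwork(8,"relu");        % network for the second output
sys = nlarx(data2,orders,[NW1; NW2]);   % one mapping object per output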


NW = idNeuralNetwork(LayerSizes) specifies the hidden layers using LayerSizes, a row vector of positive integers whose length equals the number of layers. The ith element of LayerSizes specifies the number of activations in the ith layer.


NW = idNeuralNetwork(LayerSizes,Activations) specifies the types of activation to use in each layer. The combination of the Activations specification and the available toolboxes determines which type of neural network NW uses.


NW = idNeuralNetwork(LayerSizes,Activations,UseLinearFcn) specifies whether NW uses a linear function as a subcomponent.


NW = idNeuralNetwork(LayerSizes,Activations,UseLinearFcn,UseOffset) specifies whether NW uses an offset term.


NW = idNeuralNetwork(___,Name=Value) creates a neural network object with properties specified by one or more name-value arguments.


Create Cascade-Correlation Neural Network

Creating cascade-correlation neural networks requires Deep Learning Toolbox software. You can use these cascade-correlation neural networks for estimating nonlinear ARX models but not Hammerstein-Wiener models.

NW = idNeuralNetwork("cascade-correlation") creates a cascade-correlation neural network where the network determines the number of layers during training. Each of these layers has one unit. All units are connected to all previous layers and inputs. For more information on these networks, see Cascade-Correlation Neural Networks.

NW = idNeuralNetwork("cascade-correlation",Activations) specifies the types of activation to use in each layer.


NW = idNeuralNetwork("cascade-correlation",Activations,UseLinearFcn) specifies whether NW uses a linear function as a subcomponent.

NW = idNeuralNetwork("cascade-correlation",Activations,UseLinearFcn,UseOffset) specifies whether NW uses an offset term.

NW = idNeuralNetwork(___,Name=Value) creates a cascade-correlation neural network object with properties specified by one or more name-value arguments.

Use Existing Shallow Neural Network

NW = idNeuralNetwork(shallownet) creates NW using the network (Deep Learning Toolbox) object shallownet.

shallownet is typically the output of feedforwardnet (Deep Learning Toolbox), cascadeforwardnet (Deep Learning Toolbox), or linearlayer (Deep Learning Toolbox).


NW = idNeuralNetwork(shallownet,[],UseLinearFcn) specifies whether NW uses a linear function as a subcomponent.


NW = idNeuralNetwork(shallownet,[],UseLinearFcn,UseOffset) specifies whether NW uses an offset term.


Input Arguments


LayerSizes — Number of network layers and activations in each layer, specified as a row vector of positive integers with length equal to the number of layers. Each integer in the vector indicates the number of activations in the corresponding layer. For instance, a value of [10 5 2] corresponds to a three-layer network, with ten activations in the first layer, five in the second, and two in the third.

Activations — Activation types to use in each layer, specified as a string array of length equal to the length of LayerSizes. The activation types can be divided into two groups.

  1. Four activation types are available in both Statistics and Machine Learning Toolbox and Deep Learning Toolbox. These activation types are "relu", "tanh", "sigmoid", and "none". Use "none" to specify a linear layer.

  2. The remaining activation types are available only in Deep Learning Toolbox, and consist of "leakyRelu", "clippedRelu", "elu", "gelu", "swish", "softplus", "scaling", and "softmax".

You can also specify hyperparameter values for "leakyRelu", "clippedRelu", "elu", and "scaling". For example:

  • "leakyRelu(0.2)" specifies a leaky ReLu activation layer with a scaling value of 0.2.

  • "clippedRelu(5)" specifies a clipped ReLu activation layer with a ceiling value of 5.

  • "elu(2)" specifies an ELU activation layer with the Alpha property equal to 2.

  • "scaling(0.2,4)" specifies a scaling activation layer with a scale of 0.2 and a bias of 4.

To apply the same activation type to all layers, specify Activations as a scalar string.
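
For example, the following specifications are illustrative; the activations in the first line belong to group 2 and therefore require Deep Learning Toolbox.

NW1 = idNeuralNetwork([10 5],["leakyRelu(0.2)","clippedRelu(5)"]);  % one type per layer
NW2 = idNeuralNetwork([10 5 2],"tanh");                             % same type in all layers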

The choice of Activations combined with the availability of Statistics and Machine Learning Toolbox and Deep Learning Toolbox determines which network to use, as the following table shows. In the toolbox availability columns, an entry of "—" indicates that the corresponding toolbox availability does not impact the network selection for that row.

Activations                              Statistics and ML Available    DL Available    Network Type
Group 1 only                             Yes                            —               RegressionNeuralNetwork from fitrnet (Statistics and Machine Learning Toolbox)
Group 1 only                             No                             Yes             Deep network from dlnetwork (Deep Learning Toolbox)
At least one activation from group 2     —                              Yes             Deep network from dlnetwork (Deep Learning Toolbox)

For more information about these activations, as well as additional toolbox requirements for using them, see the Activations properties in RegressionNeuralNetwork (Statistics and Machine Learning Toolbox) and lbfgsupdate (Deep Learning Toolbox), and the Activation Layers section in List of Deep Learning Layers (Deep Learning Toolbox).

shallownet — Shallow network, specified as a network (Deep Learning Toolbox) object.

shallownet is typically the output of feedforwardnet (Deep Learning Toolbox), cascadeforwardnet (Deep Learning Toolbox), or linearlayer (Deep Learning Toolbox).

This argument sets the value of the NW.Network property.

UseLinearFcn — Option to use the linear function subcomponent, specified as true or false. This argument sets the value of the NW.LinearFcn.Use property.

UseOffset — Option to use an offset term, specified as true or false. This argument sets the value of the NW.Offset.Use property.

Name-Value Arguments


Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Use name-value arguments to specify the object properties such as the network type, maximum number of activation layers, and size selection.

Example: NetworkType="dlnetwork"

NetworkType — Type of neural network, specified as one of the following:

  • "RegressionNeuralNetwork" — The network is a regression neural network from Statistics and Machine Learning Toolbox.

  • "dlnetwork" — The network is a deep network from Deep Learning Toolbox.

  • "auto" — The network is the default regression neural network.

The network specification using this name-value argument overrides the default automatic activation-based selection of network type described in Activations.

Dependencies

To specify NetworkType, you must specify LayerSizes as a row vector of positive integers.

MaxNumActLayers — Maximum number of activation layers in the cascade-correlation neural network, specified as a positive integer.

Dependencies

To specify MaxNumActLayers, you must specify "cascade-correlation" as the first input argument of idNeuralNetwork.

SizeSelection — Option to select the number of activation layers in the neural network, specified as "on" or "off".

If you specify "off", the cascade-correlation network automatically selects its size.

  • If you set the CrossValidate property of nlarxOptions to true, the cascade-correlation network chooses the number of activation layers, from the range 0 to MaxNumActLayers, that minimizes the cross-validation residual error.

  • If you set the CrossValidate property of nlarxOptions to false, the cascade-correlation network chooses the number of activation layers, from the range 0 to MaxNumActLayers, that minimizes the nAIC. For information on nAIC, see Model Quality Metrics.

If you specify "on", the Network Size Selection dialog box opens during training. It shows the training residual error versus the number of activation layers. If you specify the CrossValidate property of nlarxOptions as "true", the dialog box also shows the cross-validation residual error. You can select the number of activation layers in the dialog box. For a multi-output model, you can select one size for all outputs in the dialog box.

Dependencies

To specify SizeSelection, you must specify "cascade-correlation" as the first input argument of idNeuralNetwork.
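
For example, this sketch creates a cascade-correlation network and configures the size-selection behavior at creation; the specific values are illustrative.

NW = idNeuralNetwork("cascade-correlation","tanh", ...
    MaxNumActLayers=15,SizeSelection="on");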

Properties


Inputs — Input signal names for the inputs to the mapping object, specified as a 1-by-m cell array, where m is the number of input signals. This property is determined during estimation.

Outputs — Output signal name for the output of the mapping object, specified as a 1-by-1 cell array. This property is determined during estimation.

LinearFcn — Parameters of the linear function, specified as follows:

  • Use — Option to use the linear function in the mapping object, specified as a scalar logical. The default value is true.

  • Value — Linear weights that compose L', specified as a 1-by-p vector.

  • InputProjection — Input projection matrix P, specified as an m-by-p matrix, that transforms the detrended input vector of length m into a vector of length p. For Hammerstein-Wiener models, InputProjection is equal to 1.

  • Free — Option to update entries of Value during estimation, specified as a 1-by-p logical vector. The software honors the Free specification only if the starting value of Value is finite. The default value is true.

Offset — Parameters of the offset term, specified as follows:

  • Use — Option to use the offset in the mapping object, specified as a scalar logical. The default value is true.

  • Value — Offset value, specified as a scalar.

  • Free — Option to update Value during estimation, specified as a scalar logical. The software honors the Free specification of false only if the value of Value is finite. The default value is true.

Network — Parameters of the idNeuralNetwork network function, contained in the fields Parameters, Inputs, and Outputs.

Parameters contains the learnable parameters and the initial hyperparameter values used by the network:

  • Learnables — Vector of tunable parameters that represent the weights and biases for the network. For each tunable item, you can set the initial value and specify whether the value is fixed or free during training.

  • InputProjection — Parameters of the input projection matrix Q, determined during training as an m-by-q matrix. Q transforms the detrended input vector (X − X̄) of length m into a vector of length q. Typically, Q has the same dimensions as the linear projection matrix P. In this case, q is equal to p, which is the number of linear weights.

    For Hammerstein-Wiener models, InputProjection is equal to 1.

EstimationOptions — Estimation options for the idNeuralNetwork model, specified as a structure with the fields Solver and SolverOptions. The set of estimation options for the model depends on what type of network idNeuralNetwork represents. The following sections each list the estimation options for one network type.

To specify an estimation option, use dot notation after creating NW. For example, to reduce the iteration limit for a regression neural network to 500, use NW.EstimationOptions.SolverOptions.IterationLimit = 500.
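
For instance, the following lines create a default network map and reduce the iteration limit.

NW = idNeuralNetwork;                                     % default regression network map
NW.EstimationOptions.SolverOptions.IterationLimit = 500;  % reduce the iteration limit to 500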

Regression Neural Network

When NW uses a regression neural network, the solver is fixed to "LBFGS". For more information on the solver, see the Algorithms section in trainingOptions (Deep Learning Toolbox).

Solver: LBFGS (fixed). Solver options:

  • IterationLimit — Maximum number of iterations to use for training. Positive integer (default 1000).

  • GradientTolerance — Relative gradient tolerance. The software stops training when the relative gradient is less than or equal to GradientTolerance. 0 or positive number (default 1e-6).

  • LossTolerance — Loss tolerance. The software stops training when the function loss at some iteration is less than or equal to LossTolerance. 0 or positive number (default 1e-6).

  • StepTolerance — Step size tolerance. The software stops training when the step that the algorithm takes is less than or equal to StepTolerance. 0 or positive number (default 1e-6).

  • Lambda — Constant coefficient applied to the regularization term added to the loss function. For more information, see Regularized Estimates of Model Parameters. Nonnegative scalar (default 0).

  • LayerWeightsInitializer — Weight value initialization scheme. For more information, see LayerWeightsInitializer (Statistics and Machine Learning Toolbox). Either 'glorot' (default) or 'he'.

  • LayerBiasesInitializer — Bias value initialization scheme. For more information, see LayerBiasesInitializer (Statistics and Machine Learning Toolbox). Either 'zeros' or 'ones' (default).

Deep Learning Network

When NW uses a deep learning network (dlnetwork (Deep Learning Toolbox)), the solver choices are LBFGS (default), SGDM, ADAM, and RMSProp. The following lists show the solver options for each solver type. For more information on the solvers, see the Algorithms section in trainingOptions (Deep Learning Toolbox).

Solver: LBFGS (default). Solver options:

  • LineSearchMethod — Method to find a suitable learning rate, specified as "weak-wolfe" (default), "strong-wolfe", or "backtracking":

    • "weak-wolfe" — Search for a learning rate that satisfies the weak Wolfe conditions. This method maintains a positive definite approximation of the inverse Hessian matrix.

    • "strong-wolfe" — Search for a learning rate that satisfies the strong Wolfe conditions. This method maintains a positive definite approximation of the inverse Hessian matrix.

    • "backtracking" — Search for a learning rate that satisfies sufficient decrease conditions. This method does not maintain a positive definite approximation of the inverse Hessian matrix.

  • MaxNumLineSearchIterations — Maximum number of line search iterations to determine the learning rate. Positive, finite integer (default 20).

  • HistorySize — Number of state updates to store. Values between 3 and 20 suit most tasks. For more information, see Limited-Memory BFGS (Deep Learning Toolbox). Positive integer (default 10).

  • InitialInverseHessianFactor — Initial value that characterizes the approximate inverse Hessian matrix. For more information, see Limited-Memory BFGS (Deep Learning Toolbox). Positive, finite, real number (default 1).

  • MaxIterations — Maximum number of iterations to use for training. Positive, finite integer (default 100).

  • GradientTolerance — Relative gradient tolerance. The software stops training when the relative gradient is less than or equal to GradientTolerance. 0 or positive number (default 1e-6).

  • StepTolerance — Step size tolerance. The software stops training when the step that the algorithm takes is less than or equal to StepTolerance. 0 or positive number (default 1e-6).

  • Lambda — Constant coefficient applied to the regularization term added to the loss function. For more information, see Regularized Estimates of Model Parameters. Nonnegative scalar (default 0).

  • LayerWeightsInitializer — Weight value initialization scheme. For more information, see LayerWeightsInitializer (Statistics and Machine Learning Toolbox). Either 'glorot' (default) or 'he'.

  • LayerBiasesInitializer — Bias value initialization scheme. For more information, see LayerBiasesInitializer (Statistics and Machine Learning Toolbox). Either 'zeros' or 'ones' (default).

Solver: SGDM. Solver options:

  • LearnRate — Learning rate used for training. If the learning rate is too small, then training can take a long time. If the learning rate is too large, then training might reach a suboptimal result or diverge. Positive scalar (default 0.01).

  • Momentum — Contribution of the parameter update step of the previous iteration to the current iteration of stochastic gradient descent with momentum. For more information, see Stochastic Gradient Descent with Momentum (Deep Learning Toolbox). Positive scalar less than or equal to 1 (default 0.95).

  • Lambda — Constant coefficient applied to the regularization term added to the loss function. For more information, see Regularized Estimates of Model Parameters. Nonnegative scalar (default 0).

  • MaxEpochs — Maximum number of epochs to use for training. An epoch is the full pass of the training algorithm over the entire training set. Positive integer (default 100).

  • MiniBatchSize — Size of the mini-batch to use for each training iteration. A mini-batch is a subset of the training set that is used to evaluate the gradient of the loss function and update the weights. Positive integer (default 1000).

  • LayerWeightsInitializer — Weight value initialization scheme. For more information, see LayerWeightsInitializer (Statistics and Machine Learning Toolbox). Either 'glorot' (default) or 'he'.

  • LayerBiasesInitializer — Bias value initialization scheme. For more information, see LayerBiasesInitializer (Statistics and Machine Learning Toolbox). Either 'zeros' or 'ones' (default).

Solver: ADAM. Solver options:

  • LearnRate — Learning rate used for training. If the learning rate is too small, then training can take a long time. If the learning rate is too large, then training might reach a suboptimal result or diverge. Positive scalar (default 0.001).

  • GradientDecayFactor — Decay rate of gradient moving average for the Adam solver. For more information, see Adaptive Moment Estimation (Deep Learning Toolbox). Positive scalar less than 1 (default 0.9).

  • SquaredGradientDecayFactor — Decay rate of squared gradient moving average for the Adam solver. Typical values of the decay rate are 0.9, 0.99, and 0.999, corresponding to averaging lengths of 10, 100, and 1000 parameter updates, respectively. For more information, see Adaptive Moment Estimation (Deep Learning Toolbox). Positive scalar less than 1 (default 0.999).

  • Lambda — Constant coefficient applied to the regularization term added to the loss function. For more information, see Regularized Estimates of Model Parameters. Nonnegative scalar (default 0).

  • MaxEpochs — Maximum number of epochs to use for training. An epoch is the full pass of the training algorithm over the entire training set. Positive integer (default 100).

  • MiniBatchSize — Size of the mini-batch to use for each training iteration. A mini-batch is a subset of the training set that is used to evaluate the gradient of the loss function and update the weights. Positive integer (default 1000).

  • LayerWeightsInitializer — Weight value initialization scheme. For more information, see LayerWeightsInitializer (Statistics and Machine Learning Toolbox). Either 'glorot' (default) or 'he'.

  • LayerBiasesInitializer — Bias value initialization scheme. For more information, see LayerBiasesInitializer (Statistics and Machine Learning Toolbox). Either 'zeros' or 'ones' (default).

Solver: RMSProp. Solver options:

  • LearnRate — Learning rate used for training. If the learning rate is too small, then training can take a long time. If the learning rate is too large, then training might reach a suboptimal result or diverge. Positive scalar (default 0.001).

  • SquaredGradientDecayFactor — Decay rate of squared gradient moving average for the RMSProp solver. Typical values of the decay rate are 0.9, 0.99, and 0.999, corresponding to averaging lengths of 10, 100, and 1000 parameter updates, respectively. For more information, see Root Mean Square Propagation (Deep Learning Toolbox). Positive scalar less than 1 (default 0.9).

  • Lambda — Constant coefficient applied to the regularization term added to the loss function. For more information, see Regularized Estimates of Model Parameters. Nonnegative scalar (default 0).

  • MaxEpochs — Maximum number of epochs to use for training. An epoch is the full pass of the training algorithm over the entire training set. Positive integer (default 100).

  • MiniBatchSize — Size of the mini-batch to use for each training iteration. A mini-batch is a subset of the training set that is used to evaluate the gradient of the loss function and update the weights. Positive integer (default 1000).

  • LayerWeightsInitializer — Weight value initialization scheme. For more information, see LayerWeightsInitializer (Statistics and Machine Learning Toolbox). Either 'glorot' (default) or 'he'.

  • LayerBiasesInitializer — Bias value initialization scheme. For more information, see LayerBiasesInitializer (Statistics and Machine Learning Toolbox). Either 'zeros' or 'ones' (default).
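
As a hedged sketch, assuming the Solver field accepts the solver names listed above, you can switch solvers and then adjust the corresponding options:

NW = idNeuralNetwork(10,"swish");                      % swish forces a deep learning network map
NW.EstimationOptions.Solver = "ADAM";                  % assumed solver name; see the choices above
NW.EstimationOptions.SolverOptions.LearnRate = 0.005;  % ADAM-specific option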

Cascade-Correlation Neural Network

When NW uses a cascade-correlation neural network, the available solvers and their options are the same as those for a deep learning network. Additionally, the EstimationOptions structure contains two more fields:

  • MaxNumActLayers — The maximum number of activation layers in the cascade-correlation neural network, specified as a positive integer.

  • SizeSelection — Option to select the number of activation layers in the neural network, specified as either "on" or "off".

    If you specify "off", the cascade-correlation network automatically selects its size. If you specify "on", the Network Size Selection dialog box opens during training. It shows the training residual error versus the number of activation layers. If you specify the CrossValidate property of nlarxOptions as "true", the dialog box also shows the cross-validation residual error. You can select the number of activation layers in the dialog box. For a multi-output model, you can select one size for all outputs in the dialog box.

Shallow Network

When NW uses an existing shallow neural network, the solvers are equivalent to the training functions in Deep Learning Toolbox. The corresponding options are the same for all training function choices. The default solver is trainlm and the options for this solver are described below. For information on available shallow network training functions and their associated algorithms, see Train and Apply Multilayer Shallow Neural Networks (Deep Learning Toolbox).

Solver: any shallow network training function (default trainlm). Solver options:

  • showWindow — Show training GUI. Boolean scalar (default 1).

  • showCommandLine — Generate command-line output. Boolean scalar (default 0).

  • show — Epochs between displays (NaN for no displays). Positive integer (default 25).

  • epochs — Maximum number of epochs to train. Positive integer (default 1000).

  • time — Maximum time to train in seconds. Positive scalar (default Inf).

  • goal — Performance goal. Nonnegative scalar (default 0).

  • min_grad — Minimum performance gradient. Positive scalar (default 1e-07).

  • max_fail — Maximum validation failures. Positive integer (default 6).

  • mu — Initial mu. Positive scalar (default 0.001).

  • mu_dec — Decrease factor for mu. Positive number less than or equal to 1 (default 0.1).

  • mu_inc — Increase factor for mu. Number greater than 1 (default 10).

  • mu_max — Maximum value for mu. Positive scalar (default 1e+10).
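
A hedged sketch, assuming the SolverOptions fields mirror the training-function options listed above:

snet = feedforwardnet(10);                          % shallow network with one hidden layer
NW = idNeuralNetwork(snet);
NW.EstimationOptions.SolverOptions.epochs = 500;    % cap the number of training epochs
NW.EstimationOptions.SolverOptions.showWindow = 0;  % suppress the training GUI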

Examples


Create an idNeuralNetwork object with default properties.

NW = idNeuralNetwork
Constructing a RegressionNeuralNetwork object from the Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Uninitialized regression neural network
         Contains 1 hidden layers using "relu" activations.
         (uses Statistics and Machine Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: 'Regression neural network parameters'
            LinearFcn: 'Linear function parameters'
               Offset: 'Offset parameters'
    EstimationOptions: [1×1 struct]

NW is a regression neural network with a single layer of relu activations.

Specify a network that uses two hidden layers of sizes 5 and 3, respectively. Specify that both layers use tanh activations.

NW = idNeuralNetwork([5 3],"tanh");
Constructing a RegressionNeuralNetwork object from the Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.
disp(NW.Network)
Regression neural network parameters

    Parameters: 'Learnables and hyperparameters'
        Inputs: {1×0 cell}
       Outputs: {1×0 cell}

This example assumes that you have access to Statistics and Machine Learning Toolbox, but it also runs with only Deep Learning Toolbox. If you have access to both toolboxes, then NW is a regression network. If you have access to only Deep Learning Toolbox, then NW is a deep network.

Create a network that contains three hidden layers. The first layer uses ten relu activations, the second layer uses five tanh activations, and the third layer uses two swish activations.

NW = idNeuralNetwork([10 5 2],["relu", "tanh", "swish"])
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Deep learning network
         Contains 3 hidden layers using "relu", "tanh", "swish" activations.
         (uses Deep Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: 'Deep learning network parameters'
            LinearFcn: 'Linear function parameters'
               Offset: 'Offset parameters'
    EstimationOptions: [1×1 struct]

The swish activation requires Deep Learning Toolbox. Therefore, NW is a deep network whether or not you also have Statistics and Machine Learning Toolbox.

Create an idNeuralNetwork object that has no linear function or offset.

UseLinear = false;
UseOffset = false;
NW = idNeuralNetwork(5,"relu",UseLinear,UseOffset);
Constructing a RegressionNeuralNetwork object from the Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.
disp(NW.LinearFcn)
Linear Function: not in use
              Value: [1×0 double]
               Free: [1×0 logical]
                Use: 0
             Inputs: {1×0 cell}
            Outputs: {1×0 cell}
    InputProjection: []
disp(NW.Offset)
Output Offset: not in use
      Use: 0
    Value: NaN
     Free: 1

NW does not use the linear function or offset.

Create a network function with default settings, but enforce that the function be based on the deep network architecture.

NW = idNeuralNetwork(NetworkType="dlnetwork")
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Deep learning network
         Contains 1 hidden layers using "relu" activations.
         (uses Deep Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: 'Deep learning network parameters'
            LinearFcn: 'Linear function parameters'
               Offset: 'Offset parameters'
    EstimationOptions: [1×1 struct]

The network specification overrides the default selection of a regression network.

Create a cascade-correlation neural network using idNeuralNetwork. Specify the activation function to be "tanh" for all layers.

NW = idNeuralNetwork("cascade-correlation","tanh")
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Deep learning network
         Cascade Correlation network using "tanh" activations.
         (uses Deep Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: 'Deep learning network parameters'
            LinearFcn: 'Linear function parameters'
               Offset: 'Offset parameters'
    EstimationOptions: [1×1 struct]

disp(NW.EstimationOptions.MaxNumActLayers)
    20

Change the maximum number of activation layers in the network to 10.

NW.EstimationOptions.MaxNumActLayers = 10;

Construct an idNeuralNetwork object that uses a shallow network from Deep Learning Toolbox.

Create a feedforward (shallow) network that uses three hidden layers with four, six, and one neurons, respectively.

snet = feedforwardnet([4 6 1]);

Specify the transfer functions for the hidden layers. The output layer uses the default transfer function 'purelin'.

snet.layers{1}.transferFcn = 'logsig';
snet.layers{2}.transferFcn = 'radbas';
snet.layers{3}.transferFcn = 'purelin';

Incorporate snet into the idNeuralNetwork object NW.

NW = idNeuralNetwork(snet)
NW = 
Multi-Layer Neural Network

 Nonlinear Function: Uninitialized shallow network
         (uses Deep Learning Toolbox)
 Linear Function: uninitialized
 Output Offset: uninitialized

              Network: 'Shallow network parameters'
            LinearFcn: 'Linear function parameters'
               Offset: 'Offset parameters'
    EstimationOptions: [1×1 struct]

Identify a nonlinear ARX model that uses a regression neural network to describe the regressor-to-output mapping.

Load the data, which consists of the column vectors u and y. Convert the data into a timetable tt with a sample time of 0.8 minutes.

load twotankdata u y
tt = timetable(y,u,TimeStep=minutes(0.8));

Split tt into estimation (training) and validation data sets tte and ttv.

tte = tt(1:1000,:);
ttv = tt(1001:2000,:);

Specify estimation and search options.

opt = nlarxOptions(Focus="simulation",Display="on",SearchMethod="fmincon");
opt.SearchOptions.MaxIterations = 10;

Create a regression network NW that uses two hidden layers with five activations each. Use sigmoid for the activations in the first layer and tanh for activations in the second layer.

NW = idNeuralNetwork([5 5],["sigmoid","tanh"]);
Constructing a RegressionNeuralNetwork object from Statistics and Machine Learning Toolbox... 
If you want to use a deep network representation, specify NetworkType="dlnetwork" in the idNeuralNetwork constructor.

Estimate a nonlinear ARX model that uses NW as the output function and uses y as the output variable.

sys = nlarx(tte,[2 2 1],NW,opt,OutputName="y");

Using the validation data, compare the measured output with the model output.

compare(ttv,sys)

Figure: compare plot of the validation data (y) and the model response, showing a fit of 88.86% for sys.

The nonlinear model shows a good fit to the validation data.

Identify a nonlinear ARX model that uses a cascade-forward shallow network.

Load the data, which consists of the input and output arrays u and y, respectively. Convert the data into the timetable tte with a sample time of 0.8 minutes.

load twotankdata
tte = timetable(u,y,TimeStep=minutes(0.8));

Create a cascade-forward shallow network with a single hidden layer.

cnet = cascadeforwardnet(20);

Construct an idNeuralNetwork object NW that incorporates cnet and excludes the linear and offset elements.

NW = idNeuralNetwork(cnet,[],false,false); 

Specify estimation and search options.

opt = nlarxOptions(SearchMethod="gna");
opt.SearchOptions.MaxIterations = 2;

Estimate the nlarx model sys and compare the model output with the measured data.

sys = nlarx(tte,[2 2 1],NW,opt);

Figure: Neural Network Training progress window.

compare(tte,sys)

Figure: compare plot of the measured data (y) and the model response, showing a fit of 87.61% for sys.

Algorithms

The learnable parameters of the idNeuralNetwork function are determined during estimation of nonlinear ARX and Hammerstein-Wiener models by the nlarx and nlhw commands, respectively.

The software initializes these parameters using the following steps:

  1. Determine the linear function coefficients L and the offset y0, if in use and free, by performing a least-squares fit to the data.

  2. Initialize the learnable parameters of the network function by fitting the residuals of the linear and offset terms from step 1. The initialization scheme depends on the type of the underlying network:

    • For RegressionNeuralNetwork (Statistics and Machine Learning Toolbox) networks, use fitrnet (Statistics and Machine Learning Toolbox).

    • For dlnetwork (Deep Learning Toolbox) networks, perform initialization by training the network using the solver specified in NW.EstimationOptions. For cascade-correlation neural networks, perform initialization by training a network using the options specified in NW.EstimationOptions and nlarxOptions.

    • For network (Deep Learning Toolbox) networks, perform initialization by training the network using the solver specified in NW.EstimationOptions.

After initialization, the software updates the parameters using a nonlinear least-squares optimization solver (see SearchMethod in nlarxOptions and SearchOptions in nlhwOptions) to minimize the chosen objective, as the following objective summaries describe:

  • For nonlinear ARX models, the objective is either prediction-error minimization or simulation-error minimization, depending on whether the Focus option in nlarxOptions is "prediction" or "simulation".

  • For Hammerstein-Wiener models, the objective is simulation-error-norm minimization.

See nlarxOptions and nlhwOptions for more information on how to configure the objective and search method.
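
For instance, the following option sets are illustrative:

optARX = nlarxOptions(Focus="simulation");  % simulation-error minimization for nlarx
optHW = nlhwOptions;                        % simulation-error-norm objective for nlhw
optHW.SearchOptions.MaxIterations = 20;     % limit the iterative search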

Version History

Introduced in R2023b
