dlgradient
Compute gradients for custom training loops using automatic differentiation
Syntax
Description
The dlgradient function computes derivatives using automatic differentiation.
Tip
For most deep learning tasks, you can use a pretrained neural network and adapt it to your own data. For an example showing how to use transfer learning to retrain a convolutional neural network to classify a new set of images, see Retrain Neural Network to Classify New Images. Alternatively, you can create and train neural networks from scratch using the trainnet and trainingOptions functions.
If the trainingOptions function does not provide the training options that you need for your task, then you can create a custom training loop using automatic differentiation. To learn more, see Train Network Using Custom Training Loop.
If the trainnet function does not provide the loss function that you need for your task, then you can specify a custom loss function to the trainnet function as a function handle. For loss functions that require more inputs than the predictions and targets (for example, loss functions that require access to the neural network or additional inputs), train the model using a custom training loop. To learn more, see Train Network Using Custom Training Loop.
If Deep Learning Toolbox™ does not provide the layers you need for your task, then you can create a custom layer. To learn more, see Define Custom Deep Learning Layers. For models that cannot be specified as networks of layers, you can define the model as a function. To learn more, see Train Network Using Model Function.
For more information about which training method to use for which task, see Train Deep Learning Model in MATLAB.
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk) returns the gradients of y with respect to the variables x1 through xk.
Call dlgradient from inside a function passed to dlfeval. See Compute Gradient Using Automatic Differentiation and Use Automatic Differentiation In Deep Learning Toolbox.
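As a minimal sketch of this calling pattern, the script below evaluates Rosenbrock's function and its gradient at a single point. The function name rosenbrock and the evaluation point are chosen for illustration only. In a script file, the local function is placed at the end of the file.

```matlab
% Evaluate a function and its gradient at x0 by passing a function
% handle to dlfeval. The input must be a dlarray so it can be traced.
x0 = dlarray([-1, 2]);
[fval, gradval] = dlfeval(@rosenbrock, x0);

function [y, dydx] = rosenbrock(x)
    % Rosenbrock's function. y is a scalar dlarray.
    y = 100*(x(2) - x(1).^2).^2 + (1 - x(1)).^2;
    % dlgradient is valid here because this function runs inside dlfeval.
    dydx = dlgradient(y, x);
end
```

The outputs fval and gradval are dlarray objects. Use extractdata to convert them to ordinary numeric arrays.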
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk,Name,Value) returns the gradients and specifies additional options using one or more name-value pairs. For example, dydx = dlgradient(y,x,'RetainData',true) causes the gradient to retain intermediate values for reuse in subsequent dlgradient calls. This syntax can save time, but uses more memory. For more information, see Tips.
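The following sketch illustrates one way the 'RetainData' option might be used: two scalar objectives share an intermediate result, and the first dlgradient call keeps the trace so that the second call can reuse it. The function name twoGradients and the objectives are hypothetical, chosen only to keep the example small.

```matlab
x = dlarray([1, 2, 3]);
[g1, g2] = dlfeval(@twoGradients, x);

function [g1, g2] = twoGradients(x)
    % Intermediate value shared by both objectives.
    h = x.^2;
    y1 = sum(h, 'all');       % y1 = sum(x.^2)
    y2 = sum(h.^2, 'all');    % y2 = sum(x.^4)
    % Retain the trace so the next dlgradient call can reuse it.
    g1 = dlgradient(y1, x, 'RetainData', true);   % equals 2*x
    g2 = dlgradient(y2, x);                        % equals 4*x.^3
end
```

The retained trace is released when the dlfeval call ends, so the extra memory cost applies only for the duration of that call.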
Examples
Input Arguments
Name-Value Arguments
Output Arguments
Limitations
- The dlgradient function does not support calculating higher-order derivatives when using dlnetwork objects containing custom layers with a custom backward function.
- The dlgradient function does not support calculating higher-order derivatives when using dlnetwork objects containing the following layers:
  - gruLayer
  - lstmLayer
  - bilstmLayer
- The dlgradient function does not support calculating higher-order derivatives that depend on the following functions:
  - gru
  - lstm
  - embed
  - prod
  - interp1
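For contrast with the unsupported cases above, here is a sketch of a second-derivative computation that uses only elementwise operations and sum. The function name secondDerivative is hypothetical. Because the objective is a sum of independent terms, differentiating the summed first derivative gives the elementwise second derivative.

```matlab
x = dlarray([1, 2, 3]);
[dydx, d2ydx2] = dlfeval(@secondDerivative, x);

function [dydx, d2ydx2] = secondDerivative(x)
    y = sum(x.^3, 'all');
    % Keep the first derivative traced so it can be differentiated again.
    dydx = dlgradient(y, x, 'EnableHigherDerivatives', true);   % 3*x.^2
    % Differentiate the sum of the first derivative with respect to x.
    d2ydx2 = dlgradient(sum(dydx, 'all'), x);                    % 6*x
end
```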
 
More About
Tips
- A dlgradient call must be inside a function. To obtain a numeric value of a gradient, you must evaluate the function using dlfeval, and the argument to the function must be a dlarray. See Use Automatic Differentiation In Deep Learning Toolbox.
- To enable the correct evaluation of gradients, the y argument must use only supported functions for dlarray. See List of Functions with dlarray Support.
- If you set the 'RetainData' name-value pair argument to true, the software preserves tracing for the duration of the dlfeval function call instead of erasing the trace immediately after the derivative computation. This preservation can cause a subsequent dlgradient call within the same dlfeval call to be executed faster, but uses more memory. For example, in training an adversarial network, the 'RetainData' setting is useful because the two networks share data and functions during training. See Train Generative Adversarial Network (GAN).
- When you need to calculate first-order derivatives only, ensure that the 'EnableHigherDerivatives' option is false, as this is usually quicker and requires less memory.
- Complex gradients are calculated using the Wirtinger derivative. The gradient is defined in the direction of increase of the real part of the function to differentiate. This is because the variable to differentiate (for example, the loss) must be real, even if the function is complex.
- To speed up calls to deep learning functions, such as model functions and model loss functions, you can use the dlaccelerate function. The function returns an AcceleratedFunction object that automatically optimizes, caches, and reuses the traces.
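The last tip can be sketched as follows, assuming a small linear least-squares model. The function name modelLoss, the data, and the step size are hypothetical and serve only to show where dlaccelerate fits in a loop that repeatedly calls dlfeval.

```matlab
% Hypothetical parameters, inputs, and targets for a linear model.
w = dlarray([0.5; -1]);
X = dlarray([1 2; 3 4; 5 6]);
T = dlarray([1; 2; 3]);

% Wrap the loss function once, then reuse the accelerated function.
accfun = dlaccelerate(@modelLoss);

for iter = 1:3
    [loss, grad] = dlfeval(accfun, w, X, T);
    w = w - 0.01*grad;    % plain gradient descent step
end

function [loss, grad] = modelLoss(w, X, T)
    Y = X*w;                            % model predictions
    loss = mean((Y - T).^2, 'all');     % scalar mean squared error
    grad = dlgradient(loss, w);
end
```

Acceleration is most likely to help when the same function is called many times with inputs of the same size and format, as in a training loop.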
Extended Capabilities
Version History
Introduced in R2019b
See Also
dlarray | dlfeval | dlnetwork | dljacobian | dldivergence | dllaplacian | dlaccelerate

