Using the Levenberg–Marquardt optimizer in deep learning

xingxingcui il 5 Set 2025

Since R2024b, a Levenberg–Marquardt solver (TrainingOptionsLM) was introduced. The built‑in function trainnet now accepts training options via the trainingOptions function (https://www.mathworks.com/help/deeplearning/ref/trainingoptions.html#bu59f0q-2) and supports the LM algorithm. I have been curious how to use it in deep learning, and the official documentation has not provided a concrete usage example so far. Below I give a simple example to illustrate how to use this LM algorithm to optimize a small number of learnable parameters.
For example, consider the nonlinear function:
y_hat = @(a,t) a(1)*(t/100) + a(2)*(t/100).^2 + a(3)*(t/100).^3 + a(4)*(t/100).^4;
It represents a curve. Given 100 matching points (t → y_hat), we want to use least squares to estimate the four parameters a1​–a4​.
t = (1:100)';
y_hat = @(a,t)a(1)*(t/100) + a(2)*(t/100).^2 + a(3)*(t/100).^3 + a(4)*(t/100).^4;
x_true = [ 20 ; 10 ; 1 ; 50 ];
y_true = y_hat(x_true,t);
plot(t,y_true,'o-')
  • Using the traditional lsqcurvefit-wrapped "Levenberg–Marquardt" algorithm:
x_guess = [ 5 ; 2 ; 0.2 ; -10 ];
options = optimoptions("lsqcurvefit",Algorithm="levenberg-marquardt",MaxFunctionEvaluations=800);
[x,resnorm,residual,exitflag] = lsqcurvefit(y_hat,x_guess,t,y_true,-50*ones(4,1),60*ones(4,1),options);
Local minimum found. Optimization completed because the size of the gradient is less than 1e-4 times the value of the function tolerance.
x,resnorm,exitflag
x = 4×1
20.0000 10.0000 1.0000 50.0000
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
resnorm = 9.7325e-20
exitflag = 1
  • Using the deep-learning-wrapped "Levenberg–Marquardt" algorithm:
options = trainingOptions("lm", ...
InitialDampingFactor=0.002, ...
MaxDampingFactor=1e9, ...
DampingIncreaseFactor=12, ...
DampingDecreaseFactor=0.2,...
GradientTolerance=1e-6, ...
StepTolerance=1e-6,...
Plots="training-progress");
numFeatures = 1;
layers = [featureInputLayer(numFeatures,'Name','input')
fitCurveLayer(Name='fitCurve')];
net = dlnetwork(layers);
XData = dlarray(t);
YData = dlarray(y_true);
netTrained = trainnet(XData,YData,net,"mse",options);
Iteration TimeElapsed TrainingLoss GradientNorm StepNorm _________ ___________ ____________ ____________ ________ 1 00:00:03 0.35754 0.053592 39.649
Warning: Error occurred while executing the listener callback for event LogUpdate defined for class deep.internal.train.SerialMetricManager:
Error using matlab.internal.capability.Capability.require (line 94)
This functionality is not available on remote platforms.

Error in matlab.ui.internal.uifigureImpl (line 33)
Capability.require(Capability.WebWindow);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in uifigure (line 34)
window = matlab.ui.internal.uifigureImpl(false, varargin{:});
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deepmonitor.internal.DLTMonitorView/createGUIComponents (line 167)
this.Figure = uifigure("Tag", "DEEPMONITOR_UIFIGURE");
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deepmonitor.internal.DLTMonitorView (line 123)
this.createGUIComponents();
^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deepmonitor.internal.DLTMonitorFactory/createStandaloneView (line 8)
view = deepmonitor.internal.DLTMonitorView(model, this);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.TrainingProgressMonitor/set.Visible (line 224)
this.View = this.Factory.createStandaloneView(this.Model);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.MonitorConfiguration/updateMonitor (line 173)
monitor.Visible = true;
^^^^^^^^^^^^^^^
Error in deep.internal.train.MonitorConfiguration>@(logger,evtData)weakThis.Handle.updateMonitor(evtData,visible) (line 154)
this.Listeners{end+1} = listener(logger,'LogUpdate',@(logger,evtData) weakThis.Handle.updateMonitor(evtData,visible));
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.SerialMetricManager/notifyLogUpdate (line 28)
notify(this,'LogUpdate',eventData);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.MetricManager/evaluateMetricsAndSendLogUpdate (line 177)
notifyLogUpdate(this, logUpdateEventData);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.setupTrainnet>iEvaluateMetricsAndSendLogUpdate (line 140)
evaluateMetricsAndSendLogUpdate(metricManager, evtData);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.setupTrainnet>@(source,evtData)iEvaluateMetricsAndSendLogUpdate(source,evtData,metricManager) (line 125)
addlistener(trainer,'IterationEnd',@(source,evtData) iEvaluateMetricsAndSendLogUpdate(source,evtData,metricManager));
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.BatchTrainer/notifyIterationAndEpochEnd (line 189)
notify(trainer,'IterationEnd',data);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.FullBatchTrainer/computeBatchTraining (line 112)
notifyIterationAndEpochEnd(trainer, matlab.lang.internal.move(data));
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.BatchTrainer/computeTraining (line 144)
net = computeBatchTraining(trainer, net, mbq);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.Trainer/train (line 67)
net = computeTraining(trainer, net, mbq);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.train.train (line 30)
net = train(trainer, net, mbq);
^^^^^^^^^^^^^^^^^^^^^^^^
Error in trainnet (line 51)
[varargout{1:nargout}] = deep.internal.train.train(mbq, net, loss, options, ...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in LiveEditorEvaluationHelperEeditorId (line 27)
netTrained = trainnet(XData,YData,net,"mse",options);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in connector.internal.fevalMatlab

Error in connector.internal.fevalJSON
7 00:00:04 5.3382e-10 1.4371e-07 0.43992 Training stopped: Gradient tolerance reached
netTrained.Layers(2)
ans =
fitCurveLayer with properties: Name: 'fitCurve' Learnable Parameters a1: 20.0007 a2: 9.9957 a3: 1.0072 a4: 49.9962 State Parameters No properties. Use properties method to see a list of all properties.
classdef fitCurveLayer < nnet.layer.Layer ...
& nnet.layer.Acceleratable
% Example custom SReLU layer.
properties (Learnable)
% Layer learnable parameters
a1
a2
a3
a4
end
methods
function layer = fitCurveLayer(args)
arguments
args.Name = "lm_fit";
end
% Set layer name.
layer.Name = args.Name;
% Set layer description.
layer.Description = "fit curve layer";
end
function layer = initialize(layer,~)
% layer = initialize(layer,layout) initializes the layer
% learnable parameters using the specified input layout.
if isempty(layer.a1)
layer.a1 = rand();
end
if isempty(layer.a2)
layer.a2 = rand();
end
if isempty(layer.a3)
layer.a3 = rand();
end
if isempty(layer.a4)
layer.a4 = rand();
end
end
function Y = predict(layer, X)
% Y = predict(layer, X) forwards the input data X through the
% layer and outputs the result Y.
% Y = layer.a1.*exp(-X./layer.a2) + layer.a3.*X.*exp(-X./layer.a4);
Y = layer.a1*(X/100) + layer.a2*(X/100).^2 + layer.a3*(X/100).^3 + layer.a4*(X/100).^4;
end
end
end
The network is very simple — only the fitCurveLayer defines the learnable parameters a1–a4. I observed that the output values are very close to those from lsqcurvefit.