Hessian Multiply Function for fmincon interior-point Algorithm
This example shows how to use a Hessian multiply function with the fmincon "interior-point" algorithm. A Hessian multiply function can save memory and time by returning the product of the Hessian of the Lagrangian with an arbitrary vector, without forming the full Hessian. The syntax of a Hessian multiply function is
W = HessMultFcn(x,lambda,v)
x is the current point, lambda is the Lagrange multiplier structure, and v is a vector that fmincon passes to HessMultFcn. The output W that the function calculates should be H*v, where H is the Hessian of the Lagrangian:

$$\nabla_{xx}^2 L(x,\lambda) = \nabla^2 f(x) + \sum_i \lambda_i \nabla^2 c_i(x) + \sum_j \lambda_j \nabla^2 ceq_j(x).$$

Here, $f(x)$ is the objective function, $c(x)$ is the nonlinear inequality constraint function that is nonpositive when the constraints are satisfied, and $ceq(x)$ is the nonlinear equality constraint function that is zero when the constraints are satisfied.
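Before writing a problem-specific multiply function, it can help to see how to validate one. The following sketch is not part of the original example; the quadratic test problem and all names in it are hypothetical. It checks a Hessian-times-vector function against a finite difference of the Lagrangian gradient, using the approximation $Hv \approx (\nabla L(x+hv) - \nabla L(x))/h$:

```matlab
% Hypothetical test problem: f(x) = x'*D*x/2 with constraint c(x) = x'*x - 1 <= 0
D = diag([1;4]);                    % objective Hessian for this toy problem
gf = @(x) D*x;                      % gradient of the objective
gc = @(x) 2*x;                      % gradient of the constraint
lam = 0.3;                          % sample value playing the role of lambda.ineqnonlin
gradL = @(x) gf(x) + lam*gc(x);     % gradient of the Lagrangian

Hmv = @(x,v) D*v + lam*2*v;         % exact Hessian of the Lagrangian times v

x = [0.5; -0.2]; v = [1; 2]; h = 1e-6;
fd = (gradL(x + h*v) - gradL(x))/h; % finite-difference approximation to H*v
err = norm(Hmv(x,v) - fd)           % a small value confirms the multiply function
```

For this quadratic Lagrangian the finite difference is exact up to rounding, so err is near machine precision; for general nonlinear problems expect agreement only to roughly the order of h.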
Objective and Constraint Functions
Use a multidimensional version of the Rosenbrock function for an even number of dimensions $n$:

$$f(x) = \sum_{i=1}^{n/2} \left[ (1 - x_{2i-1})^2 + 100\left(x_{2i} - x_{2i-1}^2\right)^2 \right].$$

The gradient of $f(x)$ is

$$\frac{\partial f}{\partial x_{2i-1}} = -2(1 - x_{2i-1}) - 400\,x_{2i-1}\left(x_{2i} - x_{2i-1}^2\right), \qquad \frac{\partial f}{\partial x_{2i}} = 200\left(x_{2i} - x_{2i-1}^2\right).$$
The nonlinear constraint is that the sum of the squares of the components of x is no more than $n/2$:

$$c(x) = \sum_{i=1}^{n} x_i^2 - \frac{n}{2} \le 0.$$

The gradient of this constraint function is $\nabla c(x) = 2x$.
Express these functions as MATLAB programs.
function [f,g] = multirosenbrock(x)
% Get the problem size
n = length(x);
if mod(n,2) ~= 0
    error('Input vector, x, must have an even number of components.');
end
% Evaluate the vector function
odds  = (1:2:n)';
evens = (2:2:n)';
F = zeros(n,1);
F(odds,1)  = 1 - x(odds);
F(evens,1) = 10.*(x(evens) - x(odds).^2);
f = sum(F.^2);
if nargout >= 2 % Calculate gradient
    g = zeros(n,1);
    g(evens) = 200*(x(evens) - x(odds).^2);
    g(odds)  = -2*(1 - x(odds)) - 400.*(x(evens) - x(odds).^2).*x(odds);
end
end

function [ineqnonlin,eqnonlin,gradineq,gradeq] = cons(x)
eqnonlin = [];
N = length(x);
ineqnonlin = sum(x.^2) - N/2;
if nargout > 2
    gradeq = [];
    gradineq = 2*x;
end
end
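As a quick sanity check (not part of the original example), you can evaluate these functions at the unconstrained minimizer x = ones(n,1). The values confirm the gradient formulas and show why the constraint matters: the point of all ones violates $c(x) \le 0$, so the constrained optimum cannot attain $f = 0$.

```matlab
% Hypothetical check, assuming multirosenbrock and cons are on the MATLAB path
n = 4;
xstar = ones(n,1);          % unconstrained minimizer of the objective
[f,g] = multirosenbrock(xstar)
% f is 0 and g is a vector of zeros at this point
[c,ceq] = cons(xstar)
% c = n - n/2 = 2 > 0, so x = ones(n,1) is infeasible,
% which is why the constrained minimum has f > 0
```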
Hessian and Hessian Multiply Function
The Hessian of the objective function is a block diagonal matrix with $2\times 2$ diagonal blocks

$$\begin{bmatrix} 2 - 400\,x_{2i} + 1200\,x_{2i-1}^2 & -400\,x_{2i-1} \\ -400\,x_{2i-1} & 200 \end{bmatrix}, \quad i = 1,\ldots,n/2.$$

The Hessian of the constraint function $c(x)$ is twice the identity matrix.
Write a function HessMultFcn that returns the result $W = Hv$ for a vector $v$, where $H$ is the Hessian of the Lagrangian.
function W = HessMultFcn(x,lambda,v)
v = v(:);
n = length(x);
x = x(:);
if mod(n,2) ~= 0
    error('Input vector, x, must have an even number of components.');
end
m = n/2;
j = (1:m)';
O = 2*j - 1; % Odd indices
E = 2*j;     % Even indices
tmp1 = 2 - 400*x(E) + 1200*x(O).^2;
tmp2 = -400*x(O);
HessfTimesV = zeros(n,1);
HessfTimesV(O) = tmp1.*v(O) + tmp2.*v(E);
HessfTimesV(E) = tmp2.*v(O) + 200*v(E);
% Return Hessian of the Lagrangian times v
W = HessfTimesV + 2*lambda.ineqnonlin*v;
end
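As a further check (added here, not in the original example), for a small n you can assemble the Hessian of the Lagrangian explicitly as a sparse matrix and compare its product with the output of HessMultFcn:

```matlab
% Hypothetical test, assuming HessMultFcn is on the MATLAB path
n = 6;                                % small even problem size for the test
rng(1)                                % reproducible test data
x = randn(n,1); v = randn(n,1);
lambda.ineqnonlin = 0.5;              % arbitrary multiplier value for the test

O = (1:2:n)'; E = (2:2:n)';           % odd and even index sets
d = zeros(n,1);
d(O) = 2 - 400*x(E) + 1200*x(O).^2;   % diagonal entries, odd indices
d(E) = 200;                           % diagonal entries, even indices
offd = -400*x(O);                     % off-diagonal entries of each 2x2 block
H = spdiags(d,0,n,n) + sparse(O,E,offd,n,n) + sparse(E,O,offd,n,n) ...
    + 2*lambda.ineqnonlin*speye(n);   % add the constraint Hessian contribution

err = norm(H*v - HessMultFcn(x,lambda,v)) % should be 0 to rounding error
```

Because the explicit matrix requires O(n) nonzeros here, this check stays cheap even for moderately large n, but the point of the multiply function is to avoid forming H at all in the solver.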
Difficulty Without Setting Options
The default fmincon options cause the solver to fail to solve this large problem.
Set the number of variables to 1000.
n = 1000;
Create a pseudorandom start point x0 with n components.
rng default % For reproducibility
x0 = randn(n,1);
Solve the problem using the default options.
[x1,fval1,eflag1,output1] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons)
Solver stopped prematurely. fmincon stopped because it exceeded the function evaluation limit, options.MaxFunctionEvaluations = 3.000000e+03.
x1 = 1000×1
0.9958
1.9081
-0.2722
1.5397
0.2169
-1.2915
-0.5438
0.3822
-4.4181
4.7854
-2.0833
3.4741
0.7097
0.0370
0.6668
⋮
fval1 = 6.3932e+04
eflag1 = 0
output1 = struct with fields:
iterations: 2
funcCount: 3010
constrviolation: 312.2817
stepsize: 7.9055
algorithm: 'interior-point'
firstorderopt: 2.6050e+04
cgiterations: 2
message: 'Solver stopped prematurely.↵↵fmincon stopped because it exceeded the function evaluation limit,↵options.MaxFunctionEvaluations = 3.000000e+03.'
bestfeasible: []
To attempt to get a better solution, set the SpecifyObjectiveGradient and SpecifyConstraintGradient options to true and try again. Time the solver.
opts1 = optimoptions("fmincon",...
    SpecifyObjectiveGradient=true,...
    SpecifyConstraintGradient=true);
tic
[x1,fval1,eflag1,output1] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons,opts1)
Solver stopped prematurely. fmincon stopped because it exceeded the function evaluation limit, options.MaxFunctionEvaluations = 3.000000e+03.
x1 = 1000×1
0.7890
0.6218
0.7836
0.6132
0.7863
0.6175
0.7863
0.6174
0.7848
0.6152
0.7847
0.6151
0.7866
0.6179
0.7866
⋮
fval1 = 22.8431
eflag1 = 0
output1 = struct with fields:
iterations: 651
funcCount: 3000
constrviolation: 0
stepsize: 0.0148
algorithm: 'interior-point'
firstorderopt: 0.2428
cgiterations: 6657
message: 'Solver stopped prematurely.↵↵fmincon stopped because it exceeded the function evaluation limit,↵options.MaxFunctionEvaluations = 3.000000e+03.'
bestfeasible: [1×1 struct]
toc
Elapsed time is 75.380747 seconds.
The solver does not reach an accurate solution and takes a great deal of time.
Solve Nonlinear Problem With Constraint Using Hessian Multiply Function
To solve a problem using a Hessian multiply function, set the appropriate options: the "interior-point" algorithm with the "cg" subproblem algorithm, supplied objective and constraint gradients, and the HessianMultiplyFcn option.
options = optimoptions("fmincon",...
    Algorithm="interior-point",...
    SpecifyObjectiveGradient=true,...
    SpecifyConstraintGradient=true,...
    SubproblemAlgorithm="cg",...
    HessianMultiplyFcn=@HessMultFcn);
Minimize the multirosenbrock function subject to the nonlinear constraint.
tic
[x,fval,eflag,output] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons,options)
Local minimum found that satisfies the constraints. Optimization completed because the objective function is non-decreasing in feasible directions, to within the value of the optimality tolerance, and constraints are satisfied to within the value of the constraint tolerance. <stopping criteria details>
x = 1000×1
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
⋮
fval = 22.8374
eflag = 1
output = struct with fields:
iterations: 28
funcCount: 870
constrviolation: 0
stepsize: 3.8640e-07
algorithm: 'interior-point'
firstorderopt: 2.0000e-08
cgiterations: 788
message: 'Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 2.000017e-08,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.'
bestfeasible: [1×1 struct]
tmult = toc
tmult = 0.2230
fmincon solves the 1000-variable problem in under a second, using about 30 iterations and 900 function evaluations.
Compare to LBFGS Hessian
The fmincon "lbfgs" Hessian approximation provides an alternative approach to saving memory that does not require you to create a Hessian multiply function.
optslbfgs = optimoptions("fmincon",SpecifyObjectiveGradient=true,...
    SpecifyConstraintGradient=true,HessianApproximation="lbfgs");
tic
[x2,fval2,eflag2,output2] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons,optslbfgs)
Local minimum found that satisfies the constraints. Optimization completed because the objective function is non-decreasing in feasible directions, to within the value of the optimality tolerance, and constraints are satisfied to within the value of the constraint tolerance. <stopping criteria details>
x2 = 1000×1
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
0.6177
0.7864
⋮
fval2 = 22.8374
eflag2 = 1
output2 = struct with fields:
iterations: 206
funcCount: 287
constrviolation: 0
stepsize: 7.2407e-08
algorithm: 'interior-point'
firstorderopt: 8.0768e-07
cgiterations: 28
message: 'Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 8.076780e-07,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.'
bestfeasible: [1×1 struct]
tlbfgs = toc
tlbfgs = 0.7399
Compared to the Hessian multiply function, in this case the LBFGS Hessian approximation:
Returns essentially the same point and function value
Takes more iterations
Takes fewer function evaluations
Has a larger final first-order optimality measure
Takes about three times longer to finish
These comparative results can vary, depending on the problem.