
Hessian Multiply Function for fmincon interior-point Algorithm

This example shows how to use a Hessian multiply function with the fmincon "interior-point" algorithm. A Hessian multiply function can save memory and time by returning the product of the Hessian of the Lagrangian with an arbitrary vector, without forming the full Hessian. The syntax of a Hessian multiply function is

W = HessMultFcn(x,lambda,v)

x is the current point, lambda is the Lagrange multiplier structure, and v is a vector that fmincon passes to HessMultFcn. The output W that the function calculates should be H*v, where H is the Hessian of the Lagrangian:

$$\nabla_{xx}^2 L(x,\lambda)=\nabla^2 f(x)+\sum_i \lambda_{\mathrm{ineq}(i)}\,\nabla^2 \mathrm{ineqnonlin}_i(x)+\sum_i \lambda_{\mathrm{eq}(i)}\,\nabla^2 \mathrm{eqnonlin}_i(x).$$

Here, f(x) is the objective function, ineqnonlin(x) is the nonlinear inequality constraint function that is nonpositive when the constraints are satisfied, and eqnonlin(x) is the nonlinear equality constraint function that is zero when the constraints are satisfied.

Objective and Constraint Functions

Use a multidimensional version of the Rosenbrock function for an even number of dimensions n:

$$f(x)=\sum_{i=1}^{n/2}\left[\left(10\left(x_{2i}-x_{2i-1}^2\right)\right)^2+\left(1-x_{2i-1}\right)^2\right].$$

The gradient of f is

$$\frac{\partial f}{\partial x_{2i-1}}=-2\left(1-x_{2i-1}\right)-400\,x_{2i-1}\left(x_{2i}-x_{2i-1}^2\right)$$

$$\frac{\partial f}{\partial x_{2i}}=200\left(x_{2i}-x_{2i-1}^2\right)\qquad\text{for } i=1,\dots,n/2.$$

The nonlinear constraint is that the sum of the squares of x is no more than n/2:

$$\mathrm{ineqnonlin}(x)=\sum_{i=1}^{n}x_i^2-\frac{n}{2}.$$

The gradient of this function is 2x.

Express these functions as MATLAB programs.

function [f,g] = multirosenbrock(x)
% Get the problem size
n = length(x);  
if mod(n,2) ~= 0
    error('Input vector, x, must have an even number of components.');
end
% Evaluate the vector function
odds  = (1:2:n)';
evens = (2:2:n)';
F = zeros(n,1);
F(odds,1)  = 1-x(odds);
F(evens,1) = 10.*(x(evens)-x(odds).^2); 
f = sum(F.^2);
if nargout >= 2 % Calculate gradient
    g = zeros(n,1);
    g(evens) = 200*(x(evens)-x(odds).^2);
    g(odds) = -2*(1 - x(odds)) - 400.*(x(evens)-x(odds).^2).*x(odds);
end
end
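As a sanity check on the gradient formulas above, the following Python/NumPy sketch (not part of the MATLAB example; indexing is translated to Python's 0-based convention) re-implements multirosenbrock and compares the analytic gradient against central finite differences.

```python
import numpy as np

def multirosenbrock(x):
    """Objective value and gradient, mirroring the MATLAB multirosenbrock."""
    n = x.size
    odds = np.arange(0, n, 2)    # MATLAB indices 1,3,... (x_{2i-1})
    evens = np.arange(1, n, 2)   # MATLAB indices 2,4,... (x_{2i})
    F = np.zeros(n)
    F[odds] = 1 - x[odds]
    F[evens] = 10 * (x[evens] - x[odds]**2)
    f = np.sum(F**2)
    g = np.zeros(n)
    g[evens] = 200 * (x[evens] - x[odds]**2)
    g[odds] = -2 * (1 - x[odds]) - 400 * (x[evens] - x[odds]**2) * x[odds]
    return f, g

rng = np.random.default_rng(0)
x = rng.standard_normal(6)
f, g = multirosenbrock(x)

# Central-difference approximation of the gradient
h = 1e-6
g_fd = np.zeros_like(x)
for i in range(x.size):
    e = np.zeros_like(x)
    e[i] = h
    g_fd[i] = (multirosenbrock(x + e)[0] - multirosenbrock(x - e)[0]) / (2 * h)

print(np.max(np.abs(g - g_fd)))  # small (finite-difference accuracy)
```

The analytic gradient agrees with the finite-difference estimate, and the objective is zero at the unconstrained minimizer x = (1, 1, ..., 1).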

function [ineqnonlin,eqnonlin,gradineq,gradeq] = cons(x)
eqnonlin = [];
N = length(x);
ineqnonlin = sum(x.^2) - N/2;
if nargout > 2
    gradeq = [];
    gradineq = 2*x;
end
end

Hessian and Hessian Multiply Function

The Hessian of the objective function f is a block diagonal matrix with entries

$$\nabla^2 f(x)=\begin{bmatrix}
2-400x_2+1200x_1^2 & -400x_1 & & & \\
-400x_1 & 200 & & & \\
 & & \ddots & & \\
 & & & 2-400x_n+1200x_{n-1}^2 & -400x_{n-1} \\
 & & & -400x_{n-1} & 200
\end{bmatrix}.$$

The Hessian $\nabla^2\,\mathrm{ineqnonlin}(x)$ is twice the identity matrix.

Write a function HessMultFcn that returns the result $\nabla^2 f(x)\,v + 2\,\texttt{lambda.ineqnonlin}\,v$ for a vector v.

function W = HessMultFcn(x,lambda,v)
v = v(:);
n = length(x);
x = x(:);
if mod(n,2) ~= 0
    error('Input vector, x, must have an even number of components.');
end

m = n/2;
j = (1:m)';
O = 2*j-1;
E = 2*j;

tmp1 = 2 - 400*x(E) + 1200*x(O).^2;
tmp2 = -400*x(O);
HessfTimesV = zeros(n, 1);
HessfTimesV(O) = tmp1.*v(O) + tmp2.*v(E);
HessfTimesV(E) = tmp2.*v(O) + 200*v(E);

% Return Hessian times v
W = HessfTimesV + 2*lambda.ineqnonlin*v;
end
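To confirm that the multiply function computes H*v without ever forming H, this Python/NumPy sketch (a 0-based translation, not part of the original MATLAB example) compares it against an explicitly assembled block-diagonal Hessian of the Lagrangian on a small problem.

```python
import numpy as np

def hess_mult(x, lam_ineq, v):
    """Mirror of the MATLAB HessMultFcn: (Hess f + 2*lambda*I) @ v, matrix-free."""
    n = x.size
    O = np.arange(0, n, 2)   # MATLAB odd indices (x_{2i-1})
    E = np.arange(1, n, 2)   # MATLAB even indices (x_{2i})
    tmp1 = 2 - 400 * x[E] + 1200 * x[O]**2
    tmp2 = -400 * x[O]
    W = np.zeros(n)
    W[O] = tmp1 * v[O] + tmp2 * v[E]
    W[E] = tmp2 * v[O] + 200 * v[E]
    return W + 2 * lam_ineq * v

def full_hessian(x, lam_ineq):
    """Assemble the full Hessian of the Lagrangian explicitly, for comparison."""
    n = x.size
    H = np.zeros((n, n))
    for k in range(0, n, 2):   # one 2x2 block per (odd, even) variable pair
        H[k, k] = 2 - 400 * x[k + 1] + 1200 * x[k]**2
        H[k, k + 1] = -400 * x[k]
        H[k + 1, k] = -400 * x[k]
        H[k + 1, k + 1] = 200
    return H + 2 * lam_ineq * np.eye(n)

rng = np.random.default_rng(1)
x = rng.standard_normal(8)
v = rng.standard_normal(8)
lam = 0.7   # stand-in for lambda.ineqnonlin
print(np.max(np.abs(hess_mult(x, lam, v) - full_hessian(x, lam) @ v)))
```

The two results agree to rounding error, while the matrix-free version stores only O(n) numbers instead of an n-by-n matrix (or its sparse equivalent).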

Difficulty Without Setting Options

The default fmincon options cause the solver to fail to solve this large problem.

Set the number of variables n to 1000.

n = 1000;

Create a pseudorandom start point x0 with n components.

rng default % For reproducibility
x0 = randn(n,1);

Solve the problem using the default options.

[x1,fval1,eflag1,output1] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons)
Solver stopped prematurely.

fmincon stopped because it exceeded the function evaluation limit,
options.MaxFunctionEvaluations = 3.000000e+03.
x1 = 1000×1

    0.9958
    1.9081
   -0.2722
    1.5397
    0.2169
   -1.2915
   -0.5438
    0.3822
   -4.4181
    4.7854
   -2.0833
    3.4741
    0.7097
    0.0370
    0.6668
      ⋮

fval1 = 
6.3932e+04
eflag1 = 
0
output1 = struct with fields:
         iterations: 2
          funcCount: 3010
    constrviolation: 312.2817
           stepsize: 7.9055
          algorithm: 'interior-point'
      firstorderopt: 2.6050e+04
       cgiterations: 2
            message: 'Solver stopped prematurely.↵↵fmincon stopped because it exceeded the function evaluation limit,↵options.MaxFunctionEvaluations = 3.000000e+03.'
       bestfeasible: []

To attempt to get a better solution, set the SpecifyObjectiveGradient and SpecifyConstraintGradient options to true and try again. Time the solver.

opts1 = optimoptions("fmincon",...
    SpecifyObjectiveGradient=true,...
    SpecifyConstraintGradient=true);
tic
[x1,fval1,eflag1,output1] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons,opts1)
Solver stopped prematurely.

fmincon stopped because it exceeded the function evaluation limit,
options.MaxFunctionEvaluations = 3.000000e+03.
x1 = 1000×1

    0.7890
    0.6218
    0.7836
    0.6132
    0.7863
    0.6175
    0.7863
    0.6174
    0.7848
    0.6152
    0.7847
    0.6151
    0.7866
    0.6179
    0.7866
      ⋮

fval1 = 
22.8431
eflag1 = 
0
output1 = struct with fields:
         iterations: 651
          funcCount: 3000
    constrviolation: 0
           stepsize: 0.0148
          algorithm: 'interior-point'
      firstorderopt: 0.2428
       cgiterations: 6657
            message: 'Solver stopped prematurely.↵↵fmincon stopped because it exceeded the function evaluation limit,↵options.MaxFunctionEvaluations = 3.000000e+03.'
       bestfeasible: [1×1 struct]

toc
Elapsed time is 75.380747 seconds.

Even with gradients supplied, the solver does not reach an accurate solution and takes a great deal of time.

Solve Nonlinear Problem With Constraint Using Hessian Multiply Function

To solve a problem using a Hessian multiply function, specify the "interior-point" algorithm, supply the objective and constraint gradients, set the subproblem algorithm to "cg", and set the HessianMultiplyFcn option.

options = optimoptions("fmincon",...
    Algorithm="interior-point",...
    SpecifyObjectiveGradient=true,...
    SpecifyConstraintGradient=true,...
    SubproblemAlgorithm="cg",...
    HessianMultiplyFcn=@HessMultFcn);

Minimize the multirosenbrock function subject to the nonlinear constraint.

tic
[x,fval,eflag,output] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons,options)
Local minimum found that satisfies the constraints.

Optimization completed because the objective function is non-decreasing in 
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.

<stopping criteria details>
x = 1000×1

    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
      ⋮

fval = 
22.8374
eflag = 
1
output = struct with fields:
         iterations: 28
          funcCount: 870
    constrviolation: 0
           stepsize: 3.8640e-07
          algorithm: 'interior-point'
      firstorderopt: 2.0000e-08
       cgiterations: 788
            message: 'Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 2.000017e-08,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.'
       bestfeasible: [1×1 struct]

tmult = toc
tmult = 
0.2230

fmincon solves the 1000-variable problem using about 30 iterations and 900 function evaluations, in a fraction of a second.

Compare to LBFGS Hessian

The fmincon "lbfgs" Hessian approximation provides an alternative approach to saving memory that does not require you to create a Hessian multiply function.

optslbfgs = optimoptions("fmincon",SpecifyObjectiveGradient=true,...
    SpecifyConstraintGradient=true,HessianApproximation="lbfgs");
tic
[x2,fval2,eflag2,output2] = fmincon(@multirosenbrock,x0,[],[],[],[],[],[],@cons,optslbfgs)
Local minimum found that satisfies the constraints.

Optimization completed because the objective function is non-decreasing in 
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.

<stopping criteria details>
x2 = 1000×1

    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
    0.6177
    0.7864
      ⋮

fval2 = 
22.8374
eflag2 = 
1
output2 = struct with fields:
         iterations: 206
          funcCount: 287
    constrviolation: 0
           stepsize: 7.2407e-08
          algorithm: 'interior-point'
      firstorderopt: 8.0768e-07
       cgiterations: 28
            message: 'Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 8.076780e-07,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.'
       bestfeasible: [1×1 struct]

tlbfgs = toc
tlbfgs = 
0.7399

In this case, compared to the Hessian multiply function, the LBFGS Hessian approximation:

  • Returns essentially the same point and function value

  • Takes more iterations

  • Takes fewer function evaluations

  • Has a larger final first-order optimality measure

  • Takes about three times longer to finish

These comparative results can differ for other problems.
