# Minimization with Dense Structured Hessian, Linear Equalities

### Hessian Multiply Function for Lower Memory

The `fmincon` `interior-point` and `trust-region-reflective` algorithms, and the `fminunc` `trust-region` algorithm, can solve problems where the Hessian is dense but structured. For these problems, `fmincon` and `fminunc` do not compute `H*Y` with the Hessian `H` directly, because forming `H` would be memory-intensive. Instead, you must provide `fmincon` or `fminunc` with a function that, given a matrix `Y` and information about `H`, computes `W = H*Y`.

In this example, the objective function is nonlinear and linear equalities exist so `fmincon` is used. The description applies to the `trust-region reflective` algorithm; the `fminunc` `trust-region` algorithm is similar. For the `interior-point` algorithm, see the `HessianMultiplyFcn` option in Hessian Multiply Function. The objective function has the structure

`$f\left(x\right)=\underset{}{\overset{ˆ}{f}}\left(x\right)-\frac{1}{2}{x}^{T}V{V}^{T}x,$`

where $V$ is a 1000-by-2 matrix. The Hessian of $f$ is dense, but the Hessian of $\underset{}{\overset{ˆ}{f}}$ is sparse. If the Hessian of $\underset{}{\overset{ˆ}{f}}$ is $\underset{}{\overset{ˆ}{H}}$, then $H$, the Hessian of $f$, is

`$H=\underset{}{\overset{ˆ}{H}}-V{V}^{T}.$`

To avoid excessive memory usage that could happen by working with $H$ directly, the example provides a Hessian multiply function, `hmfleq1`. This function, when passed a matrix `Y`, uses sparse matrices `Hinfo`, which corresponds to $\underset{}{\overset{ˆ}{H}}$, and `V` to compute the Hessian matrix product

`W = H*Y = (Hinfo - V*V')*Y.`

In this example, the Hessian multiply function needs $\underset{}{\overset{ˆ}{H}}$, and `V` to compute the Hessian matrix product. `V` is a constant, so you can capture `V` in a function handle to an anonymous function.

However, $\underset{}{\overset{ˆ}{H}}$ is not a constant and must be computed at the current `x`. You can do this by computing $\underset{}{\overset{ˆ}{H}}$ in the objective function and returning $\underset{}{\overset{ˆ}{H}}$ as `Hinfo` in the third output argument. By using `optimoptions` to set the `HessianMultiplyFcn` option to a handle to the `hmfleq1` Hessian multiply function listed here, `fmincon` knows to get the `Hinfo` value from the objective function and pass it to the Hessian multiply function `hmfleq1`.

#### Step 1: Write a file brownvv.m that computes the objective function, the gradient, and the sparse part of the Hessian.

The example passes `brownvv` to `fmincon` as the objective function. The `brownvv.m` file is long and is not listed here. You can view the code with the command

```type brownvv ```

Because `brownvv` computes the gradient as well as the objective function, the example (Step 3) uses `optimoptions` to set the `SpecifyObjectiveGradient` option to `true`.

#### Step 2: Write a function to compute Hessian-matrix products for H given a matrix Y.

Now, define a function `hmfleq1` that uses `Hinfo`, which is computed in `brownvv`, and `V`, which you can capture in a function handle to an anonymous function, to compute the Hessian matrix product `W` where `W = H*Y = (Hinfo - V*V')*Y`. This function must have the form

`W = hmfleq1(Hinfo,Y)`

The first argument must be the same as the third argument returned by the objective function `brownvv`. The second argument to the Hessian multiply function is the matrix `Y` (of `W = H*Y`).

Because `fmincon` expects the second argument `Y` to be used to form the Hessian matrix product, `Y` is always a matrix with `n` rows, where `n` is the number of dimensions in the problem. The number of columns in `Y` can vary. Finally, you can use a function handle to an anonymous function to capture `V`, so `V` can be the third argument to `hmfleq1`. This technique is described in more detail in Passing Extra Parameters. Here is a listing of the file `hmfleq1.m`.

`type hmfleq1.m`
```function W = hmfleq1(Hinfo,Y,V) %HMFLEQ1 Hessian-matrix product function for BROWNVV objective. % Documentation example. % W = hmfbx4(Hinfo,Y,V) computes W = (Hinfo-V*V')*Y % where Hinfo is a sparse matrix computed by BROWNVV % and V is a 2 column matrix. % Copyright 1984-2008 The MathWorks, Inc. W = Hinfo*Y - V*(V'*Y); ```

#### Step 3: Call a nonlinear minimization routine with a starting point and linear equality constraints.

Load the problem parameter, `V`, and the sparse equality constraint matrices, `Aeq` and `beq`, from `fleq1.mat`, which is available when you run this example. Use `optimoptions` to set the `SpecifyObjectiveGradient` option to `true` and to set the `HessianMultiplyFcn` option to a function handle that points to `hmfleq1`. Call `fmincon` with objective function `brownvv` and with `V` as an additional parameter. View the code that runs this procedure:

`type runfleq1.m`
```function [fval,exitflag,output,x] = runfleq1 % RUNFLEQ1 demonstrates 'HessMult' option for FMINCON with linear % equalities. problem = load('fleq1'); % Get V, Aeq, beq V = problem.V; Aeq = problem.Aeq; beq = problem.beq; n = 1000; % problem dimension xstart = -ones(n,1); xstart(2:2:n,1) = ones(length(2:2:n),1); % starting point options = optimoptions(@fmincon,... 'Algorithm','trust-region-reflective',... 'SpecifyObjectiveGradient',true, ... 'HessianMultiplyFcn',@(Hinfo,Y)hmfleq1(Hinfo,Y,V),... 'Display','iter',... 'OptimalityTolerance',1e-9,... 'FunctionTolerance',1e-9); [x,fval,exitflag,output] = fmincon(@(x)brownvv(x,V),xstart,[],[],Aeq,beq,[],[], ... [],options); ```

To run this code, enter

`[fval,exitflag,output,x] = runfleq1;`
``` Norm of First-order Iteration f(x) step optimality CG-iterations 0 2297.63 1.41e+03 1 1084.59 6.3903 578 1 2 1084.59 100 578 3 3 1084.59 25 578 0 4 1084.59 6.25 578 0 5 1047.61 1.5625 240 0 6 761.592 3.125 62.4 2 7 761.592 6.25 62.4 4 8 746.478 1.5625 163 0 9 546.578 3.125 84.1 2 10 274.311 6.25 26.9 2 11 55.6193 11.6597 40 2 12 55.6193 25 40 3 13 22.2964 6.25 26.3 0 14 -49.516 6.25 78 1 15 -93.2772 1.5625 68 1 16 -207.204 3.125 86.5 1 17 -434.162 6.25 70.7 1 18 -681.359 6.25 43.7 2 19 -681.359 6.25 43.7 4 20 -698.041 1.5625 191 0 21 -723.959 3.125 256 7 22 -751.33 0.78125 154 3 23 -793.974 1.5625 24.4 3 24 -820.831 2.51937 6.11 3 25 -823.069 0.562132 2.87 3 26 -823.237 0.196753 0.486 3 27 -823.245 0.0621202 0.386 3 28 -823.246 0.0199951 0.11 6 29 -823.246 0.00731333 0.0404 7 30 -823.246 0.00505883 0.0185 8 31 -823.246 0.00126471 0.00268 9 32 -823.246 0.00149326 0.00521 9 33 -823.246 0.000373314 0.00091 9 Local minimum possible. fmincon stopped because the final change in function value relative to its initial value is less than the value of the function tolerance. ```

Convergence is rapid for a problem of this size with the PCG iteration cost increasing modestly as the optimization progresses. Feasibility of the equality constraints is maintained at the solution.

```problem = load('fleq1'); % Get V, Aeq, beq V = problem.V; Aeq = problem.Aeq; beq = problem.beq; norm(Aeq*x-beq,inf)```
```ans = 2.3093e-14 ```

### Preconditioning

In this example, `fmincon` cannot use `H` to compute a preconditioner because `H` only exists implicitly. Instead of `H`, `fmincon` uses `Hinfo`, the third argument returned by `brownvv`, to compute a preconditioner. `Hinfo` is a good choice because it is the same size as `H` and approximates `H` to some degree. If `Hinfo` were not the same size as `H`, `fmincon` would compute a preconditioner based on some diagonal scaling matrices determined from the algorithm. Typically, this would not perform as well.