fmincon optimizes a non-differentiable function, How?

11 visualizzazioni (ultimi 30 giorni)
I am using sqp solver of fmincon to solve a bound-constrained optimization problem. I found that my objective function is continuous but non-differntiable at a few points (due to sharp corners). Still, fmincon optimized my objective and returned an optimal value as the corner point which has the lowest function value. While this makes sense as the optimal point of the function, my understanding was that fmincon should have thrown an error while trying to optimize a non-differentiable function!
As an example, I tried to optimize a simple continuous, non-differentiable objective function: over bounds , which is simply , non-differntiable at . However, using fmincon with sqp solver, providing different initial guesses , the problem converges with an optimal solution of . The exitflag returned was 1, which indicates that the optimization run stopped when the 1st order optimality measure at the iterate point was less than optimality tolerance ( by default). But I am unable to understand how the optimality criterion gets satisfed given that the function is non-differentiable at and the derivatives on either side of it are either or .
  1 Commento
Walter Roberson
Walter Roberson il 17 Ott 2022
The short summary is that programs are not required to detect and report every case in which the inputs violate the conditions under which the function works.
Detection of violations in numeric routines is generally not possible.
Suppose for example the function divides by the minimum complex component of the non-trivial roots of the zeta function. As far as anyone has been able to tell, the non-trivial roots all have complex component 1/2 (or is it negative 1/2?) But proving that has turned out to be exceedingly difficult. There is no realistic chance that any numeric optimization routine could implement a foolproof theorem prover to determine whether the function meets the mathematical requirements.

Accedi per commentare.

Risposta accettata

Matt J
Matt J il 16 Ott 2022
Modificato: Matt J il 16 Ott 2022
But I am unable to understand how the optimality criterion gets satisfed given that the function is non-differentiable at and the derivatives on either side of it are either -1 or +1
The derivatives are approximated by default with central finite differences which, because of the symmetry of the function abs(x), achieves low values around the minimum. With a more asymmetric example, like below, you can see that we cannot approach the minimum as closely with central finite differences as the tolerances request.
fun=@(x)5*x.*(x>=0)-x.*(x<0);
fplot(fun,[-1,1]);
opts=optimoptions(@fmincon,'Algorithm','sqp','FiniteDifferenceType','central',...
'OptimalityTol',1e-16,'FunctionTol',1e-16,'StepTol',1e-16);
[x,fval, exitflag]=fmincon(fun,-0.2,[],[],[],[],[],[],[],opts)
Solver stopped prematurely. fmincon stopped because it exceeded the function evaluation limit, options.MaxFunctionEvaluations = 1.000000e+02.
x = -6.0549e-06
fval = 6.0549e-06
exitflag = 0
  2 Commenti
Matt J
Matt J il 16 Ott 2022
But note that the behavior depends a lot on the algorithm and finite difference type as well. Below, we can see that convergence improves greatly with the default interior-point algorithm and using 'FiniteDifferenceType'='forward':
fun=@(x)5*x.*(x>=0)-x.*(x<0);
opts=optimoptions(@fmincon,'FiniteDifferenceType','forward',...
'OptimalityTol',1e-16,'FunctionTol',1e-16,'StepTol',1e-16);
[x,fval, exitflag]=fmincon(fun,-0.2,[],[],[],[],[],[],[],opts)
Local minimum possible. Constraints satisfied. fmincon stopped because the size of the current step is less than the value of the step size tolerance and constraints are satisfied to within the value of the constraint tolerance.
x = -1.2003e-17
fval = 1.2003e-17
exitflag = 2
Maaz Ahmad
Maaz Ahmad il 17 Ott 2022
@Matt J thanks for the explanation. I understand what's happening due to the type of function profile and finite differencing for gradient estimation, and it becomes clearer by @John D'Errico's answer as well that the iterate keeps fluctuating on either side of the corner point until it converges when it gets closer than finite difference step size such that gradient estimate gets nearly 0 (or due to step tolerance as shown by his example). 1 small correction to the answer - the default differencing method is forward and not central.

Accedi per commentare.

Più risposte (1)

John D'Errico
John D'Errico il 16 Ott 2022
Modificato: John D'Errico il 16 Ott 2022
No. It does NOT throw an error, though maybe sometimes a warning. It may often even survive as it did here. fmincon does not automatically know when a function is non-differentiable. It cannot know that, unless it happens to trip over such a problem point, where it decides something strange could be happening. Of course a discontinuity is worse than a point of non-differentiability.
Anyway, fmincon never LOOKS for points of non-differentiability. Identifying such a point takes effort that is not worth the time invested or the programming effort.
In the case of your test function, consider the history of points it evaluates the function at.
format long g
format compact
[x,f] = fmincon(@testobj,.2,[],[],[],[],0,1)
0.2 0.3 0.200000014901161 0.299999985098839 0.566306728082761 0.0663067280827607 0.566306742983922 0.0663067429839219 0.389195527131688 0.110804472868312 0.477751127607224 0.0222488723927757 0.477751142508385 0.0222488574916145 0.521787985350157 0.0217879853501575 0.521788000251319 0.0217880002513187 0.499780926040705 0.000219073959295102 0.499780940941866 0.000219059058133908 0.510690412709014 0.0106904127090141 0.50523566937486 0.00523566937485953 0.502508297707782 0.00250829770778216 0.501144611874244 0.00114461187424353 0.510690412709014 0.0106904127090141 0.50523566937486 0.00523566937485953 0.502508297707782 0.00250829770778216 0.501144611874244 0.00114461187424353 0.500462768957474 0.000462768957474213 0.50012184749909 0.000121847499089611 0.500121862400251 0.000121862400250805 0.499951419201842 4.85807981575048e-05 0.499951434103004 4.8565896996311e-05 0.500036630851094 3.66308510942881e-05 0.500036645752255 3.66457522554819e-05 0.499994025230116 5.97476988373202e-06 0.499994040131277 5.95986872253818e-06 0.500015327779386 1.532777938551e-05 0.500004676504751 4.67650475088899e-06 0.500004691405912 4.69140591208284e-06 0.499999350870199 6.49129800800452e-07 0.49999936577136 6.34228639606604e-07 0.500002013683185 2.01368318542006e-06 0.500000682276692 6.82276692254291e-07 0.500000016573446 1.65734457269195e-08 0.500000031474607 3.14746069207672e-08 0.499999683721907 3.1627809332635e-07 0.499999850147676 1.49852323827471e-07 0.499999933360561 6.663943902252e-08 0.499999974967003 2.50329966755558e-08 0.499999683721907 3.1627809332635e-07 0.499999850147676 1.49852323827471e-07 0.499999933360561 6.663943902252e-08 0.499999974967003 2.50329966755558e-08 0.499999995770225 4.22977547431813e-09 0.500000010671386 1.06713857750407e-08 0.499999979929403 2.00705974617854e-08 0.499999987849814 1.21501864680518e-08 0.499999991810019 8.18998097118495e-09 0.499999993790122 6.20987822275154e-09 0.499999979929403 2.00705974617854e-08 0.499999987849814 1.21501864680518e-08 0.499999991810019 8.18998097118495e-09 0.499999993790122 6.20987822275154e-09 0.499999994780173 5.21982684853484e-09 0.499999995275199 4.72480116142648e-09 0.499999995522712 4.47728831787231e-09 0.499999995646468 4.35353186833964e-09 0.499999995708346 4.29165369908446e-09 Local minimum possible. Constraints satisfied. fmincon stopped because the size of the current step is less than the value of the step size tolerance and constraints are satisfied to within the value of the constraint tolerance.
x = 0.499999995770225
f = 4.22977547431813e-09
function z = testobj(x)
z = sqrt((x-.5).^2);
disp([x,z])
end
So you see it generate one point, then use a finite difference to compute a derivative. Then it takes a step. If it has gone too far, it generates another derivative estimate, bounces backwards. In fact, it appears to be bouncing back and forth across the point of non-differentiability. This makes for slow convergence of course, BECAUSE it assumes the function is smooth and well-behaved. Typically we would expect quadratic convergence near a solution. That fails to happen of course here.
Does this mean that fmincon would be expected to work always, even on such a problem? Of course not. It can far too easily get stuck. With some time invested, I could probably come up with some function that causes it to oscillate infinitely between two points.
The point is, fmincon is designed to handle a specific class of problem. That is, continuously differentiable in the domain of the function. When that fails, fmincon can fail, sometimes in strange ways, and it usually does only when convergence is most important to someone. Conversely, the gods of computing insure that it survives on non-important problems as you have given here.
In the cases where you KNOW you have a non-differentiable objective, then you really want to use tools that are not sensitive to such issues. For example, use fminsearch instead, which does not care as much about the smoothness of your function (though smoothness will probably help it to converge better.) Or for more nasty problems yet, use a tool like GA, which gives even less of a hoot about what you throw at it.
  1 Commento
Maaz Ahmad
Maaz Ahmad il 17 Ott 2022
Understand thanks a lot. Not to disagree but as a side comment, unfortunately sometimes its difficult to comment on the differentiability of complex objectives to be optimized, and there are issues with heuristics solvers in terms of their accuracy and budget.

Accedi per commentare.

Prodotti


Release

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by