fminunc stopped because it cannot decrease the objective function along the current search direction.

I am trying to use `fminunc` to obtain the optimal theta in logistic regression, but I keep getting this message:
fminunc stopped because it cannot decrease the objective function
along the current search direction.
Searching online, I found that this is usually the result of an error in the gradient, which I compute in `logistic_costFunction.m`. I re-checked my work but cannot spot the root cause.
I am not sure how to solve this issue; any help would be appreciated.
Here is my code:
clear all; close all; clc;
%% Plotting data
x1 = linspace(0,3,50);
mqtrue = 5;
cqtrue = 30;
dat1 = mqtrue*x1+5*randn(1,50);
x2 = linspace(7,10,50);
dat2 = mqtrue*x2 + (cqtrue + 5*randn(1,50));
x = [x1 x2]'; % X
subplot(2,2,1);
dat = [dat1 dat2]'; % Y
scatter(x1, dat1); hold on;
scatter(x2, dat2, '*'); hold on;
classdata = (dat>40);
%% Compute Cost and Gradient
% Setup the data matrix appropriately, and add ones for the intercept term
[m, n] = size(x);
% Add intercept term to x and X_test
x = [ones(m, 1) x];
% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);
% Compute and display initial cost and gradient
[cost, grad] = logistic_costFunction(initial_theta, x, dat);
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
%% ============= Part 3: Optimizing using fminunc =============
% In this exercise, you will use a built-in function (fminunc) to find the
% optimal parameters theta.
% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
fminunc(@(t)(logistic_costFunction(t, x, dat)), initial_theta, options);
logistic_costFunction.m
-----------------------
function [J, grad] = logistic_costFunction(theta, X, y)
% Initialize some useful values
m = length(y); % number of training examples
grad = zeros(size(theta));
H = sigmoid(X*theta);
T = y.*log(H) + (1 - y).*log(1 - H);
J = -1/m*sum(T);
for i = 1 : m
grad = grad + (H(i) - y(i)) * X(i,:)';
end
grad = 1/m*grad;
end
sigmoid.m
function g = sigmoid(z)
% Computes the sigmoid of z
g = zeros(size(z));
g = 1 ./ (1 + (1 ./ exp(z)));
end

Answers (4)

Wasiq Malik on 20 Jul 2019
Edited: Wasiq Malik on 20 Jul 2019
I was having the same issue, then I found my mistake: I was using exp(z) in my sigmoid function. It looks like you made the same mistake.
In fact, the sigmoid function is 1/(1+e^-z), so change your sigmoid function definition to use exp(-z), and everything should work fine with fminunc:
function g = sigmoid(z)
% Computes the sigmoid of z
g = 1 ./ (1 + exp(-z));
end
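As a quick sanity check on the sigmoid itself, the sketch below evaluates both ways of writing it at a few points. The test points are arbitrary and chosen only for illustration. Since 1/exp(z) equals exp(-z) algebraically, the two forms should agree to rounding error; the exp(-z) form is the more direct way to write it.

```matlab
% Two ways of writing the logistic sigmoid (algebraically identical).
sigmoid_a = @(z) 1 ./ (1 + (1 ./ exp(z)));  % form used in the question
sigmoid_b = @(z) 1 ./ (1 + exp(-z));        % form suggested in this answer

z = [-800 -30 -1 0 1 30 800];               % illustrative test points (assumed)
max_diff = max(abs(sigmoid_a(z) - sigmoid_b(z)));
fprintf('Largest difference between the two forms: %g\n', max_diff);
```

If the printed difference is at rounding level, the sigmoid definition is unlikely to be what stops fminunc, and the labels and the log terms in the cost are worth checking next.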

Alan Weiss on 16 Apr 2019
Without running your example, I wonder if you could make two little changes to see if things are OK:
  1. Change the initial point to not be all zeros. Random is OK, but you might want to set the seed first to make things reproducible.
  2. Set the CheckGradients option to true (well, since you are using optimset, set the DerivativeCheck option to 'on') to determine if the gradient calculation is OK.
Alan Weiss
MATLAB mathematical toolbox documentation
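The two suggestions above could be set up roughly as follows. This is a minimal sketch, assuming the Optimization Toolbox is installed (so that optimset accepts 'DerivativeCheck'); the seed value and the 0.1 scale of the random start are arbitrary choices, not part of the original advice.

```matlab
rng(0);                                     % fix the seed so runs are reproducible (seed is arbitrary)
n_params = 2;                               % intercept + one feature, as in the question
initial_theta = 0.1 * randn(n_params, 1);   % small random start (scale is an assumption)

% Legacy optimset names: 'GradObj' tells the solver a gradient is supplied,
% 'DerivativeCheck' asks it to compare that gradient with finite differences.
options = optimset('GradObj', 'on', 'DerivativeCheck', 'on', 'MaxIter', 400);
```

These `initial_theta` and `options` variables can then be passed to the existing fminunc call unchanged. Note that 'GradObj' and 'DerivativeCheck' are separate options, so turning on the first does not enable the second.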
1 Comment
Ryan Rizzo on 16 Apr 2019
Edited: Ryan Rizzo on 16 Apr 2019
Thank you for your insight.
I set:
initial_theta = [0.2;0.2];
and got the same result.
Regarding the second change, I believe the DerivativeCheck option is already set to 'on' in the example code given.
Ryan


Matt J on 16 Apr 2019
Edited: Matt J on 16 Apr 2019
You will need to use a dedicated function for computing the log-sigmoid. Combining log and sigmoid as separate functions is numerically unstable. This FEX contribution may be useful as a way of stably computing log(sum(exp(x))).
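To illustrate the instability, here is a small sketch comparing the naive composition with one common stable rewrite. The helper names `naive_logsig` and `stable_logsig` and the test inputs are made up for this example; the stable form relies on the identity log(sigmoid(z)) = min(z,0) - log1p(exp(-|z|)).

```matlab
% Naive composition: log applied to a separately computed sigmoid.
naive_logsig  = @(z) log(1 ./ (1 + exp(-z)));
% Stable rewrite: only exponentiates non-positive numbers, so exp never overflows.
stable_logsig = @(z) min(z, 0) - log1p(exp(-abs(z)));

z = [-800 -40 0 40];                 % illustrative inputs (assumed)
disp(naive_logsig(z));               % first entry becomes -Inf
disp(stable_logsig(z));              % first entry stays finite (-800)
```

Once a single -Inf enters the sum in the cost function, the objective is no longer finite and any line search on it is unreliable.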
1 Comment
Matt J on 16 Apr 2019
Edited: Matt J on 16 Apr 2019
Or, try this. Note that you should be using "classdata" where you are currently using "dat".
[theta, cost, exitflag,stats,grad] = ...
fminunc(@(t)(logistic_costFunction(t, x, classdata)), initial_theta, options)
function [J, grad] = logistic_costFunction(theta, X, y)
Xt = (X*theta);
T = y.*logsigmoid(Xt) + (1 - y).*logsigmoid(-Xt);
J = -mean(T);
if nargout>1
grad=(sigmoid(Xt)-y).'*X;
grad=grad.'/numel(y);
end
end
function g = sigmoid(z)
% Computes the sigmoid of z
g = 1 ./ (1 + exp(-z));
end
function y = logsigmoid(z)
% Computes the log-sigmoid of z in a numerically stable fashion.
z=-z;
idx=z<=33;
y=z;
y(idx)=log1p( exp(z(idx)) );
y=-y;
end
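The note about "classdata" matters because the cross-entropy cost assumes labels of exactly 0 or 1. With raw values well above 1, the term (1 - y).*log(1 - H) turns positive and grows without bound as H approaches 1, so the cost has no finite minimum. That is one plausible reason the solver cannot make progress. A minimal sketch of a label check, using made-up stand-in data rather than the question's exact arrays:

```matlab
rng(0);                                        % reproducible stand-in data (assumed)
dat = [5*randn(50,1); 80 + 5*randn(50,1)];     % hypothetical substitute for the question's 'dat'
classdata = double(dat > 40);                  % binary 0/1 labels for logistic regression

% Logistic regression expects every label to be exactly 0 or 1.
labels_ok_raw   = all(dat == 0 | dat == 1);
labels_ok_class = all(classdata == 0 | classdata == 1);
fprintf('raw dat binary: %d, classdata binary: %d\n', labels_ok_raw, labels_ok_class);
```

A check like this placed just before the fminunc call would flag the wrong variable being passed as y.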


AOULADHADJ Driss on 18 Oct 2020
Use this instead of the last line of the code, the one that contains the fminunc call:
fminunc(@(t)(logistic_costFunction(t, x, dat)), initial_theta, options)
PS: without the trailing semicolon.
