fminunc stopped because it cannot decrease the objective function along the current search direction.
I am trying to use `fminunc` to obtain the optimal theta in logistic regression, but I keep getting this message:
fminunc stopped because it cannot decrease the objective function
along the current search direction.
Searching online, I found that this is usually caused by an error in the gradient, which I compute in `logistic_costFunction.m`. I rechecked my work but cannot spot the root cause.
I am not sure how to solve this issue. Any help would be appreciated.
Here is my code:
clear all; close all; clc;
%% Plotting data
x1 = linspace(0,3,50);
mqtrue = 5;
cqtrue = 30;
dat1 = mqtrue*x1+5*randn(1,50);
x2 = linspace(7,10,50);
dat2 = mqtrue*x2 + (cqtrue + 5*randn(1,50));
x = [x1 x2]'; % X
subplot(2,2,1);
dat = [dat1 dat2]'; % Y
scatter(x1, dat1); hold on;
scatter(x2, dat2, '*'); hold on;
classdata = (dat>40);
%% Compute Cost and Gradient
% Setup the data matrix appropriately, and add ones for the intercept term
[m, n] = size(x);
% Add intercept term to x and X_test
x = [ones(m, 1) x];
% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);
% Compute and display initial cost and gradient
[cost, grad] = logistic_costFunction(initial_theta, x, dat);
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
%% ============= Part 3: Optimizing using fminunc =============
% In this exercise, you will use a built-in function (fminunc) to find the
% optimal parameters theta.
% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
fminunc(@(t)(logistic_costFunction(t, x, dat)), initial_theta, options);
logistic_costFunction.m
-----------------------
function [J, grad] = logistic_costFunction(theta, X, y)
% Initialize some useful values
m = length(y); % number of training examples
grad = zeros(size(theta));
H = sigmoid(X*theta);
T = y.*log(H) + (1 - y).*log(1 - H);
J = -1/m*sum(T);
for i = 1 : m
grad = grad + (H(i) - y(i)) * X(i,:)';
end
grad = 1/m*grad;
end
sigmoid.m
function g = sigmoid(z)
% Computes the sigmoid of z
g = zeros(size(z));
g = 1 ./ (1 + (1 ./ exp(z)));
end
Answers (4)
Wasiq Malik
on 20 Jul 2019
Edited: Wasiq Malik
on 20 Jul 2019
I was having the same issue, and then I figured out my mistake.
I was using exp(z) in my sigmoid function; it looks like you made the same mistake.
In fact, the sigmoid function is 1/(1+e^-z),
so change your sigmoid function definition to use exp(-z),
and everything should work fine with fminunc.
function g = sigmoid(z)
% Computes the sigmoid of z
g = 1 ./ (1 + exp(-z));
end
1 Comment
Raghav Gopal Rao Netrakanti
on 8 Nov 2019
Hi,
I have the sigmoid function written correctly, but I still get the same error. :/ Is there anything else I need to change?
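For what it is worth, `1 ./ (1 + (1 ./ exp(z)))` from the question and `1 ./ (1 + exp(-z))` are algebraically the same function, since 1/exp(z) = exp(-z). A quick sketch to compare them numerically, assuming a made-up grid `z` (the variable names are illustrative only):

```matlab
% Made-up grid of inputs, for illustration only
z = linspace(-50, 50, 1001)';

gQuestion = 1 ./ (1 + (1 ./ exp(z)));   % form used in the question
gAnswer   = 1 ./ (1 + exp(-z));         % form suggested in the answer

% Largest absolute difference between the two forms over the grid
maxDiff = max(abs(gQuestion - gAnswer));
fprintf('Largest difference between the two forms: %g\n', maxDiff);
```

If the difference is at rounding level, the form of the sigmoid is unlikely to be the cause of the message, and the other suggestions in this thread (the label vector, a gradient check, a stable log-sigmoid) are probably more worth trying.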
Alan Weiss
on 16 Apr 2019
Without running your example, I wonder if you could make two little changes to see if things are OK:
- Change the initial point to not be all zeros. Random is OK, but you might want to set the seed first to make things reproducible.
- Set the CheckGradients option to true (well, since you are using optimset, set the DerivativeCheck option to 'on') to determine if the gradient calculation is OK.
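Both suggestions can be tried in isolation. The sketch below is a minimal illustration, not the code from the question: the data `X` and `y`, the anonymous functions `costFn` and `gradFn`, and the step size `h` are all made up for this example. It seeds the generator, starts from a small random point, and compares the analytic gradient with a central-difference estimate, which is roughly the kind of comparison the DerivativeCheck option performs inside fminunc.

```matlab
rng(0);                                   % fix the seed so the run is reproducible

% Made-up data for illustration: intercept column plus one feature, 0/1 labels
X = [ones(20, 1) randn(20, 1)];
y = double(rand(20, 1) > 0.5);

sigmoid = @(z) 1 ./ (1 + exp(-z));
costFn  = @(t) -mean(y .* log(sigmoid(X*t)) + (1 - y) .* log(1 - sigmoid(X*t)));
gradFn  = @(t) X' * (sigmoid(X*t) - y) / numel(y);

theta0 = 0.1 * randn(2, 1);               % random, non-zero starting point

% Central-difference estimate of the gradient at theta0
h = 1e-6;
numGrad = zeros(size(theta0));
for k = 1:numel(theta0)
    e = zeros(size(theta0));
    e(k) = h;
    numGrad(k) = (costFn(theta0 + e) - costFn(theta0 - e)) / (2*h);
end

relErr = norm(numGrad - gradFn(theta0)) / norm(gradFn(theta0));
fprintf('Relative gradient error: %g\n', relErr);

% With optimset, the built-in check is requested like this
options = optimset('GradObj', 'on', 'DerivativeCheck', 'on', 'MaxIter', 400);
```

A relative error that is many orders of magnitude larger than the finite-difference step would point to a mistake in the analytic gradient; a tiny one suggests the problem lies elsewhere.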
Alan Weiss
MATLAB mathematical toolbox documentation
Matt J
on 16 Apr 2019
Edited: Matt J
on 16 Apr 2019
You will need to use a dedicated function for computing the log-sigmoid. Combining log and sigmoid as separate functions is numerically unstable. This FEX contribution may be useful as a way of stably computing log(sum(exp(x))).
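To make the instability concrete, here is a minimal sketch (not the FEX submission, and with made-up inputs `z`) that compares log applied to the sigmoid against a log-sigmoid computed with log1p. The identity used, log(sigmoid(z)) = min(z,0) - log1p(exp(-abs(z))), is one common stable formulation; the name `stableLogSig` is hypothetical.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% One stable formulation of log(sigmoid(z)); it never takes log of 0
stableLogSig = @(z) min(z, 0) - log1p(exp(-abs(z)));

z = [-800; -40; 0; 40; 800];              % made-up scores, including extreme ones

naiveLogH    = log(sigmoid(z));           % sigmoid underflows to 0 at z = -800
naiveLog1mH  = log(1 - sigmoid(z));       % sigmoid rounds to 1 at z = 40 and 800
stableLogH   = stableLogSig(z);
stableLog1mH = stableLogSig(-z);          % log(1 - sigmoid(z)) = log(sigmoid(-z))

% Columns: z, naive log(H), stable log(H), naive log(1-H), stable log(1-H)
disp([z naiveLogH stableLogH naiveLog1mH stableLog1mH]);
```

In a cost of the form y.*log(H) + (1 - y).*log(1 - H), such an -Inf is then multiplied by y or 1 - y, and whenever that factor is 0 the product 0 * (-Inf) is NaN, which is enough to break a line search.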
1 Comment
Matt J
on 16 Apr 2019
Edited: Matt J
on 16 Apr 2019
Or, try this. Note that you should be using "classdata" where you are currently using "dat".
[theta, cost, exitflag,stats,grad] = ...
fminunc(@(t)(logistic_costFunction(t, x, classdata)), initial_theta, options)
function [J, grad] = logistic_costFunction(theta, X, y)
Xt = (X*theta);
T = y.*logsigmoid(Xt) + (1 - y).*logsigmoid(-Xt);
J = -mean(T);
if nargout>1
grad=(sigmoid(Xt)-y).'*X;
grad=grad.'/numel(y);
end
end
function g = sigmoid(z)
% Computes the sigmoid of z
g = 1 ./ (1 + exp(-z));
end
function y = logsigmoid(z)
% Computes the log-sigmoid of z in a numerically stable fashion.
z=-z;
idx=z<=33;
y=z;
y(idx)=log1p( exp(z(idx)) );
y=-y;
end
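The switch from `dat` to `classdata` matters because the cross-entropy cost assumes labels of 0 or 1. A small sketch of what can go wrong otherwise, using a single made-up sample and a stable log-sigmoid written as an anonymous function (`stableLogSig` and `sampleCost` are hypothetical names, and the values are illustrative): with a label such as 50, the factor (1 - y) is negative, so the cost keeps decreasing as X*theta grows and there is no finite minimum for the optimizer to settle on.

```matlab
% Stable log(sigmoid(z)), written as an anonymous function for this sketch
stableLogSig = @(z) min(z, 0) - log1p(exp(-abs(z)));

% Cross-entropy cost of a single sample with score z = X*theta and label y
sampleCost = @(z, y) -(y .* stableLogSig(z) + (1 - y) .* stableLogSig(-z));

z = [1 10 100 1000];            % increasingly large values of X*theta

Jraw = sampleCost(z, 50);       % raw response value used as a label (like dat)
Jbin = sampleCost(z, 1);        % proper 0/1 class label (like classdata)

% Rows: z, cost with raw label, cost with binary label
disp([z; Jraw; Jbin]);
```

In this sketch the cost with the raw label falls without bound, while the cost with the binary label stays non-negative and approaches 0.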
AOULADHADJ Driss
on 18 Oct 2020
Use this in place of the last line of the code, the one that contains the fminunc call:
fminunc(@(t)(logistic_costFunction(t, x, dat)), initial_theta, options)
PS: without the trailing semicolon.