fminunc stopped because it cannot decrease the objective function along the current search direction.

Question

Ryan Rizzo on 16 Apr 2019

Direct link to this question:

https://it.mathworks.com/matlabcentral/answers/456670-fminunc-stopped-because-it-cannot-decrease-the-objective-function-along-the-current-search-direction

Answered: AOULADHADJ Driss on 18 Oct 2020

I am trying to use `fminunc` to obtain the optimal theta in logistic regression, but I keep getting this message:

fminunc stopped because it cannot decrease the objective function
along the current search direction.

Searching online, I found that this is usually the result of an error in the gradient, which I compute in `logistic_costFunction.m`. I re-checked my work but I cannot spot the root cause.

I am not sure how to solve this issue, any help would be appreciated.

Here is my code:

    clear all; close all; clc;
    %% Plotting data
    x1 = linspace(0,3,50);
    mqtrue = 5;
    cqtrue = 30;
    dat1 = mqtrue*x1+5*randn(1,50);
    
    x2 = linspace(7,10,50);
    dat2 = mqtrue*x2 + (cqtrue + 5*randn(1,50));
    
    x = [x1 x2]'; % X
    
    subplot(2,2,1);
    dat = [dat1 dat2]'; % Y
    
    scatter(x1, dat1); hold on;
    scatter(x2, dat2, '*'); hold on;
    classdata = (dat>40);
    
    %% Compute Cost and Gradient
    
    %  Setup the data matrix appropriately, and add ones for the intercept term
    [m, n] = size(x);
    
    % Add intercept term to x and X_test
    x = [ones(m, 1) x];
    
    % Initialize fitting parameters
    initial_theta = zeros(n + 1, 1);
    
    % Compute and display initial cost and gradient
    [cost, grad] = logistic_costFunction(initial_theta, x, dat);
    
    fprintf('Cost at initial theta (zeros): %f\n', cost);
    fprintf('Gradient at initial theta (zeros): \n');
    fprintf(' %f \n', grad);
    
    %% ============= Part 3: Optimizing using fminunc  =============
    %  In this exercise, you will use a built-in function (fminunc) to find the
    %  optimal parameters theta.
    
    %  Set options for fminunc
    options = optimset('GradObj', 'on', 'MaxIter', 400);
    
    %  Run fminunc to obtain the optimal theta
    %  This function will return theta and the cost 
    [theta, cost] = ...
     	fminunc(@(t)(logistic_costFunction(t, x, dat)), initial_theta, options);
  

logistic_costFunction.m

-----------------------

    function [J, grad] = logistic_costFunction(theta, X, y)
    
    % Initialize some useful values
    m = length(y); % number of training examples
    
    grad = zeros(size(theta));
    
    H = sigmoid(X*theta);
    T = y.*log(H) + (1 - y).*log(1 - H);
    J = -1/m*sum(T);
    for i = 1 : m
    	grad = grad + (H(i) - y(i)) * X(i,:)';
    end
    
    grad = 1/m*grad;
    
    end

sigmoid.m

---------

    function g = sigmoid(z)
    % Computes the sigmoid of z
    
    g = zeros(size(z));
    
    g = 1 ./ (1 + (1 ./ exp(z)));
    
    end


Answer 1

Wasiq Malik on 20 Jul 2019

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/456670-fminunc-stopped-because-it-cannot-decrease-the-objective-function-along-the-current-search-direction#answer_384052

Edited: Wasiq Malik on 20 Jul 2019

I was having the same issue, then I figured out my mistake.

I was using exp(z) in my sigmoid function; it looks like you made the same mistake.

In fact, the sigmoid function is 1/(1+e^-z),

so change your sigmoid function definition to use exp(-z)

and everything should work fine with fminunc:

    function g = sigmoid(z)
    % Computes the sigmoid of z

    g = 1 ./ (1 + exp(-z));

    end
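For comparison, here is a quick numerical check of the two sigmoid forms that appear in this thread. It is only a sketch: the test grid `z` and the names `sig_question` and `sig_answer` are arbitrary choices for illustration, not from the original posts. Since 1/exp(z) equals exp(-z), the expression `1 ./ (1 + (1 ./ exp(z)))` from the question and `1 ./ (1 + exp(-z))` should agree up to rounding, so this change alone may not remove the message for the code in the question.

```matlab
% Two ways of writing the sigmoid, as anonymous functions
sig_question = @(z) 1 ./ (1 + (1 ./ exp(z)));  % form used in the question
sig_answer   = @(z) 1 ./ (1 + exp(-z));        % form suggested in this answer

z = linspace(-30, 30, 13)';                    % arbitrary test grid (assumption)
max_diff = max(abs(sig_question(z) - sig_answer(z)));
fprintf('Largest difference on the grid: %g\n', max_diff);
```

A sigmoid written as `1 ./ (1 + exp(z))`, without the inner reciprocal, would be a genuinely different function and would need the fix described above.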

1 Comment

Raghav Gopal Rao Netrakanti on 8 Nov 2019

Hi,

I have the sigmoid function written correctly, but I still get the same error. :/ Is there anything else I need to change?


Answer 2

Alan Weiss on 16 Apr 2019

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/456670-fminunc-stopped-because-it-cannot-decrease-the-objective-function-along-the-current-search-direction#answer_370927

Without running your example, I wonder if you could make two little changes to see if things are OK:

1. Change the initial point to not be all zeros. Random is OK, but you might want to set the seed first to make things reproducible.
2. Set the CheckGradients option to true (well, since you are using optimset, set the DerivativeCheck option to 'on') to determine if the gradient calculation is OK.
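With `optimset`, the two suggestions above might look like the following. This is a sketch only: the seed value and the scale of the random starting point are arbitrary assumptions, and the commented-out call reuses `logistic_costFunction`, `x`, and `dat` from the question.

```matlab
rng(0);                               % fix the seed (0 is an arbitrary choice)
initial_theta = 0.1 * randn(2, 1);    % small random start instead of zeros

% 'GradObj' tells fminunc that the objective also returns a gradient;
% 'DerivativeCheck' additionally compares that gradient with finite differences.
options = optimset('GradObj', 'on', 'MaxIter', 400, 'DerivativeCheck', 'on');

% With optimoptions, the corresponding option names are
% 'SpecifyObjectiveGradient' and 'CheckGradients'.
% [theta, cost] = fminunc(@(t) logistic_costFunction(t, x, dat), initial_theta, options);
```

Note that `'GradObj'` and `'DerivativeCheck'` are separate options, so turning on the first does not enable the second.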

Alan Weiss

MATLAB mathematical toolbox documentation

1 Comment

Ryan Rizzo on 16 Apr 2019

Edited: Ryan Rizzo on 16 Apr 2019

Thank you for your insight.

I set:

    initial_theta = [0.2;0.2];

and got the same result.

Regarding the second change, I believe the DerivativeCheck option is already set to 'on' in the example code given.

Ryan


Answer 3

Matt J on 16 Apr 2019

Direct link to this answer:

https://it.mathworks.com/matlabcentral/answers/456670-fminunc-stopped-because-it-cannot-decrease-the-objective-function-along-the-current-search-direction#answer_370937

Edited: Matt J on 16 Apr 2019

You will need to use a dedicated function for computing the log-sigmoid. Combining log and sigmoid as separate functions is numerically unstable. This File Exchange (FEX) contribution may be useful as a way of stably computing log(sum(exp(x))):

https://www.mathworks.com/matlabcentral/fileexchange/10209-maxstar
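To illustrate the instability, here is a small sketch comparing `log(sigmoid(z))` with one common stable rearrangement, `min(z,0) - log1p(exp(-abs(z)))`. The names `naive_logsig` and `stable_logsig` and the test values are made up for this illustration, and this is not the code from the linked contribution.

```matlab
sigmoid       = @(z) 1 ./ (1 + exp(-z));
naive_logsig  = @(z) log(sigmoid(z));                   % log and sigmoid composed
stable_logsig = @(z) min(z, 0) - log1p(exp(-abs(z)));   % same quantity, rearranged

z = [-800; -40; 0; 40];
disp([z, naive_logsig(z), stable_logsig(z)]);
% For z = -800, exp(800) overflows to Inf, so the naive version returns -Inf,
% while the rearranged version returns a finite value close to z.
```

Once a `-Inf` appears in the cost, a term such as `0 .* (-Inf)` evaluates to NaN, which is one way the objective can become unusable for the line search.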

1 Comment

Matt J on 16 Apr 2019

Edited: Matt J on 16 Apr 2019

Or, try this. Note that you should be using "classdata" where you are currently using "dat".

    [theta, cost, exitflag, stats, grad] = ...
        fminunc(@(t)(logistic_costFunction(t, x, classdata)), initial_theta, options)

    function [J, grad] = logistic_costFunction(theta, X, y)

        Xt = X*theta;   % linear predictor

        % Cross-entropy via log-sigmoid: log(1 - sigmoid(z)) equals logsigmoid(-z)
        T = y.*logsigmoid(Xt) + (1 - y).*logsigmoid(-Xt);
        J = -mean(T);

        if nargout > 1
            % Vectorized gradient, computed only when requested
            grad = (sigmoid(Xt) - y).'*X;
            grad = grad.'/numel(y);
        end

    end

    function g = sigmoid(z)
    % Computes the sigmoid of z

        g = 1 ./ (1 + exp(-z));

    end

    function y = logsigmoid(z)
    % Computes the log-sigmoid of z in a numerically stable fashion.

        z = -z;
        idx = z <= 33;
        y = z;                          % for z > 33, log(1 + exp(z)) is approximately z
        y(idx) = log1p( exp(z(idx)) );
        y = -y;

    end
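As to why the labels matter, here is a rough sketch. The toy data `X` and `y_raw`, the direction `d`, and the scale factors are invented for illustration, and `logsig` is one possible stable form rather than the function above. With 0/1 labels the logistic cost is never negative, but when raw values larger than 1 are used as `y`, the cost can be pushed down without limit by scaling theta, which may be one reason `fminunc` struggles when `dat` is passed instead of `classdata`.

```matlab
logsig = @(z) min(z, 0) - log1p(exp(-abs(z)));   % stable log-sigmoid (one possible form)
cost   = @(t, X, y) -mean(y .* logsig(X*t) + (1 - y) .* logsig(-X*t));

X     = [1 1; 1 2; 1 8; 1 9];      % toy design matrix with intercept column
y_raw = [5; 12; 70; 75];           % raw measurements, playing the role of "dat"
y_cls = double(y_raw > 40);        % 0/1 labels, playing the role of "classdata"

d      = [0; 1];                   % fixed direction in theta space
scales = [1 10 100];
J_raw  = arrayfun(@(s) cost(s*d, X, y_raw), scales);
J_cls  = arrayfun(@(s) cost(s*d, X, y_cls), scales);

disp(J_raw);   % keeps decreasing as theta is scaled up
disp(J_cls);   % stays nonnegative
```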