
Why are my GLM beta estimates unbounded and the iteration limit reached?

I'm trying to run a logistic regression on bank default predictors using 'fitglm' and 'stepwiseglm'.
I have ten independent variables and 104,385 data points.
I am encountering 'Iteration limit reached' when running my 'stepwiseglm' model:
Warning: Iteration limit reached.
> In glmfit (line 324)
In GeneralizedLinearModel/fitter (line 568)
In classreg.regr.FitObject/doFit (line 94)
In classreg.regr.TermsRegression>reFit (line 870)
In classreg.regr.TermsRegression/stepwiseFitter (line 330)
In GeneralizedLinearModel.stepwise (line 1011)
In stepwiseglm (line 148)
I believe I have more than enough data.
I have looked at the correlations between the independent variables and removed variables with a correlation above 0.5.
I haven't included a column of ones by accident, since the function already adds the intercept term itself.
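For readers following along, the correlation screen described above could be sketched roughly as follows. The data here are made up, the variable names (`threshold`, `keep`, `Xreduced`) are hypothetical, and the "drop the later column of a correlated pair" rule is only one possible choice, not necessarily what the original script did.

```matlab
% Hypothetical data: 200 observations, 4 predictors.
% Column 4 is deliberately made almost identical to column 1.
rng(0);                                    % fixed seed for reproducibility
X = randn(200, 4);
X(:, 4) = X(:, 1) + 0.01 * randn(200, 1);

R = corrcoef(X);                           % pairwise correlation matrix
threshold = 0.5;                           % cutoff quoted in the post

% Keep a column only if it is not strongly correlated
% with any column that has already been kept.
keep = true(1, size(X, 2));
for j = 2:size(X, 2)
    earlier = find(keep(1:j-1));
    if any(abs(R(j, earlier)) > threshold)
        keep(j) = false;
    end
end

Xreduced = X(:, keep);                     % predictors that passed the screen
disp(keep)
```

Note that a pairwise screen like this only catches correlation between two variables at a time; it says nothing about whether the response is separated by the predictors.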
% This is the model I am using
mdl = fitglm(X,Y,'Distribution','binomial','Link','logit'); % Logistic regression on all variables
tbl = devianceTest(mdl); % Deviance test against the constant-only model
stats = table2array(mdl.Coefficients); % Columns: Estimate, SE, tStat, pValue
What could be causing my data to be perfectly separated, and how can I compensate for this?
My beta estimates are unbounded as a result, for example: Intercept: 6.68013787665097e+15 and X1: -5.70944870698614e+15, etc.
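One quick check for the simplest form of perfect separation is to see whether any single predictor splits the two classes with no overlap. The sketch below uses a made-up toy matrix, and the name `separates` is hypothetical. It only detects separation by one variable at a time, so a negative result does not rule out separation by a combination of variables.

```matlab
% Hypothetical toy data: column 2 separates the classes completely,
% column 1 does not.
X = [0.2 1.0;
     0.9 2.0;
     0.4 3.0;
     0.7 7.0;
     0.1 8.0;
     0.8 9.0];
Y = [0; 0; 0; 1; 1; 1];

separates = false(1, size(X, 2));
for j = 1:size(X, 2)
    x0 = X(Y == 0, j);                     % values for the "0" class
    x1 = X(Y == 1, j);                     % values for the "1" class
    % Complete separation on one variable: the class ranges do not overlap.
    separates(j) = max(x0) < min(x1) || max(x1) < min(x0);
end

disp(separates)
```

Any column flagged here would drive its coefficient toward plus or minus infinity in a maximum-likelihood logistic fit, which is consistent with estimates of the order of 1e+15 and an iteration-limit warning.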
I have attached my .m script and data file.
DISCLAIMER: This is for a university project so suggestions are welcome if nobody wishes to provide definitive answers
Many Thanks,

Answers (2)

Daniel Groves on 27 Jul 2017
I have solved the problem:
1) The mean of the dichotomous variable was not significantly different from 0, i.e. I had about 104,000 data points equal to 0 (no default) but only 255 equal to 1 (default).
2) Taking a sample of the no-default banks fixed the problem.
3) The minimum mean required to avoid the issue was 0.3, i.e. around 1000 data points vs 255 data points.
4) Perfect separation also occurred when I split the data according to size and still included the size variable as an independent variable.
I hope this helps anyone struggling with logistic regressions. More often than not it is the data that has the problem. Testing for outliers and taking samples is my best suggestion.
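A minimal sketch of the sampling fix in point 2, assuming made-up data in place of the bank file: every default is kept and a random subset of the non-defaults is drawn. The names `nonDefaultsPerDefault`, `Xsub` and `Ysub` are hypothetical, and the 4:1 ratio simply mirrors the "around 1000 vs 255" figure above rather than being a recommended value.

```matlab
% Hypothetical stand-in for the bank data: 5000 banks, 50 defaults.
rng(1);                                    % fixed seed so the sample is reproducible
n = 5000;
X = randn(n, 3);                           % three made-up predictors
Y = zeros(n, 1);
Y(1:50) = 1;                               % 1 = default, 0 = no default

idxDefault   = find(Y == 1);
idxNoDefault = find(Y == 0);

% Keep every default plus a random sample of the non-defaults.
nonDefaultsPerDefault = 4;                 % assumed ratio, not a rule
nKeep = nonDefaultsPerDefault * numel(idxDefault);
pick  = idxNoDefault(randperm(numel(idxNoDefault), nKeep));

sel  = [idxDefault; pick];
Xsub = X(sel, :);
Ysub = Y(sel);

fprintf('Mean of Y: full = %.4f, subsample = %.4f\n', mean(Y), mean(Ysub));

% Same call as in the question, now on the subsample.
mdlSub = fitglm(Xsub, Ysub, 'Distribution', 'binomial', 'Link', 'logit');
disp(mdlSub.Coefficients)
```

One caveat worth keeping in mind: sampling on the outcome like this shifts the fitted intercept, so predicted default probabilities from the subsample model are not directly comparable to the default rate in the full data, even if the slope estimates are usable.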

Kaja Horvat on 17 May 2018
Hey! I am currently having the same problem... Did you find a way to use the whole dataset, rather than only a sample of the observations that had a zero? Thank you!
