Dealing with the glmfit Warning: The estimated coefficients perfectly separate failures from successes. This means the theoretical best estimates are not finite.
My data has 22 variables and 76 observations, 44 of which are "positive", 32 "negative". I'm interested in computing 95% confidence intervals of the logistic regression model coefficients. However, running
fitglm(data, 'Distribution', 'binomial');
throws the following warning:
Warning: Iteration limit reached.
> In glmfit (line 340)
In GeneralizedLinearModel/fitter (line 659)
In classreg.regr/FitObject/doFit (line 94)
In GeneralizedLinearModel.fit (line 973)
In fitglm (line 146)
In logRegOTRKO (line 2)
Warning: The estimated coefficients perfectly separate failures from successes. This means the theoretical best estimates are not finite. For the fitted
linear combination XB of the predictors, the sample proportions P of Y=N in the data satisfy:
XB<1.12299: P=0
XB>1.12299: P=1
> In glmfit>diagnoseSeparation (line 560)
In glmfit (line 346)
In GeneralizedLinearModel/fitter (line 659)
In classreg.regr/FitObject/doFit (line 94)
In GeneralizedLinearModel.fit (line 973)
In fitglm (line 146)
In logRegOTRKO (line 2)
and the resulting coefficients have SEs that are about an order of magnitude larger than the coefficients themselves, and p-values close to one, although I know that many of the independent variables differ significantly between the positive and negative classes.
7 Comments
Walter Roberson
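on 5 Apr 2022
The first thing I would try is raising the iteration limit through an options structure:
% statset builds the options structure that fitglm documents for 'Options'
opt = statset('MaxIter', 1000);
mdl = fitglm(data, 'Distribution', 'binomial', 'Options', opt);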
Jeff Miller
on 6 Apr 2022
One aspect of the problem is probably that you have relatively many predictors (22) for the number of cases (76). You usually need more like 10 or 20 times as many cases as variables.
This page gives a description of the basic problem, which is that the logistic regression model parameter estimates (along with their SEs) drift towards infinity when the outcome variable can be predicted perfectly (which is more likely to happen when the ratio of cases to predictors is low). You could try excluding predictors or using something like PCA to condense the predictor set, but it is also possible that you just can't answer the questions you are interested in without a much larger sample.
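A minimal sketch of the PCA route mentioned above, assuming the predictors sit in a numeric matrix and the outcome in a 0/1 vector. The names X, y, and k, the synthetic data, and the choice of three components are all placeholders for illustration, not values from the original question. It requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in data: 76 observations, 22 predictors, noisy 0/1 outcome.
rng(0);                                   % make the synthetic data reproducible
X = randn(76, 22);                        % placeholder predictor matrix
y = double(X(:, 1) + randn(76, 1) > 0);   % placeholder response with noise

% Standardize the predictors, then project them onto principal components.
[~, score, ~, ~, explained] = pca(zscore(X));

% Keep only the first k components (k = 3 is an assumption; tune it).
k = 3;
mdl = fitglm(score(:, 1:k), y, 'Distribution', 'binomial');

% 95% confidence intervals: one row for the intercept plus one per component.
ci = coefCI(mdl);
disp(ci);
```

Note that the intervals then describe the coefficients of the components, not of the original 22 variables, so this only helps if conclusions about the condensed predictors are acceptable.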
asaf benjamin
on 6 Apr 2022
Edited: asaf benjamin
on 6 Apr 2022
Walter Roberson
on 6 Apr 2022
Did increasing MaxIter not resolve the warning about the iteration limit being reached? Did you try something like an iteration limit of 1e6, just to see what would happen?
asaf benjamin
on 6 Apr 2022
Daniel K
on 18 Jul 2023
I'm getting the same error right now, but I don't really understand what the warning
"Warning: The estimated coefficients perfectly separate failures from successes."
means. Is there a more understandable explanation anywhere?
Walter Roberson
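on 18 Jul 2023
The message about perfect separation means that some linear combination of the predictors classifies every observation correctly: every case below a threshold is a failure and every case above it is a success. That is what the XB<1.12299: P=0 and XB>1.12299: P=1 lines in the original warning report. When you have a relatively high number of predictors compared to the sample size, it becomes more likely that a simple model can exactly predict the data.
Now suppose you had a goodness measure that involved dividing by the number of values not exactly predicted, but the number not exactly fit by the model was 0. You would then be calculating something divided by 0, which would not give a finite result. Logistic regression behaves in a similar way: once the data are perfectly separated, making the coefficients larger always increases the likelihood, so the best estimates are never reached at finite values.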
You probably either need a lot more data, or else need a simpler model (fewer predictors) so that the predictions are no longer exact.
... but from time to time the implied meaning is that your system is so predictable that you do not need to use those kinds of tools. Or it might mean that you did not stress-test the system enough and it is well behaved only in the parts you tested.
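To make "the best estimates are not finite" concrete, here is a small sketch on made-up data. The six points, the no-intercept model, and the names x, y, logLik, slopes, and ll are all invented for illustration. It evaluates the logistic log-likelihood for a few slopes and shows that it keeps increasing toward 0 as the slope grows, so no finite slope is the maximizer.

```matlab
% Toy data: y is 0 whenever x < 0 and 1 whenever x > 0 (perfectly separated).
x = [-3; -2; -1; 1; 2; 3];
y = [0; 0; 0; 1; 1; 1];

% Log-likelihood of a no-intercept logistic model with slope b, using
% log(p) = -log(1 + exp(-b*x)) and log(1 - p) = -log(1 + exp(b*x)).
logLik = @(b) sum(-y .* log1p(exp(-b * x)) - (1 - y) .* log1p(exp(b * x)));

% Evaluate the log-likelihood at increasingly large slopes.
slopes = [1 5 10 20];
ll = arrayfun(logLik, slopes);

% First row: slopes; second row: log-likelihoods, rising toward 0.
disp([slopes; ll]);
```

An iterative fitter follows this ever-rising likelihood, which is consistent with the warning about reaching the iteration limit and with the very large standard errors reported in the question.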
Answers (1)
asaf benjamin
on 6 Apr 2022
Edited: asaf benjamin
on 6 Apr 2022