Gaussian mixture model sometimes seems to fit very badly
In the following code, I fit a Gaussian mixture model (GMM) to some randomly sampled data. I do this twice. Each time, the data represent two well-separated Gaussians; the only difference between the two runs is the seed I use for the random number generator.
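N = 100000;          % samples drawn from each Gaussian
EFFECT_SIZE = 5;     % distance between the two means, in standard deviations
seedList = [1 6];    % seed 1 gives a good fit, seed 6 a bad one
for s = seedList
    rng(s)           % seeds the data and fitgmdist's random initialization
    % Two unit-variance Gaussians, centered at 0 and at EFFECT_SIZE
    X = [randn(N,1); randn(N,1)+EFFECT_SIZE];
    figure
    hist(X,101)
    % No semicolon, so the fitted parameters are printed
    GMModel = fitgmdist(X,2)
end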
If you run that code (you will need the Statistics Toolbox), you will see that the first distribution is fit very well and the second one terribly. I am trying to understand why. I would expect such well-separated peaks to be fit well essentially every time.
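One way to dig into this, as a rough sketch only: look at the convergence information stored on the fitted object, then refit with a larger iteration limit and several restarts. This assumes the bad fit comes from EM stopping early or settling in a poor local optimum, which nothing in the post establishes. The property names (`Converged`, `NumIterations`, `NegativeLogLikelihood`) and the `'Replicates'` and `'Options'` arguments are the ones documented for `fitgmdist`; the values 1000 and 5 are arbitrary choices for illustration.

```matlab
% Sketch: inspect convergence for the seed that gave the bad fit,
% then refit with more iterations and several restarts.
rng(6)
X = [randn(100000,1); randn(100000,1)+5];

% Default settings, as in the original code
GM1 = fitgmdist(X,2);
fprintf('Default: converged=%d, iterations=%d, NLL=%.1f\n', ...
    GM1.Converged, GM1.NumIterations, GM1.NegativeLogLikelihood);

% Assumed remedy: raise the iteration cap and keep the best of 5 starts
opts = statset('MaxIter',1000);          % 1000 is an arbitrary choice
GM2 = fitgmdist(X,2,'Options',opts,'Replicates',5);
fprintf('Refit:   converged=%d, iterations=%d, NLL=%.1f\n', ...
    GM2.Converged, GM2.NumIterations, GM2.NegativeLogLikelihood);

% Compare the fitted means (true values are 0 and 5)
disp(sort(GM1.mu)')
disp(sort(GM2.mu)')
```

If the default fit reports that it did not converge, or its negative log-likelihood is clearly higher than the refit's, that would point toward the optimizer rather than the data.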
This is not a fluke. I ran 1,000 different seeds and got the bad fit about 18% of the time. Also, the bad fits tend to cluster relatively close to the same parameter values.
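For anyone who wants to reproduce the seed sweep, here is a minimal sketch. The number of seeds (`nSeeds`) and the tolerance used to flag a fit as "bad" (`tol`) are hypothetical choices for illustration, not the exact procedure used for the 18% figure.

```matlab
% Sketch: count how often the fitted means miss the true means [0, EFFECT_SIZE].
N = 100000;
EFFECT_SIZE = 5;
nSeeds = 20;      % hypothetical; the post used 1,000 seeds
tol = 0.5;        % hypothetical threshold for calling a fit "bad"

isBad = false(nSeeds,1);
fittedMu = zeros(nSeeds,2);
for s = 1:nSeeds
    rng(s)
    X = [randn(N,1); randn(N,1)+EFFECT_SIZE];
    GMModel = fitgmdist(X,2);
    % mu is 2-by-1 for 1-D data; sort so column 1 is the lower mean
    fittedMu(s,:) = sort(GMModel.mu)';
    isBad(s) = any(abs(fittedMu(s,:) - [0 EFFECT_SIZE]) > tol);
end

fprintf('Bad fits: %d of %d\n', nnz(isBad), nSeeds);
disp(fittedMu(isBad,:))   % fitted means of the bad fits, to see how they cluster
```

Printing the means of the flagged fits makes it easy to check whether they land near the same values from seed to seed.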
Any thoughts? I am a novice at using GMM, so maybe I am just naive about how well this should do.
I am running R2014b on Mac OS X Yosemite.