How to quantify the goodness of a fit?
Sofia Flora
on 20 Jan 2023
Commented: Sofia Flora
on 20 Jan 2023
Hello, I need to quantify how good a fit of a PDF (probability density function) is. I have a data set with its observed PDF and a fitted PDF, and I decided to use the function chi2gof. Since this is my first time using it, I ran a test script first.
I generated Gaussian random variables (one of the simplest PDFs) with mu=3 and computed the observed PDF, so I know for certain what the correct PDF is. The problem is that when I apply chi2gof, it rejects the null hypothesis (which says that the probability distributions are the same)! I don't understand what I'm doing wrong. Here is my test code:
ntot=1000;
x=randn([ntot,1])+3;
[bins,hist]=my_hist(x);
hist=hist';
my_O=ntot*hist;
gauss=exp(-0.5*(bins-3).^2)/sqrt(2*pi);
my_E=ntot*gauss;
figure
plot(bins,hist); hold on
plot(bins,gauss)
[h,p,stats] = chi2gof(bins,'Ctrs',bins,'Expected',my_E,'Frequency',my_O)
function [bins,hist]=my_hist(input)
input=input(isfinite(input));
h=histogram(input,'Normalization','pdf');
hist=h.Values;
b=h.BinEdges;
bins=NaN.*ones(length(b)-1,1);
norm=0;
for k=1:length(bins)
bins(k)=(b(k)+b(k+1))/2;
norm=norm+(b(k+1)-b(k))*hist(k);
end
%disp(norm)
close
end
Any help is greatly appreciated!
Accepted Answer
the cyclist
on 20 Jan 2023
Edited: the cyclist
on 20 Jan 2023
EDIT: My first posting on this was incomplete, so I radically edited it. Sorry for any confusion if you saw the first version.
I noticed that your expected bin totals my_E do not sum to ntot. The reason is that you used gauss, which is a probability density, as if it were the bin probability. To turn a density into a bin probability, you need to multiply it by the bin width.
You made the same mistake in my_O: hist comes from a histogram normalized as a PDF, so it also has to be multiplied by the bin width before being scaled by ntot.
You may also be using bin edges where bin centers are expected, but I did not follow up on this.
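To spell out why the missing bin width leads to a rejection, here is a short worked version of the argument. The symbols are introduced only for illustration and do not appear in the code: N is ntot, c_i are the bin centers, \Delta is the common bin width, f is the true density (gauss), and \hat{f}_i is the PDF-normalized histogram height (hist).

```latex
% Counts are density times bin width times sample size
E_i \approx N \, f(c_i) \, \Delta , \qquad
O_i = N \, \hat{f}_i \, \Delta , \qquad
\sum_i O_i = N , \quad \sum_i E_i \approx N

% Pearson statistic with the correct counts
\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}

% Omitting \Delta replaces O_i, E_i by O_i/\Delta, E_i/\Delta
\chi^2_{\text{no } \Delta}
  = \sum_i \frac{(O_i/\Delta - E_i/\Delta)^2}{E_i/\Delta}
  = \frac{1}{\Delta} \, \chi^2
```

So for bins narrower than 1, leaving out the bin width inflates the raw statistic by a factor of 1/\Delta, which pushes the test toward rejecting a correct model. This covers only the raw Pearson sum; chi2gof may additionally pool bins with small expected counts.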
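rng default                                   % reproducible random numbers
ntot=1000;
x=randn([ntot,1])+3;                          % Gaussian sample with mu=3, sigma=1
[bins,hist]=my_hist(x);                       % bin centers and PDF-normalized heights
hist=hist';
bin_width = bins(2) - bins(1);                % bins are uniformly spaced
my_O=ntot*hist*bin_width;                     % observed counts = N * density * width
gauss=exp(-0.5*(bins-3).^2)/sqrt(2*pi);       % true density at the bin centers
my_E=ntot*gauss*bin_width;                    % expected counts = N * density * width
figure
plot(bins,hist); hold on                      % observed density
plot(bins,gauss)                              % true density
[h,p,stats] = chi2gof(bins,'Ctrs',bins,'Expected',my_E,'Frequency',my_O)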
function [bins,hist]=my_hist(input)
input=input(isfinite(input));
h=histogram(input,'Normalization','pdf');
hist=h.Values;
b=h.BinEdges;
bins=NaN.*ones(length(b)-1,1);
norm=0;
for k=1:length(bins)
bins(k)=(b(k)+b(k+1))/2;
norm=norm+(b(k+1)-b(k))*hist(k);
end
%disp(norm)
close
end
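As a cross-check on the bin-width fix, here is a minimal sketch of an alternative that never goes through a density at all. It is not the code from the answer above: it assumes histcounts and normcdf are available, takes raw counts as the observed frequencies, and takes the expected counts from CDF differences over the bin edges. The names mu, sigma, obsCounts, expCounts, probEdges, and ctrs are made up for this example.

```matlab
rng default                          % reproducible random numbers
ntot  = 1000;
mu    = 3;  sigma = 1;               % parameters of the known true distribution
x     = sigma*randn(ntot,1) + mu;

% Observed frequencies: raw counts per bin, so no bin-width factor is needed
[obsCounts, edges] = histcounts(x);

% Expected frequencies: N times the probability mass of each bin.
% The outer edges are pushed to +/-Inf so the bin probabilities sum to 1
% (assumption: every data point already lies inside the original edges).
probEdges      = edges;
probEdges(1)   = -Inf;
probEdges(end) =  Inf;
expCounts = ntot * diff(normcdf(probEdges, mu, sigma));

% Both totals should equal ntot
fprintf('sum(obs) = %d, sum(exp) = %.4f\n', sum(obsCounts), sum(expCounts));

% Same calling pattern as above; no parameters were estimated from the data
ctrs = edges(1:end-1) + diff(edges)/2;
[h,p,stats] = chi2gof(ctrs, 'Ctrs',ctrs, 'Frequency',obsCounts, ...
                      'Expected',expCounts, 'NParams',0)
```

Using CDF differences rather than density times width avoids the midpoint approximation, which matters most in the tail bins where the density changes quickly across a bin.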