why my cdf does not match cumsum(pdf)

x=1D array of 100 discrete values that I calculate Rayleigh pdf and cdf as below
pdfrayl = 2 * x .* exp(-x .^ 2);
cdfrayl = 1 - exp(-x .^ 2);
Now I plot following two lines:
plot(x, cdfrayl)
hold on;
plot(x, cumsum(pdfrayl )/ sum(pdfrayl))
I expected them to match exactly, but they don't. Can anybody please explain why they don't match?

 Risposta accettata

It does not appear to me that you are approximating the Riemann sum of the integral of the PDF correctly here.
x = 0:0.01:10;
y = x/4.*exp(-x.^2/8);
% \Delta x for the Riemann sum
dx = 10/length(x);
yc = cumsum(y).*dx;
yct = 1-exp(-x.^2/8);
plot(x,yc,'r-.','linewidth',2);
hold on;
plot(x,yct,'b');
legend('Approximation of CDF','True CDF', ...
'Location','SouthEast');

Più risposte (1)

Russ Adheaux
Russ Adheaux il 15 Set 2012

0 voti

Thanks. What if my x vector is a random array that is not equally spaced (no single dx),?I guess I will have to sort x and use dx between consecutive x values in cumsum. Correct?

3 Commenti

The dx is coming the length of the interval max(x)-min(x) divided by the number of points. Yes, you would have to sort the x. Do you know the ecdf function?
For unevenly spaced, monotonically-increasing x-data, I suggest trapz or cumtrapz, specifying both x and y data as arguments. It then allows you to integrate with respect to any x-variable spacing, and is more accurate than cumsum.
Hmmm...just learned about ecdf. I was using the following code to get cdf of my x data. Is there an advantage to use ecdf directly? My ultimate goal is to fit a user-defined distribution function to this data.
xbins=[0:dx:5]; [n,xout] = hist(x,xbins); yout=n/sum(n)/dx; cdfy=cumsum(yout*dx);

Accedi per commentare.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by