Problem with estimating PDF (ksdensity)

Pepe
Pepe on 13 Feb 2024
Edited: Abhaya on 8 Aug 2024
Attached are two sets of data and I need to estimate the Probability density function (PDF) for both of them.
The attached variable detection has 32 elements and a unit of percentages (between 0 and 100 %), and the variable in_process has 96 elements and a unit of number of days (between 0 and 212 days).
I want to estimate the PDF of both variables. For that I am using ksdensity with the 'support' option, because I don't want the values on the x-axis to be negative or over 100%.
Therefore,
for the estimation of PDF of the detection I use the following code:
detection(detection==0)=0.0001; %data must be between the support boundaries
detection(detection==100)=99.9999;
pts=0:0.1:100;
[f,x]=ksdensity(detection,pts,'support',[0,100]);
plot(x,f);
and for the estimation of the PDF of in_process I use similar code:
in_process(in_process==0)=1;
in_process(in_process==212)=211;
pts=0:0.1:212;
[f,x]=ksdensity(in_process,pts,'support',[0 212]);
plot(x,f);
My problem is that the first one looks good (it has a similar shape to the histogram of detection and looks similar to the PDF produced without the 'support' option), while the second one looks bad (it creates artificial bumps at the beginning and at the end of the interval).
I don't understand why this is happening. Why does the first one look good and the second one doesn't?
Is this even a good approach, and does it make sense to estimate the PDF of these variables?
Thank you for your help.

Answers (1)

Abhaya
Abhaya on 8 Aug 2024
Edited: Abhaya on 8 Aug 2024
Hi Pepe,
I understand you want to plot probability density functions for 'detection' and 'in_process'.
The steep curves at the end points of the second curve are a result of the 'BoundaryCorrection' option of ksdensity, whose default value is 'log'. With this method, ksdensity transforms the bounded data to an unbounded scale, estimates the density there, and transforms the result back. The transformation introduces a 1/x term in the kernel density estimator, so the estimate has a peak near x = 0.
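To make the origin of that term concrete, here is a short change-of-variables sketch. It assumes the transformation described in the ksdensity documentation for a two-sided support [L, U]; the symbols f_X, f_Y and y are notation introduced here for illustration, not names used by MATLAB.

```latex
% Log boundary correction for a support [L, U] (here L = 0, U = 212).
% Data are mapped to an unbounded scale:
\[
  y = \log\!\left(\frac{x - L}{U - x}\right),
  \qquad
  \frac{dy}{dx} = \frac{1}{x - L} + \frac{1}{U - x}
                = \frac{U - L}{(x - L)(U - x)} .
\]
% A standard kernel estimate \hat f_Y is computed on the y scale and mapped back:
\[
  \hat f_X(x) = \hat f_Y\bigl(y(x)\bigr)\,\frac{U - L}{(x - L)(U - x)} .
\]
% For a one-sided support ('positive'), y = \log x and the factor reduces to 1/x:
\[
  \hat f_X(x) = \frac{\hat f_Y(\log x)}{x} .
\]
```

The factor multiplying the estimate grows without bound as x approaches L or U, so any estimated mass close to a boundary is amplified, which is consistent with bumps appearing at both ends of the interval.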
However, you can get a smooth curve by setting the 'BoundaryCorrection' value to 'reflection'.
[f,x]=ksdensity(in_process,pts,'support',[0 212],'BoundaryCorrection','reflection');
For further details, please refer to the 'Estimate Density with Boundary Correction' section of the ksdensity documentation.
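To see the difference between the two corrections side by side, a minimal sketch follows. The attached data is not reproduced here, so in_process_demo is a synthetic, hypothetical stand-in (96 values strictly inside the support, concentrated near the lower bound); replace it with the real in_process vector. It requires Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in for in_process (assumption: NOT the attached data).
% rand returns values in the open interval (0,1), so all samples lie
% strictly inside the support (0, 212), as ksdensity requires.
rng(0);                                  % fixed seed for reproducibility
in_process_demo = 212 * rand(96, 1).^2;  % skewed towards the lower bound

pts = 0:0.1:212;

% Default boundary correction: estimate on a log-transformed scale.
fLog = ksdensity(in_process_demo, pts, 'Support', [0 212], ...
    'BoundaryCorrection', 'log');

% Reflection: the data are mirrored at the bounds before estimating.
fRefl = ksdensity(in_process_demo, pts, 'Support', [0 212], ...
    'BoundaryCorrection', 'reflection');

% Overlay both estimates for comparison.
plot(pts, fLog, pts, fRefl);
legend('log (default)', 'reflection');
xlabel('in\_process (days)');
ylabel('Estimated density');
```

With the real data, comparing both curves against a histogram of in_process should show which correction follows the data more plausibly near 0 and 212.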
