How to fit a nonparametric distribution to a sample of known percentile values

Hello everyone
I have a sample of percentile values that describe the distribution of possible earthquake acceleration levels that lead to the failure of a building component. I would like to fit a nonparametric model to these data. I know that, for a random sample of these earthquake acceleration levels, I could fit a nonparametric density using the ksdensity function, but is there a way to do a similar fit for the cumulative distribution function of this distribution?
Many thanks
Example data:
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
  5 Comments
Torsten on 14 Aug 2024
And the method doesn't allow approximating only between acc(1) and acc(9), where 89% of the mass is accumulated?
Jeff Miller on 14 Aug 2024
@Torsten Not completely. The smoothing would spill over at the edges, so for example the pdf at prctile 91 would depend a bit on what you assumed about the top 8%.


Answers (2)

Star Strider on 13 Aug 2024
The empirical cumulative distribution function ecdf would likely be appropriate here. (There is also ecdf, however it seems less applicable to me.) There are a number of associated functions as well, linked to in that documentation page.
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
figure
ecdf(acc, 'Frequency',percentiles)
grid
axis('padded')
[f,x,flo,fup] = ecdf(acc, 'Frequency',percentiles)
f = 10x1
0 0.0067 0.0314 0.0919 0.1659 0.2825 0.4305 0.5987 0.7937 1.0000
x = 10x1
0.3339 0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025
flo = 10x1
NaN 0 0.0152 0.0651 0.1314 0.2407 0.3845 0.5532 0.7562 NaN
fup = 10x1
NaN 0.0143 0.0476 0.1187 0.2004 0.3243 0.4764 0.6441 0.8313 NaN
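As a quick check on what the 'Frequency' option is doing in this call, the f values printed above appear to match the running sum of percentiles divided by its total (446). In other words, the percentiles seem to be treated as observation counts rather than as cumulative probabilities. A minimal sketch of that arithmetic, using only base functions (f_check, f_posted and max_diff are names introduced here; f_posted is the printed output copied by hand):

```matlab
percentiles = [3 11 27 33 52 66 75 87 92];

% Treat the percentiles as counts: cumulative count divided by total count
f_check = [0, cumsum(percentiles) / sum(percentiles)];

% Values of f copied from the ecdf output above (rounded to 4 decimals)
f_posted = [0 0.0067 0.0314 0.0919 0.1659 0.2825 0.4305 0.5987 0.7937 1.0000];

% Largest discrepancy between the two; should be within rounding error
max_diff = max(abs(f_check - f_posted));
disp(max_diff)
```

If the aim is a CDF that passes through the stated percentiles, working with percentiles/100 against acc directly may be closer to what was asked.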
  8 Comments
Torsten on 14 Aug 2024
Edited: Torsten on 14 Aug 2024
So you want a smooth version of
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
plot(acc,percentiles/100)
to get an approximate CDF? Maybe fit a sigmoid function?
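One way to try the sigmoid idea, sketched under the assumption that a two-parameter logistic curve F(x) = 1/(1 + exp(-(x - mu)/s)) is an acceptable shape: on the logit scale this is a straight line in x, so polyfit can estimate it. Note that this is a parametric fit, not the nonparametric one originally asked for, and that mu, s, F_hat and xq are names introduced here for illustration.

```matlab
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
p = percentiles / 100;

% logit(p) = (x - mu)/s is linear in x, so fit a straight line to it
c  = polyfit(acc, log(p ./ (1 - p)), 1);
s  = 1 / c(1);       % scale parameter
mu = -c(2) / c(1);   % location: acceleration at which F = 0.5

% Fitted logistic CDF as a function handle
F_hat = @(x) 1 ./ (1 + exp(-(x - mu) ./ s));

% Evaluate on a grid (the grid limits are an arbitrary choice)
xq = linspace(0.25, 0.70, 200);
Fq = F_hat(xq);
```

Overlaying plot(xq, Fq) on plot(acc, p, 'o') should show how well the logistic shape follows the given percentiles. Unlike an interpolant, this curve is a valid CDF over the whole real line, at the cost of imposing a symmetric shape.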
Star Strider on 14 Aug 2024
The pdf plots might look something like this:
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
[f,x,flo,fup] = ecdf(acc, 'Frequency',percentiles);
dfdx = gradient(f, x);
dpda = gradient(percentiles/100, acc);
figure
stairs(x, dfdx, 'DisplayName','From ‘ecdf’ Results')
hold on
stairs(acc, dpda, 'DisplayName','From Posted Vectors')
hold off
grid
xlabel('$x$', 'Interpreter','LaTeX')
ylabel('$\frac{dF(x)}{dx}$', 'Interpreter','LaTeX', 'FontSize',14)
legend('Location','best')



Image Analyst on 14 Aug 2024
You could fit a spline through them. The spline doesn't take any parameters; it just fits a cubic polynomial between each pair of points. See attached demo.
  1 Comment
Xavier on 14 Aug 2024
Thanks for this idea, but fitting a spline does not ensure that the fitted function will comply with the necessary conditions for being a CDF.
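One possible way around this, sketched here as an assumption rather than something proposed in the thread, is a shape-preserving interpolant such as pchip in place of an ordinary cubic spline. Because pchip preserves the monotonicity of the data, and both acc and percentiles are increasing, the interpolated curve should be non-decreasing between acc(1) and acc(end). It says nothing about the tails below the 3rd or above the 92nd percentile, which would still need a separate assumption. The names xq, F, pdf_est and is_monotone are introduced here for illustration.

```matlab
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
p = percentiles / 100;

% Evaluate only inside the range covered by the data (no extrapolation)
xq = linspace(acc(1), acc(end), 200);

% Shape-preserving piecewise cubic interpolation of the CDF values
F = pchip(acc, p, xq);

% Rough density estimate: numerical derivative of the interpolated CDF
pdf_est = gradient(F, xq);

% Monotonicity check: true if F never decreases on the grid
is_monotone = all(diff(F) >= 0);
disp(is_monotone)
```

The resulting curve passes through every given (acc, percentile) pair, so it is an interpolant rather than a smoother; any noise in the percentile values is reproduced in pdf_est.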

