How to fit this model with a Weibull distribution?

Hi. I'm trying to fit my empirical curve with a theoretical Weibull distribution through curve fitting (cftool). The problem is that I get a negative value and I can't understand why.
I tried adding a +c constant to the Weibull function, but it only got worse:
I also tried adjusting the variables' StartingPoint values but nothing changed. I am pretty sure this has to work somehow. Thank you in advance.

2 Commenti

Torsten
Torsten il 27 Set 2024
Modificato: Torsten il 27 Set 2024
Do you have the original data from which you assume that they follow a Weibull distribution ?
If yes: Apply "mle":
Hi. I do not claim my data follows a Weibull distribution. However, I at least would like to to see a valid (i.e. non-negative) R-squared value since I need it.

Accedi per commentare.

 Risposta accettata

I normalized the x-axis data to facilitate the fitting algorithm's search process. Is this solution acceptable?
%% data
X = load("x_axis.mat");
Y = load("y_axis.mat");
%% process a bit
t = X.t;
tm = max(X.t) % max t for normalization; makes the fitting easier
tm = 299578
y = Y.M_empRel;
% Fitting model
fo = fitoptions('Method', 'NonlinearLeastSquares',...
'Lower', [0.02, 0.5],...
'Upper', [0.03, 0.6]);
ft = fittype('exp(-((t/299578)/b)^c)', ...
'dependent', {'prob'}, 'independent', {'t'}, ...
'coefficients', {'b', 'c'}, ...
'options', fo);
%% Fit curve to data
[yfit, gof] = fit(t, y, ft, 'StartPoint', [0.025, 0.55])
yfit =
General model: yfit(t) = exp(-((t/299578)/b)^c) Coefficients (with 95% confidence bounds): b = 0.02498 (0.02459, 0.02538) c = 0.5535 (0.5464, 0.5606)
gof = struct with fields:
sse: 0.3121 rsquare: 0.9915 dfe: 471 adjrsquare: 0.9915 rmse: 0.0257
%% Plot results
plot(yfit, t, y), grid on
xlabel('t'), ylabel('y(t)')
title({'$y(t) = \exp\left(- \left(\frac{t}{299578 b}\right)^{c}\right)$, with $b = 0.02498$ and $c = 0.5535$'}, 'interpreter', 'latex', 'fontsize', 16)
legend('Data', 'Fitted model', 'fontsize', 14)

4 Commenti

Michele
Michele il 28 Set 2024
Modificato: Michele il 28 Set 2024
Hi, thank you for the detailed answer! I have a couple of questions: where did you get those StartingPoint, Lower and Upper values from? And also, why did the normalization made it work? Thank you!
Normalized data is effective in this case because the algorithm does not need to "work so hard" to find the parameters that fit the data from to .
When I initially fitted the normalized model to the data, I did not supply any lower and upper bounds to see if the algorithm could find the best-fitting parameters. Since the normalized data is now contained within the "box" , I typically guessed the starting point parameter values to be close to 0, 1, or the midpoint between 0 and 1. The lower and upper values in my response were provided after I successfully fitted the model, in order to demonstrate that for proper fitting, you should supply the lower and upper bounds.
Please see the example below (my initial fitting). Let us know if you need to investigate the curve-fitting problem further.
%% data
X = load("x_axis.mat");
Y = load("y_axis.mat");
%% process a bit
t = X.t;
tm = max(X.t) % max t for normalization; makes the fitting easier
tm = 299578
y = Y.M_empRel;
% Fitting model
fo = fitoptions('Method', 'NonlinearLeastSquares');
ft = fittype('exp(-((t/299578)/b)^c)', ...
'dependent', {'prob'}, 'independent', {'t'}, ...
'coefficients', {'b', 'c'}, ...
'options', fo);
%% Fit curve to data
[yfit, gof] = fit(t, y, ft, 'StartPoint', [0.01, 1])
yfit =
General model: yfit(t) = exp(-((t/299578)/b)^c) Coefficients (with 95% confidence bounds): b = 0.02498 (0.02459, 0.02538) c = 0.5535 (0.5464, 0.5606)
gof = struct with fields:
sse: 0.3121 rsquare: 0.9915 dfe: 471 adjrsquare: 0.9915 rmse: 0.0257
%% Plot results
plot(yfit, t, y), grid on
xlabel('t'), ylabel('y(t)')
title({'$y(t) = \exp\left(- \left(\frac{t}{299578 b}\right)^{c}\right)$, with $b = 0.02498$ and $c = 0.5535$'}, 'interpreter', 'latex', 'fontsize', 16)
legend('Data', 'Fitted model', 'fontsize', 14)
Thank you very much for your help and dedication!

Accedi per commentare.

Più risposte (2)

If this is data that you think comes from a Weibull, then you do not want to use regression techniques to fit the distribution. And adding a constant term invalidates the result as a Weibull distribution.
You can use MLE. But more approprately, use wblfit.
help wblfit
wblfit - Weibull parameter estimates This MATLAB function returns the estimates of Weibull distribution parameters (shape and scale), given the sample data in x. Syntax parmHat = wblfit(x) [parmHat,parmCI] = wblfit(x) [parmHat,parmCI] = wblfit(x,alpha) [___] = wblfit(x,alpha,censoring) [___] = wblfit(x,alpha,censoring,freq) [___] = wblfit(x,alpha,censoring,freq,options) Input Arguments x - Sample data vector alpha - Significance level 0.05 (default) | scalar in the range (0,1) censoring - Indicator for censoring array of 0s (default) | logical vector freq - Frequency or weights of observations array of 1s (default) | nonnegative vector options - Optimization options statset('wblfit') (default) | structure Output Arguments parmHat - Estimate of parameters 1-by-2 row vector parmCI - Confidence intervals for parameters 2-by-2 matrix Examples openExample('stats/EstimateParametersOfWeibullDistributionExample') openExample('stats/EstimateParametersOfWeibullDistributionCIExample') openExample('stats/EstimateWeibullParametersOptionsExample') See also mle, wbllike, wblpdf, wblcdf, wblinv, wblstat, wblrnd, wblplot Introduced in Statistics and Machine Learning Toolbox before R2006a Documentation for wblfit doc wblfit
If you provide the data, we can possibly offer more or better advice. Lacking that, just use wblfit.

13 Commenti

Hi. I do not claim my data follows a Weibull distribution. However, I at least would like to to see a valid (i.e. non-negative) R-squared value since I need it.
Michele
Michele il 27 Set 2024
Modificato: Michele il 27 Set 2024
Here is the data, attached to the comment.
Your data look as if they follow a uniform distribution on [0 1]. Did you exchange the data from your above plot ?
x = load("distrib.mat");
histogram(x.M_empRel)
Sorry, I forgot to give you the values on the x axis (time). Look at the attachment of this message, it contains a file with the x axis values and a file with the y axis values.
What are the random values you want to fit a distribution to ? They don't have an x- and and y-value, but are one-dimensional.
Otherwise you have (x/y) data that you want to fit a function to.
Read about the difference between distribution fitting and curve fitting:

I am doing curve fitting. Specifically, I'm using the Curve Fitting Tool:

cftool(x, y);
Torsten
Torsten il 27 Set 2024
Modificato: Torsten il 27 Set 2024
And why is the first sentence of your question
I'm trying to fit my empirical distribution with a theoretical Weibull distribution.
?
This cries for distribution fitting.
I'm sorry for the incorrect phrasing. I edited the original post.
The integral of a Weibull distribution should be 1, yours is appr. 13000. So I think there must be a fundamental misunderstanding about what you are supposed to do with the data.
x = load("x_axis.mat");
y = load("y_axis.mat");
plot(x.t,y.M_empRel)
trapz(x.t,y.M_empRel)
ans = 1.3080e+04
I have to try and fit that curve with different distributions, and compare the results. With an exponential distribution, this is what I get:
Reasonable values for SSE and R-squared.
I'm now trying to do the same with the Weibull distribution (more specificaly, with the 1-CDF of the Weibull distribution). So on the Curve Fitting tool I select "Custom Equation" and I enter:
exp(-(x/b)^c)
But I get a negative R-squared and I don't get why.
I also tried adding a constant:
exp(-(x/b)^c)+K
But I get this:
Torsten
Torsten il 28 Set 2024
Modificato: Torsten il 28 Set 2024
As said, you can't fit your data with a distribution function since the integral of distribution functions from 0 to Inf is 1 whereas yours is appr. 13000.
And I don't understand why the blue curve does not pass through (0/1) if you took exp(-(x/b)^c) as fitting function.
Michele
Michele il 28 Set 2024
Modificato: Michele il 28 Set 2024
Then why did the exponential fit work? Besides, what am I supposed to do then to make it work with a Weibull distribution? Scale down my values/data?
The exponential doesn't hit 1 at x=0 because of the constant +K added to the model.
Then why did the exponential fit work?
Because a*exp(b*x) is not a distribution function for arbitrary choices of the parameters a and b.
Besides, what am I supposed to do then to make it work with a Weibull distribution? Scale down my values/data?
It will be difficult with a modified Weibull function because only in a very special case it passes through (0/1).

Accedi per commentare.

David Goodmanson
David Goodmanson il 27 Set 2024
Modificato: David Goodmanson il 27 Set 2024
Hi MIchele,
One possibility is that you are not fitting the correct statistical function to the data. The algebraic expression you have is a probability density function (pdf), but the data runs from y = 1 down to y = 0 [eventually], and looks like a cumulative distribution function (cdf), more specifically (1-cdf) since a cdf runs from 0 up to 1. For the weibull cdf,
y = 1-exp(-(x/b)^c) and (1-y) = exp(-(x/b)^c)
and the latter function you can fit to the data.
When x=0 then y=1, and when x=b then y = exp(-1) = .368 for all vaues of c. Looking at your plot, .368 occurs close to x = 1 which implies that b is approximately 1.
I guess another possibility is that y is a pdf that so happens to equal 1 at x = 0. This occurs for the weibull pdf when c = 1, which is the same as the pdf of an exponential distribution, (1/b) exp(-x/b).

3 Commenti

Hi, thanks for the answer. You are actually right.
However, using that function you provided, still comes out negative. By adding an arbitrary constant +q I get:
Which is still odd to me...
I plotted the data you enclosed vs a linear scale and it does not look in the least like the data in the plot.
Sorry, I forgot to give you the values on the x axis (time). Look at the attachment of this message, it contains a file with the x axis values and a file with the y axis values.

Accedi per commentare.

Richiesto:

il 27 Set 2024

Commentato:

il 29 Set 2024

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by