How to include the standard deviation in the fit and propagate the error?

Hello,
Fitting e.g. a monoexponential decay to a dataset with custom least-squares code and retrieving the fit parameters works perfectly fine. However, I run into a problem: the data are given as means ± standard deviations. The script minimizes the deviation between the means and the fitted function, but the uncertainty of the data themselves is not taken into account during the fitting process. I therefore assume that the true confidence bounds are wider than the ones the fit reports. Is there a way to weight the fit by the standard deviation? I could of course fit the curve twice, once using the lower ends of the std ranges and once using the upper ends, but that's anything but elegant. Is there an easy way to solve this?
Thanks for your help!
Phil :)
2 Comments
Jeff Miller on 26 Sep 2018
It's not entirely clear what data you have. Is this right?
You have a set of discrete x values, x_i, i=1..k, at which you took readings (y's).
For each x_i, you obtained n_i readings, y_ij, i=1..k, j=1..n_i.
You have summarized the readings at each x_i with the mean and standard deviation of the y_ij's at that i: m_i & s_i.
Now you want to fit a model to predict the mean y_i at each x_i, and the fitting should take into account the s_i's.
Phil on 26 Sep 2018
Hi Jeff,
"Now you want to fit a model to predict the mean y_i at each x_i, and the fitting should take into account the s_i's."
Exactly. I would like to fit the model to the means of the y_ij's at each x_i, taking the standard deviations into account. I attached a figure illustrating the problem.


Answers (1)

Jeff Miller on 27 Sep 2018
For this situation, I think you want to minimize this:
sum_{i=1}^k [ n_i * (m_i - f_i)^2 / s_i^2 ] ,
where f_i is the fitted value corresponding to each mean. Somewhat intuitively, the differences between observed and predicted values are being scaled relative to the standard errors of the various observed means, which are s_i/sqrt(n_i). The idea is that the variability of the y_ij's is only a problem to the extent that it creates uncertainty about the mean values m_i. This uncertainty increases with s_i, but it decreases as n_i increases.
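As a rough illustration of the weighted objective above, here is a minimal sketch in plain Python (used only for concreteness, standard library only). Everything specific in it is an assumption rather than something from the question: the two-parameter model A*exp(-x/tau), the function names, the damped Gauss-Newton update, and the synthetic noise-free means. The standard errors at the end come from the inverse of J'WJ, a common large-sample approximation that only makes sense if the s_i/sqrt(n_i) are reasonable estimates of the uncertainty of the means.

```python
import math

def model(x, A, tau):
    """Monoexponential decay; this parametrisation is an assumption."""
    return A * math.exp(-x / tau)

def weighted_sse(A, tau, x, m, s, n):
    """sum_i n_i * (m_i - f_i)^2 / s_i^2"""
    return sum(ni * (mi - model(xi, A, tau)) ** 2 / si ** 2
               for xi, mi, si, ni in zip(x, m, s, n))

def normal_equations(A, tau, x, m, s, n):
    """Return the entries of J'WJ (a11, a12, a22) and J'Wr (g1, g2)."""
    a11 = a12 = a22 = g1 = g2 = 0.0
    for xi, mi, si, ni in zip(x, m, s, n):
        w = ni / si ** 2                 # 1 / (standard error of the mean)^2
        e = math.exp(-xi / tau)
        r = mi - A * e                   # residual of the observed mean
        d_A = e                          # df/dA
        d_tau = A * e * xi / tau ** 2    # df/dtau
        a11 += w * d_A * d_A
        a12 += w * d_A * d_tau
        a22 += w * d_tau * d_tau
        g1 += w * d_A * r
        g2 += w * d_tau * r
    return a11, a12, a22, g1, g2

def fit_weighted(x, m, s, n, A, tau, max_iter=200):
    """Damped Gauss-Newton (Levenberg-Marquardt style) on the weighted sum."""
    lam = 1e-3
    cost = weighted_sse(A, tau, x, m, s, n)
    for _ in range(max_iter):
        a11, a12, a22, g1, g2 = normal_equations(A, tau, x, m, s, n)
        b11, b22 = a11 * (1 + lam), a22 * (1 + lam)
        det = b11 * b22 - a12 * a12
        step_A = (b22 * g1 - a12 * g2) / det
        step_tau = (b11 * g2 - a12 * g1) / det
        if abs(step_A) + abs(step_tau) < 1e-12:
            break                        # step is negligible: converged
        new_A, new_tau = A + step_A, tau + step_tau
        new_cost = (weighted_sse(new_A, new_tau, x, m, s, n)
                    if new_tau > 0 else math.inf)
        if new_cost < cost:              # accept the step, relax the damping
            A, tau, cost = new_A, new_tau, new_cost
            lam = max(lam / 10, 1e-12)
        else:                            # reject the step, damp harder
            lam *= 10
            if lam > 1e12:
                break
    return A, tau, cost

def standard_errors(A, tau, x, m, s, n):
    """Square roots of the diagonal of inv(J'WJ) at the fitted parameters."""
    a11, a12, a22, _, _ = normal_equations(A, tau, x, m, s, n)
    det = a11 * a22 - a12 * a12
    return math.sqrt(a22 / det), math.sqrt(a11 / det)

# Synthetic, noise-free means so that the expected answer (A=5, tau=2) is known.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
m = [model(xi, 5.0, 2.0) for xi in x]
s = [0.30, 0.25, 0.20, 0.20, 0.15, 0.10]   # hypothetical standard deviations
n = [5, 5, 4, 4, 3, 3]                     # hypothetical readings per x_i

A_hat, tau_hat, cost = fit_weighted(x, m, s, n, A=4.0, tau=1.5)
se_A, se_tau = standard_errors(A_hat, tau_hat, x, m, s, n)
print(f"A = {A_hat:.4f} +/- {se_A:.4f}, tau = {tau_hat:.4f} +/- {se_tau:.4f}")
```

Because each mean is weighted by n_i/s_i^2, points with a large spread or few readings pull on the curve less, and the reported standard errors widen accordingly.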
With appropriate assumptions, I think this sum should have an approximate chi-square distribution with (k-p) degrees of freedom, where p is the number of free parameters in your model.
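One way to use this in practice is to compare the minimized sum with its degrees of freedom: a chi-square variable has an expected value equal to its degrees of freedom, so a ratio far above 1 hints that the model or the s_i's do not describe the data well. The sketch below is plain Python with made-up numbers; the function name and all values are hypothetical.

```python
def chi_square_check(m, f, s, n, p):
    """Return the weighted sum, its degrees of freedom (k - p), and their ratio."""
    k = len(m)
    if k <= p:
        raise ValueError("need more means than free parameters")
    chi2 = sum(ni * (mi - fi) ** 2 / si ** 2
               for mi, fi, si, ni in zip(m, f, s, n))
    dof = k - p
    return chi2, dof, chi2 / dof

# Hypothetical example: four means, fitted values from a two-parameter model.
means = [5.1, 2.9, 1.9, 1.1]
fitted = [5.0, 3.0, 1.8, 1.1]
stds = [0.2, 0.2, 0.2, 0.2]
counts = [4, 4, 4, 4]

chi2, dof, reduced = chi_square_check(means, fitted, stds, counts, p=2)
print(f"chi2 = {chi2:.3f}, dof = {dof}, reduced chi2 = {reduced:.3f}")
```

With so few points the ratio is only a rough indicator; a formal test would compare the sum against the chi-square distribution with k-p degrees of freedom.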

Release

R2018a
