Line fitting slope error estimation from y-error affected datasets

I am trying to fit a line to data points that carry errors in the y axis only. I managed to do it with both "polyfit" and "polyfitn". However, besides the best-fit line, I would also like the lines (and all their coefficients) with the lowest and highest possible slopes that the errors in the dataset allow. Is there a way to pass these errors to polyfitn? As far as I understand, the .ParameterStd field returned by polyfitn is related to how the line is drawn(?)

Answers (1)

John D'Errico on 11 Jun 2017
I'm not exactly positive what you are really asking to do here.
x = rand(1,100)*10 - 5;
y = rand(1,100);
P = polyfitn(x,y,1)
P =
struct with fields:
ModelTerms: [2×1 double]
Coefficients: [0.017113 0.5481]
ParameterVar: [6.8701e-05 0.00062619]
ParameterStd: [0.0082886 0.025024]
DoF: 98
p: [0.041604 1.5483e-39]
R2: 0.041682
AdjustedR2: 0.031903
RMSE: 0.24146
VarNames: {'X1'}
So the slope was estimated as 0.017113. With 98 degrees of freedom for the estimation, there is no need to delve into a t-statistic. +/- 2 sigma will be a good approximation for 95% limits on the slope.
tinv(0.975,98)
ans =
1.9845
Asking for 3 significant digits on these limits is silly, given all of the approximations that go into these computations, and given noise in the data. Hardly any model will ever really be that good of an approximation anyway.
x0 = [-5,5];
plot(x,y,'o',x0,P.Coefficients(2) + P.Coefficients(1)*x0,'r-')
hold on
plot(x0,P.Coefficients(2) + (P.Coefficients(1)-2*P.ParameterStd(1))*x0,'g-')
plot(x0,P.Coefficients(2) + (P.Coefficients(1)+2*P.ParameterStd(1))*x0,'g-')
grid on
If you are looking for prediction intervals around the curves, that is slightly different. There we would see smooth curves forming an envelope around the fit. But all you asked for is the range of slopes that are consistent with the data, shown here to within roughly 95% limits.
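For readers without polyfitn, here is a minimal sketch of how a slope standard deviation of this kind can be obtained in base MATLAB, assuming the usual ordinary least-squares covariance formula. It has not been checked against polyfitn's internals, and the variable names (A, c, slopeStd, slopeLimits) are only illustrative.

```matlab
% Sketch: slope standard deviation from ordinary least squares (base MATLAB).
% Assumes the textbook OLS covariance formula; names are illustrative only.
x = rand(1,100)*10 - 5;
y = rand(1,100);

A   = [x(:), ones(numel(x),1)];   % design matrix: slope column, intercept column
c   = A \ y(:);                   % c(1) = slope, c(2) = intercept
r   = y(:) - A*c;                 % residuals of the fit
dof = numel(x) - 2;               % data points minus fitted parameters
s2  = sum(r.^2) / dof;            % residual variance estimate
C   = s2 * ((A'*A) \ eye(2));     % parameter covariance matrix

slopeStd    = sqrt(C(1,1));                 % standard deviation of the slope
slopeLimits = c(1) + [-2, 2]*slopeStd;      % roughly 95% limits on the slope
```

Note that this standard deviation is estimated from the scatter of the points around the line, not from any error bars supplied with the data.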
  1 Comment
Umberto Maria Ciucani on 12 Jun 2017
Dear John,
thanks for your reply. I'm sorry for not being clear.
The dataset you are using in your example is not affected by error. I have a dataset consisting of 5 points, each of which has an error in its y coordinate. I need to fit this dataset with a line, and I am interested in the slope. The error on each data point can potentially change the slope of the line. I would like to compute the steepest and gentlest lines that could be drawn when the errors of the data points are taken into account, so that I have a maximum and a minimum limit for the slope evaluated on the actual points.
I thought about creating a dataset whose points sit at the maximum and minimum y values allowed by each point's error, but I wondered whether there is a way to evaluate this directly with your polyfitn toolbox.
I am attaching the figure.
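The idea in this comment can be sketched in base MATLAB as follows. The five data points and error values below are invented for illustration, and the unweighted polyfit fit is an assumption rather than anything confirmed in the thread. Part (a) moves each point to the end of its error bar that tilts the line most: because the least-squares slope is a linear function of y, the extreme slopes occur at y +/- sign(x - mean(x)).*e. Part (b) is a different, statistical reading that treats the errors as 1-sigma values and uses them as weights.

```matlab
% Hypothetical 5-point dataset with y-errors (all values invented).
x = [1 2 3 4 5];
y = [2.1 3.9 6.2 7.8 10.1];
e = [0.3 0.2 0.4 0.3 0.5];          % half-width of each y error bar

% (a) Worst-case slopes for an unweighted fit: raise points to the right
% of mean(x) and lower points to the left for the steepest line, and the
% reverse for the gentlest line.
s     = sign(x - mean(x));
pBest = polyfit(x, y, 1);           % pBest(1) = slope, pBest(2) = intercept
pMax  = polyfit(x, y + s.*e, 1);    % steepest line
pMin  = polyfit(x, y - s.*e, 1);    % gentlest line

% (b) Weighted least squares, assuming e holds 1-sigma errors.
A  = [x(:), ones(numel(x),1)];      % design matrix: slope, intercept
W  = diag(1 ./ e(:).^2);            % weights are inverse variances
cW = (A'*W*A) \ (A'*W*y(:));        % weighted slope and intercept
CW = (A'*W*A) \ eye(2);             % covariance propagated from e alone
slopeStdW = sqrt(CW(1,1));          % slope standard deviation due to e
```

The worst-case range in (a) is usually wider than the +/- 2 sigma range from (b), since it assumes every point errs in the most unfavourable direction at once.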
