How to change confidence intervals using fit and Curve Fitting App?

Hello, I have an issue fitting some data to a simple y = a*x equation. The issue, I believe, is that the fit is "correct" at the 95% confidence level, but I'd like to improve it. In the pictures below, you can see the four data points I'm trying to fit to a straight line. If I choose the starting point for a as -0.1, it says that -0.1 is good and within 95% confidence. If I set it to -0.5, the same thing happens. How can I improve this, if possible?
The second and fourth data points may be bad, so I expect the answer to be closer to -0.5.
Additionally, the data being plotted here should, for physical reasons, pass through (0,0).

Accepted Answer

Matt J on 12 Apr 2022
Edited: Matt J on 12 Apr 2022
The confidence intervals are a function of the input x,y data and the model function, and nothing else. The confidence interval tightness that you can achieve depends entirely on how well the x,y data agrees with the model you've specified.
If the 2nd and 4th data points are bad, you should remove them. I do not see any indication that the curve fails to pass through (0,0).
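A minimal sketch of that removal (assuming x and y are vectors in the workspace and that points 2 and 4 are the suspected outliers):

```matlab
% Sketch: drop the suspected bad points before fitting.
% Assumes x and y are vectors; indices 2 and 4 are the outliers.
bad = [2 4];
xKeep = x;  yKeep = y;
xKeep(bad) = [];
yKeep(bad) = [];
```

You would then fit xKeep, yKeep in place of the original data.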

8 Comments

Hi, thank you for your thoughtful response. Yes, I think the tightness is what I need to improve! I'm not sure how, though. To further illustrate my issue with the confidence interval being too large, please see the graphs below.
First, here is the data, outliers taken out, fit to y = a*x, which should be the appropriate expression when you expect linear data to naturally pass through (0,0). You can see the a value is happy to sit at the initial guess of -0.1. Personally, I think the least-squares error could be improved if the slope were lower, i.e. -0.2, as it would bring the line closer to both data points.
As a check on how well or badly the data fit the ideal (data with a (0,0) intercept), I also tried fitting y = a*x + b and got the following line.
Here, you can see the error-minimizing algorithm got quite lazy: if BOTH the value of b and the slope were closer to zero, the fit would be much improved.
Finally, if I remove those outlier data points and include a (0,0) data point, the fit should look much closer to what I'd like it to look like...
...but as you can see, while the b value is improved, the slope (which is the value I care most about extracting) is still horribly off, yet likely still within 95% confidence before the algorithm gave up refitting.
How can I ask MATLAB to continue fitting past the point where it has reached fit parameters with 95% confidence?
Personally, I think the least squares error could be improved if the slope was lower, i.e. -0.2, as it would bring the line closer to both data points.
That is something you could easily check. Did you calculate the least squares error for both a=-0.1 and a=-0.2 and compare them?
How can I ask MATLAB to continue fitting past the point where it has reached fit parameters with 95% confidence?
When the Curve Fitting app decides to stop its iterative search is unrelated to the 95% confidence intervals. It is based on the TolX and TolFun fit options.
Normalizing your data with the Normalize option might also be a good idea.
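A sketch of those options in code form (assuming a NonlinearLeastSquares custom model; the tolerance values are illustrative, not recommendations):

```matlab
% Sketch: stopping tolerances and normalization for a custom fit.
opts = fitoptions('Method', 'NonlinearLeastSquares');
opts.TolX      = 1e-12;   % stop when the step in the coefficients is this small
opts.TolFun    = 1e-12;   % stop when the change in the residual is this small
opts.Normalize = 'on';    % center and scale the data before fitting
```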
Because you have not displayed the complete plots, it is hard to see if the iterations have really stopped too early. However, one thing that may help is to solve for a and b analytically, instead of using the Curve Fitting App's iterative calculation. Both of your fitting models have simple analytical solutions. For y=a*x, it is just,
a=x(:)\y(:)
and for y=a*x+b, it is
ab=polyfit(x,y,1);
a=ab(1);
b=ab(2);
Alternatively, you could use the Curve Fitting app's poly1 model instead of a custom model. I don't think it uses an iterative calculation for that. You can constrain b = 0 in the built-in y = a*x + b model by setting the upper and lower bounds on b to zero (bounds are also set through the fit options).
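For illustration, constraining b in a custom y = a*x + b model might look like this (a sketch, assuming x and y are in the workspace; note that fit coefficients are ordered alphabetically, so the bound vectors are [a, b]):

```matlab
% Sketch: pin the intercept b at zero via equal lower/upper bounds.
ft = fittype('a*x + b', 'independent', 'x', 'dependent', 'y');
opts = fitoptions('Method', 'NonlinearLeastSquares');
opts.StartPoint = [-0.1, 0];
opts.Lower = [-Inf, 0];   % bounds on [a, b]
opts.Upper = [ Inf, 0];   % b fixed at 0
cf = fit(x(:), y(:), ft, opts);
```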
For your second response: yes, thank you, I've just tried changing the tolerances, since the x and y values are both on very small scales (see the right-hand fit options). The fit still stayed around the initial value, which is sad. I'm not sure if I should be going even lower?
As for the first response: yes. If you compare the above and below images, where the start point was changed, the r-square value goes up, as would be expected from a fit line that more closely passes through the data. So there is definitely a better fit to be achieved.
I'm curious what a 'complete plot' is, but upon hitting "Generate Code" and setting the display option to "iter", I find that only a 0th iteration was performed. I truly have only 4 data points, two of which were removed, plus a theoretical (0,0) point.
ft = fittype( 'a*x', 'independent', 'x', 'dependent', 'y' );
opts = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts.DiffMinChange = 1e-15;
opts.Display = 'iter';
opts.StartPoint = -0.1;
opts.TolFun = 1e-12;
opts.TolX = 1e-12;
I am not sure how normalizing would help, but I did try the polynomial fit, which alone was great for fitting the y = a*x + b data both with and without the (0,0) point. I think I will switch to this method for those two fits.
However, when constraining the b value as you mentioned, the fit begins to behave very strangely: it doesn't go through (0,0) despite being constrained.
I feel like I must be using the TolX and TolFun fields incorrectly, because the top two images I've just posted here look like the first two iterations with -0.1 and -0.8 as the first two guesses. Do you have any feedback on how I may be using the TolX and TolFun fields wrong?
It would be easiest if you would post your x,y data in a .mat file.
Attached!
Also, if I simply write out the following code and fit a parabola to it,
load xydata
a = linspace(0, -5, 100000);         % candidate slopes to sweep
resSq = zeros(1, numel(a));
for k = 1:numel(a)                   % 'k' rather than 'b', to avoid confusion with the intercept
    yfit = a(k)*xdata;
    resSq(k) = (ydata(1)-yfit(1))^2 + (ydata(2)-yfit(2))^2 + (ydata(3)-yfit(3))^2;
end
figure
plot(a, resSq);
I get a minimum at the correct value, so I also know what value I'm hoping MATLAB returns!
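For what it's worth, the same sweep can be done without the loop (a sketch using implicit expansion, available since R2016b, with the same variable names; assumes xdata and ydata are row vectors):

```matlab
% Sketch: vectorized slope sweep over the first three data points.
resSq = sum((ydata(1:3).' - xdata(1:3).'.*a).^2, 1);
[~, k] = min(resSq);
aBest = a(k);   % slope with the smallest sum of squared residuals
```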
I get a minimum at the correct value, so I also know what value I'm hoping MATLAB returns!
If so, you should just use the analytical formula that I gave you above. It returns the same result:
load xydata
x=xdata(2:end)'; y=ydata(2:end)'; %get rid of the artificial (0,0) point
a=x\y
a = -0.7823
Here's how you could also do it with the Curve Fitting Toolbox. Notice that I re-express x and y in smaller units first; that way, the default TolX and TolFun work better.
cfit=fit(x/1e-6,y/1e-6,'poly1','Lower',[-inf,0],'Upper',[0,0])
cfit =
     Linear model Poly1:
     cfit(x) = p1*x + p2
     Coefficients (with 95% confidence bounds):
       p1 = -0.7823  (-2.35, 0.7852)
       p2 = 0  (fixed at bound)
Hmm. So you're saying that for y = a*x, a = x(:)\y(:) is enough, and if I need to ignore a data point, x(pointToRemove) = [] and y(pointToRemove) = [] is the way to go. And for y = a*x + b, polyfit defaults to better fitting methods than the general algorithms behind the Curve Fitting app's custom equations? Thank you very much for your kind answers over the past few days, and for your patience with me taking an extra comment to understand why it is most efficient to just avoid iterative fitting. I am going to incorporate your suggestions into my code!
Thank you also for the suggestion about simply rescaling the data values; I think that is equally reasonable!
Matt J on 14 Apr 2022
Edited: Matt J on 14 Apr 2022
Yes, but even with analytical solvers like polyfit, it is still good to measure your data in natural units, meaning they aren't all super large or super small, and there aren't orders of magnitude of difference between the x and y values.
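As a sketch (assuming the data are on the order of 1e-6, e.g. in meters), rescaling to order-1 units before polyfit and converting the intercept back would look like:

```matlab
% Sketch: fit in rescaled units, then convert coefficients back.
scale = 1e-6;
ab = polyfit(xdata/scale, ydata/scale, 1);
a = ab(1);          % slope is unchanged when x and y share the same rescale
b = ab(2)*scale;    % intercept converts back to the original units
```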


Release: R2020b
