Why am I unable to recreate the curve fitting equation?

18 views (last 30 days)
Sunil on 31 May 2020
Edited: John D'Errico on 1 Jun 2020
I used MATLAB's built-in Curve Fitting Tool to fit the following data:
x = [5 10 15 20 25 30 35 40 45 50]
and
y = [140 88 62 49 38 31 25 20 17 12]
I used the two-term exponential equation to generate the fitting curve.
The following results were obtained:
General model Exp2:
f(x) = a*exp(b*x) + c*exp(d*x)
where x is normalized by mean 27.5 and std 15.14
Coefficients (with 95% confidence bounds):
a = 0.2758 (-0.1069, 0.6585)
b = -3.521 (-4.346, -2.696)
c = 34.03 (32.91, 35.15)
d = -0.6419 (-0.6992, -0.5846)
Goodness of fit:
SSE: 3.376
R-square: 0.9998
Adjusted R-square: 0.9996
RMSE: 0.7501
I recreated the equation of the curve using the same coefficients a = 0.2758, b = -3.521, c = 34.03 and d = -0.6419 in the equation y1 = a*exp(b*x) + c*exp(d*x). When I run it in the command window, I get the following output for y1:
y1 =
1.374022395352651
0.055478622562340
0.002240049060184
0.000090446005331
0.000003651919963
0.000000147452830
0.000000005953673
0.000000000240390
0.000000000009706
0.000000000000392
I am unable to understand why there is such a big mismatch between y1 and y.
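In code form, the computation was essentially:
% coefficients copied from the Curve Fitting Tool output (4 significant digits)
a = 0.2758; b = -3.521; c = 34.03; d = -0.6419;
x = [5 10 15 20 25 30 35 40 45 50];
y1 = a*exp(b*x) + c*exp(d*x)     % evaluates the model at the raw x values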
1 Comment
John D'Errico on 1 Jun 2020
Edited: John D'Errico on 1 Jun 2020
READ MY ANSWER TO THE END. It solves your problem, showing what you did, and reproducing the garbage numbers you got for y1, pretty much exactly.
Essentially, the problem you have in producing y1 is this: if you do a fit using a normalized version of x in fit, then you need to build that normalization into your model. It is now part of your model.
The proof is that when I did the fit using the normalized version of x, it produced the same coefficients you got. So the problem is NOT in the fit itself, because I can then predict y pretty accurately, even if I use only the approximate set of coefficients as you did.
However, when you then predict from the model, you need to use the normalization used for the fit!
The problem is NOT how you estimated the model.
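For example, with the mean and std that the tool reports, the prediction needs to be built like this (a minimal sketch; a, b, c, d stand for the fitted coefficients):
mu = mean(x);                            % 27.5, as reported by the tool
S  = std(x);                             % about 15.14, as reported by the tool
xhat  = (x - mu)/S;                      % the SAME normalization used during the fit
ypred = a*exp(b*xhat) + c*exp(d*xhat);   % the coefficients only apply to the normalized x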


Answers (2)

Star Strider on 31 May 2020
I have no idea what the problem is. The fminsearch function had no problem with it.
The Code —
f = @(b,x) b(1).*exp(b(2).*x) + b(3).*exp(b(4).*x);
x = [5 10 15 20 25 30 35 40 45 50];
y = [140 88 62 49 38 31 25 20 17 12];
B = fminsearch(@(b) norm(y - f(b,x)), rand(4,1));
figure
plot(x, y, 'p')
hold on
plot(x, f(B,x), '-r')
hold off
grid
text(27, 100, sprintf('a = %7.3f\nb = %7.3f\nc = %7.3f\nd = %7.3f',B))
The Plot —
3 Comments
Star Strider on 31 May 2020
I do not have the Curve Fitting Toolbox, because the other toolboxes I have (Statistics and Machine Learning Toolbox, Optimization Toolbox, among others) plus my own mathematical and programming experience do everything I want.
Other than that, we know only what you said you did, not what you actually did. It is not possible to determine the problem. (I reversed the two vectors and my function still ran without error. The fit was appropriate and the parameters were different; however, they did not even closely resemble the parameters you previously reported, eliminating that as a source of the problem.)
My code gives the correct result. Use my ‘f’ function with nlinfit and nlparci to get equivalent results, with confidence intervals.
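A minimal sketch of that approach (requires the Statistics and Machine Learning Toolbox; the starting guesses b0 here are my own assumption):
f = @(b,x) b(1).*exp(b(2).*x) + b(3).*exp(b(4).*x);
x = [5 10 15 20 25 30 35 40 45 50];
y = [140 88 62 49 38 31 25 20 17 12];
b0 = [140; -0.1; 100; -0.05];            % rough starting guesses (assumed)
[B, R, J] = nlinfit(x, y, f, b0);        % nonlinear least-squares estimates and residuals
CI = nlparci(B, R, 'Jacobian', J)        % 95% confidence intervals for the coefficients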
Alex Sha on 1 Jun 2020
The results below seem to be better:
Root of Mean Square Error (RMSE): 0.581032486515321
Sum of Squared Residual: 3.37598750386176
Correlation Coef. (R): 0.999881471978295
R-Square: 0.999762958005483
Parameter Best Estimate
---------- -------------
a 165.438803583087
b -0.232608352963485
c 109.208242653762
d -0.0424020153019308
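These appear to be the same fit expressed in terms of the raw (un-normalized) x, so they can be checked directly (a quick sketch):
x = [5 10 15 20 25 30 35 40 45 50];
y = [140 88 62 49 38 31 25 20 17 12];
yalt = 165.4388*exp(-0.232608*x) + 109.2082*exp(-0.042402*x);
sum((y - yalt).^2)                       % should be close to the reported Sum of Squared Residual, 3.376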



John D'Errico on 1 Jun 2020
Edited: John D'Errico on 1 Jun 2020
I had to play with this for a while, because my first assumption was that you were using the wrong coefficients. In fact, while that costs you some accuracy, it is not what destroyed your results. That came down to forgetting to use the normalized variable x in your computation. You CANNOT use a normalized x in the fit, but then not use the same normalization to predict y.
In fact, using 4 digit approximations is a classic problem. People think that a 4 digit approximation to a coefficient is the coefficient. It is not. Just because you see the number reported to 4 significant digits does not mean it stops there.
My initial assumption was that the problem was NOT in the software used to estimate the model, but nothing more than using the wrong coefficients.
format long g
>> mu = mean(x)
mu =
27.5
>> S = std(x)
S =
15.1382517704875
>> xhat = (x - mu)/S;
>> mdl = fit(xhat',y','exp2')
mdl =
General model Exp2:
mdl(x) = a*exp(b*x) + c*exp(d*x)
Coefficients (with 95% confidence bounds):
a = 0.2758 (-0.1069, 0.6585)
b = -3.521 (-4.346, -2.696)
c = 34.03 (32.91, 35.15)
d = -0.6419 (-0.6992, -0.5846)
As you should see, these are exactly the same set of coefficients you claim to have gotten.
plot(x,mdl(xhat))
hold on
plot(x,y,'ro')
Again, those 4 significant digit approximations to the coefficients are NOT the coefficients. You always need to use the true values as estimated.
mdl.a
ans =
0.275764176155343
>> mdl.b
ans =
-3.52133177155047
>> mdl.c
ans =
34.0286408362909
>> mdl.d
ans =
-0.641895329124188
You need to use the full precision. And make sure you use the correct value for the normalizations too. Don't use a 4 digit approximation. If you do, then expect to get what is potentially random crapola.
I would have gotten the correct result also had I done this as:
ypred = mdl.a*exp(mdl.b*(x - mu)/S) + mdl.c*exp(mdl.d*(x - mu)/S);
In fact, this will give exactly the same predictions, as I claim it must. This I can verify.
norm(ypred' - mdl((x - mu)/S))
ans =
1.4210854715202e-14
To see how much of the problem is, in the end, just due to the 4 digit approximations to the coefficients, let me now try doing exactly that.
aappr = 0.2758;
bappr = -3.521;
cappr = 34.03;
dappr = -0.6419;
Sappr = 15.14;
muappr = 27.5;
yappr = aappr*exp(bappr*(x - muappr)/Sappr) + cappr*exp(dappr*(x - muappr)/Sappr);
When I plot that 4 digit approximation, I still get something that is not too far off. As you see, I got exactly the correct fit, because I did my fit the same way you did, by fitting using a normalized version of x.
Now, let me compute the prediction, but NOT using the normalized version of x. After all, that is what you did: you evaluated the model on the UN-NORMALIZED x!
ywrong = aappr*exp(bappr*x) + cappr*exp(dappr*x);
When you computed y1, you did not use the normalized version of the vector x. Now, let me show the results. LOOK CAREFULLY AT THE COLUMNS.
format short g
[y',ypred',yappr',ywrong']
ans =
140 140.05 139.99 1.374
88 87.627 87.612 0.055479
62 62.864 62.861 0.00224
49 48.347 48.347 9.0446e-05
38 38.327 38.328 3.6519e-06
31 30.76 30.762 1.4745e-07
25 24.807 24.809 5.9537e-09
20 20.044 20.046 2.4039e-10
17 16.207 16.209 9.7062e-12
12 13.109 13.11 3.919e-13
Column 1 is the real data.
Column 2 contains my predictions using the correct (full precision) set of coefficients.
Column 3 contains my predictions using the 4 digit approximations to the coefficients. As you can see, while it is not exact, the difference is not as large as what you reported. In fact, surprisingly, it is not that far off. There are relatively small errors, but not huge errors.
Column 4 is what happens if you use the UNNORMALIZED VERSION OF X.
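So if you want to predict y at new values of x from this fit, carry the normalization along (a short sketch, using the cfit object mdl, mu, and S from above; the new x values are just an example):
xnew = [12 22 33];                        % arbitrary new points (example values)
ynew = mdl((xnew - mu)/S)                 % normalize first, THEN evaluate the fitted model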
