# How to find the equation for the data of a graph?

Chaudhary P Patel on 15 Jan 2024
Commented: Alex Sha on 21 Jan 2024
I want the equation for this data. Please, help me.
X Y
0 0
50 1.202051
100 2.412308
150 3.643077
200 4.721709
250 5.34849
300 5.946781
350 6.331396
400 6.488091
450 6.488091
500 6.302906
550 5.975271
600 5.419715
650 4.693219
700 3.753048
750 2.727407
800 1.359886
850 0
900 0
950 0.790085
1000 2.869858
1050 4.650484
1100 5.975271
1150 6.872707
1200 7.456752
1250 7.527977
1300 7.357037
1350 6.687521
1400 5.619145
1450 4.251624
1500 2.556467
1550 0.32
1600 0

### Accepted Answer

Alex Sha on 15 Jan 2024
Moved: Dyuman Joshi on 15 Jan 2024
Try the fitting function below:
Sum Squared Error (SSE): 1.05153845767184
Root of Mean Square Error (RMSE): 0.178507147609365
Correlation Coef. (R): 0.997415641383701
R-Square: 0.994837961676859
Parameter Best Estimate
--------- -------------
y0 -6.72273370419413
a1 13.5085640890228
b1 -7.32681575662562E-6
c1 1271.21138671474
a2 13.3029931198928
b2 -3.90619136753063E-6
c2 410.442920391101
a3 -3.56863630006966
b3 -6.4673256414225E-5
c3 896.221576681554
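The formula itself did not survive in this post, so the following is only a sketch of how the reported parameters could be used. It assumes the model has the sum-of-exponentials form that appears in the 1stOpt code later in this thread, with three terms: y = y0 + a1*exp(b1*(x-c1)^2) + a2*exp(b2*(x-c2)^2) + a3*exp(b3*(x-c3)^2). The names `abc`, `model`, `xq` and `yq` are made up for illustration.

```matlab
% Data from the question (x runs from 0 to 1600 in steps of 50)
x = (0:50:1600)';
y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

% Reported best estimates, one row per term: [a_i, b_i, c_i]
y0  = -6.72273370419413;
abc = [13.5085640890228, -7.32681575662562e-6, 1271.21138671474
       13.3029931198928, -3.90619136753063e-6,  410.442920391101
       -3.56863630006966, -6.4673256414225e-5,  896.221576681554];

% ASSUMED model form (not shown in the post): y0 + sum_i a_i*exp(b_i*(x - c_i)^2)
model = @(xq) y0 + abc(1,1)*exp(abc(1,2)*(xq - abc(1,3)).^2) ...
                 + abc(2,1)*exp(abc(2,2)*(xq - abc(2,3)).^2) ...
                 + abc(3,1)*exp(abc(3,2)*(xq - abc(3,3)).^2);

% Compare the model with the original points
yModel = model(x);
rmse = sqrt(mean((y - yModel).^2));
fprintf('RMSE on the original points: %.4f\n', rmse);

% Evaluate at a different spacing, staying inside the data range
xq = (0:25:1600)';
yq = model(xq);
```

If the assumed form is right, the printed RMSE should be close to the value reported above. Evaluating outside 0 to 1600 is possible but not something the data can justify.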
Chaudhary P Patel on 21 Jan 2024
@Alex Sha how are you performing this?
I also want to learn it.
Alex Sha on 21 Jan 2024
Hi, once the data and the fitting function are known, the fitting itself is easy to do in MATLAB, for example with the "lsqcurvefit" command. The only problem is that good initial start values are hard to guess, and poor ones can keep the fit from converging.
The results I provided earlier were obtained not with MATLAB but with another package named "1stOpt". This package applies a global optimization algorithm, so guessing initial start values is no longer required. The code is very simple and looks like this:
Constant n=5; //No. of summation
Function y=y0+Sum(n,a,b,c)(a*exp(b*(x-c)^2));
Data;
0 0
93.07876 0.224404
186.1575 0.476858
262.5298 0.729313
303.1026 0.869565
310.2625 1.290323
....
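For the MATLAB route mentioned above, a rough sketch with "lsqcurvefit" (Optimization Toolbox) might look like the following. It uses the data from the original question and a three-term version of the same model. The starting vector `p0` is a hypothetical guess read off a plot of the data (two humps near x = 425 and x = 1250, a dip near x = 875); it is not from the thread, and a different guess may converge to a different local solution or not converge at all.

```matlab
% Requires the Optimization Toolbox (lsqcurvefit, optimoptions)
x = (0:50:1600)';
y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

% Parameter vector p = [y0, a1, b1, c1, a2, b2, c2, a3, b3, c3]
gaussSum = @(p, xd) p(1) + p(2)*exp(p(3)*(xd - p(4)).^2) ...
                         + p(5)*exp(p(6)*(xd - p(7)).^2) ...
                         + p(8)*exp(p(9)*(xd - p(10)).^2);

% HYPOTHETICAL starting values, guessed from the shape of the data
p0 = [0, 6.5, -1e-5, 425, 7.5, -1e-5, 1250, -1, -1e-4, 875];
sse0 = sum((gaussSum(p0, x) - y).^2);   % SSE before fitting

% Local least-squares fit; no bounds, console output suppressed
opts = optimoptions('lsqcurvefit', 'Display', 'off');
[pFit, resnorm] = lsqcurvefit(gaussSum, p0, x, y, [], [], opts);

fprintf('SSE at the starting guess: %.4f\n', sse0);
fprintf('SSE after fitting:         %.4f\n', resnorm);
```

Because "lsqcurvefit" is a local solver, the result depends on `p0`. Rerunning from several starting vectors and keeping the smallest SSE is a common workaround when no global optimizer is at hand.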


### More Answers (2)

John D'Errico on 15 Jan 2024
I'm sorry, but high order polynomials (as suggested by @akshatsood) are always a bad idea. You will run into numerical problems working in double precision. And as well, people tend to think that if a degree 9 polynomial gave an ok fit, then we can get a better fit from a degree 10, or 12 or 15 degree polynomial. In the end, they end up running into the realm of overfitting their data.
And worst of all, @Chaudhary P Patel talks about using that equation for prediction in a different interval.
Trying to extrapolate data using a high degree polynomial is just insanity. In fact, trying to extrapolate any model that fits data like that, where you have no rational method for having chosen the model, is just as crazy. The same applies to the sum of exponentials model produced by @Alex Sha. Just because the model predicts well for the existing data is no reason it will predict anything meaningful for data outside of the support of the data.
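To see the extrapolation problem on this data set, one could fit a degree 9 polynomial and evaluate it a little beyond x = 1600. This is only an illustrative sketch: the degree and the query points `xOut` are arbitrary choices, and the three-output form of "polyfit" is used so that x is centered and scaled before fitting.

```matlab
x = (0:50:1600)';
y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

% Degree 9 fit; mu = [mean(x); std(x)] is used to center and scale x
[p, S, mu] = polyfit(x, y, 9);

% Fitted values inside the data range
yIn = polyval(p, x, S, mu);
fprintf('Largest |y| among fitted values on [0, 1600]: %.3f\n', max(abs(yIn)));

% Values outside the data range (arbitrary query points for illustration)
xOut = [1700; 1800; 2000];
yOut = polyval(p, xOut, S, mu);
for k = 1:numel(xOut)
    fprintf('x = %4d  ->  extrapolated y = %g\n', xOut(k), yOut(k));
end
```

Comparing the printed extrapolated values with the range of the data (0 to about 7.5) shows how quickly the polynomial leaves anything resembling the measured curve.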
In the end, there is no magical way to know the true model that produced any set of data. What is needed is an understanding of the physics of what produced the data. If you do understand the physics, then you can SOMETIMES use that knowledge to suggest a model form, and you can then use that model form to infer estimates of the parameters that might have existed.
This is why splines exist, and why people use splines as heavily as they do. For example:
xy1 = [0 0
50 1.202051
100 2.412308
150 3.643077
200 4.721709
250 5.34849
300 5.946781
350 6.331396
400 6.488091
450 6.488091
500 6.302906
550 5.975271
600 5.419715
650 4.693219
700 3.753048
750 2.727407
800 1.359886
850 0
900 0
950 0.790085
1000 2.869858
1050 4.650484
1100 5.975271
1150 6.872707
1200 7.456752
1250 7.527977
1300 7.357037
1350 6.687521
1400 5.619145
1450 4.251624
1500 2.556467
1550 0.32
1600 0 ];
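% Interpolating cubic spline through all data points (returned in pp form)
spl1 = spline(xy1(:,1),xy1(:,2));
% Plot the spline; fnplt requires the Curve Fitting Toolbox
fnplt(spl1,'b')
hold on
% Overlay the original data points
plot(xy1(:,1),xy1(:,2),'ro')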
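If the goal is values at a different interval, the spline can be evaluated directly with "ppval", which is part of base MATLAB. A small sketch, where the 25-unit spacing and the names `xq` and `yq` are just examples:

```matlab
x = (0:50:1600)';
y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

% Interpolating cubic spline in piecewise-polynomial (pp) form
pp = spline(x, y);

% Evaluate on a finer grid inside the data range (example spacing of 25)
xq = (0:25:1600)';
yq = ppval(pp, xq);

% Show the first few interpolated values
for k = 1:5
    fprintf('x = %4g   y = %.6f\n', xq(k), yq(k));
end
```

The same `xq` and `yq` can be passed to `plot(xq, yq)` as a toolbox-free alternative to "fnplt". The spline reproduces the original points exactly, so every second entry of `yq` equals the measured data.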
Even here though, note that at the right end of that curve, the spline predicts a quick turnaround. Is that true? Perhaps. There is clearly some curvature in the data at that end. But lacking any information to the right, it is IMPOSSIBLE to guess what will really happen there. And since it looks like there is also probably some noise in the data, we need to guess if that is just noise, or if it is real. Most of the time, I would guess some of the bumps in your curve are noise. But maybe they are real signal. We cannot know. (I recall one of my clients who positively knew that if there was a small jiggle in a curve, they ABSOLUTELY needed to know about it, and needed to see it reproduced. But that is not always the case.)
A problem of course is there is no simple function you can write down for the spline I just produced above. But the sum of exponentials model given by @Alex Sha, or the high degree polynomial from @akshatsood are nearly as bad. They will both be terribly sensitive to even small perturbations in the coefficients in those models. And extrapolative predictions of those curves will be just as sensitive.
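Although there is no single formula, the spline is still a set of ordinary cubics, one per interval, and those can be read out with "unmkpp". The sketch below prints the cubic for one interval and checks it against "ppval"; the interval index `k` and the query point `xq` are arbitrary examples.

```matlab
x = (0:50:1600)';
y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

pp = spline(x, y);

% breaks: interval end points; coefs: one row of 4 coefficients per interval
[breaks, coefs] = unmkpp(pp);

% Example: the third interval, which covers [100, 150]
k = 3;
c = coefs(k, :);
fprintf('On [%g, %g]:\n', breaks(k), breaks(k+1));
fprintf('y = %.6g*(x-%g)^3 + %.6g*(x-%g)^2 + %.6g*(x-%g) + %.6g\n', ...
        c(1), breaks(k), c(2), breaks(k), c(3), breaks(k), c(4));

% The local cubic is written in terms of (x - left end of the interval)
xq = 120;
yLocal = polyval(c, xq - breaks(k));
yCheck = ppval(pp, xq);
fprintf('Local cubic: %.6f   ppval: %.6f\n', yLocal, yCheck);
```

With 33 data points this gives 32 separate cubics, which is why the spline is normally stored and evaluated in pp form rather than written out.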
Again, I'm sorry. But there is no simple way to know the true equation of such a set of data. There is not even a sophisticated, complex way to know that.
Dyuman Joshi on 15 Jan 2024
Edited: Dyuman Joshi on 20 Jan 2024
@John D'Errico, how does using spline compare with using the 'SmoothingSpline' method of fit?
Other than the fact that the former is a built-in function, whereas the latter requires the Curve Fitting Toolbox.
% Given data
XY = [ 0 0
50 1.202051
100 2.412308
150 3.643077
200 4.721709
250 5.34849
300 5.946781
350 6.331396
400 6.488091
450 6.488091
500 6.302906
550 5.975271
600 5.419715
650 4.693219
700 3.753048
750 2.727407
800 1.359886
850 0
900 0
950 0.790085
1000 2.869858
1050 4.650484
1100 5.975271
1150 6.872707
1200 7.456752
1250 7.527977
1300 7.357037
1350 6.687521
1400 5.619145
1450 4.251624
1500 2.556467
1550 0.32
1600 0];
X = XY(:,1);
Y = XY(:,2);
% Fit a smoothing spline (requires the Curve Fitting Toolbox)
spline_fit = fit(X, Y, 'smoothingspline');
% Generate a finer grid of X values for plotting
X_fit = linspace(min(X), max(X), 100);
% Evaluate the spline at the finer grid of X values
Y_fit = feval(spline_fit, X_fit);
% Plot the original data and the fitted curve
plot(X, Y, 'o', X_fit, Y_fit);
xlabel('X');
ylabel('Y');
legend('Data', 'Fitted Curve');
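One way to look at the difference numerically: "spline" interpolates, so it passes through every data point, while a smoothing spline trades closeness to the data for smoothness through a smoothing parameter p between 0 and 1 (p = 1 gives an interpolating cubic spline, p = 0 a least-squares straight line). The sketch below compares residuals at the data points. The value 0.999 is an arbitrary example, and the code assumes the Curve Fitting Toolbox is installed.

```matlab
X = (0:50:1600)';
Y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

% Interpolating spline (base MATLAB): residuals at the data are zero
ppInterp  = spline(X, Y);
resInterp = max(abs(ppval(ppInterp, X) - Y));

% Smoothing spline with the default smoothing parameter (toolbox)
[fDefault, ~, outDefault] = fit(X, Y, 'smoothingspline');
resDefault = max(abs(fDefault(X) - Y));

% Smoothing spline with p close to 1 (example value), i.e. little smoothing
fTight   = fit(X, Y, 'smoothingspline', 'SmoothingParam', 0.999);
resTight = max(abs(fTight(X) - Y));

fprintf('spline:                    max residual = %.3g\n', resInterp);
fprintf('smoothingspline (p = %.3g): max residual = %.3g\n', outDefault.p, resDefault);
fprintf('smoothingspline (p = 0.999): max residual = %.3g\n', resTight);
```

Whether the smoothed curve or the interpolating one is preferable depends on how much of the wiggle in the data is believed to be noise.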


akshatsood on 15 Jan 2024
Edited: akshatsood on 15 Jan 2024
I understand you have a dataset containing X and Y values and you seek a method to deduce a mathematical equation that best represents the relationship between these two variables. Below, I put forward two approaches that would serve as a starting point in achieving the desired equation.
Approach 1 : leveraging the "polyfit" function
To model the relationship between X and Y under the assumption that it is polynomial, you can use the "polyfit" function to determine the coefficients of the best-fit polynomial. In this approach, you need to experiment with the degree of the polynomial to find the one that best captures the relationship between the variables.
Approach 2 : using Curve Fitting Toolbox
In case you have the Curve Fitting Toolbox app, you can get access to a more interactive approach to experiment with various types of fits including linear, polynomial, exponential, etc. and decide upon the optimal choice that associates the variables X and Y.
I hope this helps.
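As a rough sketch of Approach 1, the loop below fits polynomials of a few degrees with "polyfit" and reports the RMSE of each. The degrees tried are arbitrary examples. The third output `mu` makes "polyfit" center and scale x, which helps with conditioning; the coefficients in `p` then apply to the scaled variable (x - mu(1))/mu(2), not to x itself.

```matlab
x = (0:50:1600)';
y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

degrees   = [3 5 7 9];              % example degrees to try
rmseByDeg = zeros(size(degrees));

for k = 1:numel(degrees)
    % Fit with centering and scaling: mu = [mean(x); std(x)]
    [p, ~, mu] = polyfit(x, y, degrees(k));

    % Evaluate at the data points using the same centering and scaling
    yHat = polyval(p, x, [], mu);

    rmseByDeg(k) = sqrt(mean((y - yHat).^2));
    fprintf('degree %d: RMSE = %.4f\n', degrees(k), rmseByDeg(k));
end
```

A lower RMSE at a higher degree only says the curve follows these 33 points more closely; it says nothing about how the polynomial behaves between or beyond them.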
akshatsood on 15 Jan 2024
Edited: akshatsood on 15 Jan 2024
Have a look at the following code snippet which demonstrates applying a fit to the data provided by you. The equation for the fit and the corresponding coefficients can be found in the variable "y_fit".
% original data (X and Y)
x = [0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, ...
850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, ...
1550, 1600];
y_org = [0, 1.202051, 2.412308, 3.643077, 4.721709, 5.34849, 5.946781, 6.331396, ...
6.488091, 6.488091, 6.302906, 5.975271, 5.419715, 4.693219, 3.753048, 2.727407, ...
1.359886, 0, 0, 0.790085, 2.869858, 4.650484, 5.975271, 6.872707, 7.456752, ...
7.527977, 7.357037, 6.687521, 5.619145, 4.251624, 2.556467, 0.32, 0];
% applying fit using a polynomial of degree 8
y_fit = fit(x(:), y_org(:), 'poly8');
% Output: Warning: Equation is badly conditioned. Remove repeated data points or try centering and scaling.
% visualize original data and fit data
plot(y_fit, x, y_org);
I would recommend experimenting with different fit types to arrive at an optimal fit as per your requirements.
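To actually read the equation out of the fit object, the Curve Fitting Toolbox functions "formula", "coeffnames" and "coeffvalues" can be used. The sketch below also turns on the 'Normalize' option, which centers and scales x and is one way to respond to a "badly conditioned" warning. Note the assumption this brings with it: the printed coefficients then apply to the normalized variable (x minus its mean, divided by its standard deviation), not to raw x. The names `yFit`, `xq` and `yq` are illustrative.

```matlab
% Requires the Curve Fitting Toolbox
x = (0:50:1600)';
y = [0; 1.202051; 2.412308; 3.643077; 4.721709; 5.34849; 5.946781; ...
     6.331396; 6.488091; 6.488091; 6.302906; 5.975271; 5.419715; ...
     4.693219; 3.753048; 2.727407; 1.359886; 0; 0; 0.790085; 2.869858; ...
     4.650484; 5.975271; 6.872707; 7.456752; 7.527977; 7.357037; ...
     6.687521; 5.619145; 4.251624; 2.556467; 0.32; 0];

% Degree 8 polynomial with x centered and scaled before fitting
yFit = fit(x, y, 'poly8', 'Normalize', 'on');

% Symbolic form of the model, e.g. p1*x^8 + p2*x^7 + ...
disp(formula(yFit));

% Coefficient names and values, in matching order
names = coeffnames(yFit);
vals  = coeffvalues(yFit);
for k = 1:numel(names)
    fprintf('%s = %.6g\n', names{k}, vals(k));
end

% Evaluate the fit at a different spacing inside the data range
xq = (0:25:1600)';
yq = yFit(xq);
```

Calling the fit object as `yFit(xq)` handles the normalization internally, so for getting values at new x positions there is no need to type the equation out by hand.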
Chaudhary P Patel on 15 Jan 2024
Moved: Dyuman Joshi on 15 Jan 2024
Sir, it does not go beyond poly9.
Sir, but I am not getting the equation for it.
My main objective is to get the values at a different interval, which is possible once I have the equation of the curve.
