The difference between SSE and MAE in curve fitting and optimization

I am trying to fit several series of data, taken at different frequencies, to a custom model, but I don't get a reasonable result or fit for some of the data series. The data come from a nonlinear system with multiple distinct peaks. The fitting function is essentially a fraction with a product of two exponential functions in both the numerator and the denominator. One of the two exponentials is an exponential of a sum of Gaussian functions, whose fitting parameters are the amplitude and width of each Gaussian and the spacing between the Gaussians; the last fitting parameter is the argument of the other exponential.
I have tried the MATLAB Curve Fitting Toolbox (with both the Trust-Region and Levenberg-Marquardt algorithms) and the Optimization Toolbox (including MultiStart and GlobalSearch), where I tried to minimize the SSE between the actual data and the value predicted by the model. But none of them helped.
I am wondering: is minimizing SSE a good measure for optimization in my case? Also, in the Curve Fitting Toolbox, SSE is one of the goodness-of-fit measures.
I know of another measure, MAE (mean absolute error), but I am not sure how it is properly defined, or whether minimizing it would be more helpful than minimizing SSE, or whether I would essentially see the same behavior that I see with SSE.
I appreciate your comments and suggestions!

Accepted Answer

In most curve-fitting problems, you will not see a big difference when you minimize SSE versus minimizing the sum of absolute errors (SAE). Minimizing SSE has the nice property that, if the errors are normally distributed, it produces a maximum likelihood estimate, which is nice from a statistical viewpoint. Minimizing SAE has the advantage that it is more robust, that is, it is less skewed by outliers.
You can minimize SAE by defining a function that returns the SAE, then using fmincon() to find the parameters that minimize the SAE function.
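The robustness difference is easy to demonstrate outside MATLAB as well. Below is a small Python/SciPy sketch of the same idea (a toy example of my own, not the attached script): a straight line is fitted to data containing one large outlier, once by minimizing SSE and once by minimizing SAE. The model, parameter names, and data are all made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
a_true, b_true = 2.0, -1.0
y = a_true * x + b_true + rng.normal(0, 0.2, x.size)
y[5] += 20.0  # inject one large outlier

def sse(p):
    # Sum of squared errors between data and the line p[0]*x + p[1]
    return np.sum((y - (p[0] * x + p[1])) ** 2)

def sae(p):
    # Sum of absolute errors between data and the line p[0]*x + p[1]
    return np.sum(np.abs(y - (p[0] * x + p[1])))

p_sse = minimize(sse, [1.0, 0.0], method="Nelder-Mead").x
p_sae = minimize(sae, [1.0, 0.0], method="Nelder-Mead").x
print("SSE fit:", p_sse)  # dragged noticeably toward the outlier
print("SAE fit:", p_sae)  # stays closer to the true (2, -1)
```

The SSE fit is pulled toward the outlier, while the SAE fit stays near the true parameters, which is the robustness advantage described above.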
I am attaching an example script that fits three parameters by minimizing SSE and by minimizing SAE. The script includes an SSE function and an SAE function. fmincon() is used twice: first to find the parameters that minimize SSE, then to find the parameters that minimize SAE. Both sets of fitted parameters are displayed, along with the true parameter values. Here is an example of the console output produced by the script:
>> fitdata1Dexample
Fitting results:
Minimize SSE: Fitted a,b,c = -1.904 5.036 0.424
Minimize SAE: Fitted a,b,c = -1.855 5.043 0.373
True values: a,b,c = -2.000 5.000 0.500
You will get different results each time you run it, since the random noise is different every time.
The function to compute SAE is
%% SAE function
function SAE = sumabserr1D(params)
%SUMABSERR1D Sum of absolute error between data and model prediction
% y=vector of data
% a,b,c = model parameters
% x = locations at which the function model1D() should be evaluated
% model1D() = function describing the model
% Function sumabserr1D() is to be minimized in a model fitting routine
% WCRose 2022-05-08
global x y;
a=params(1);
b=params(2);
c=params(3);
SAE=sum(abs(y-model1D(x,a,b,c)));
end
where model1D() is defined in the script.
Good luck.

8 Comments

@Shaily_T, you said the data you are fitting are taken at different frequencies and that there are multiple distinct peaks. This makes me wonder if the data you are fitting are values from a power spectrum. If you are fitting values from a power spectrum, you might be interested to know that, if the noise in the time domain is normally distributed, then the power spectrum values have a chi-squared distribution.
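For what it's worth, this chi-squared behavior is easy to check numerically. The sketch below (a toy Python/NumPy example of my own, not from the thread) computes the periodogram ordinate of unit-variance Gaussian white noise at one interior frequency bin over many trials. Scaled by 1/n, that ordinate behaves like a chi-squared variable with 2 degrees of freedom scaled to mean 1 (i.e. an exponential), so its sample mean and variance should both be near 1.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 256, 4000
vals = np.empty(trials)
for t in range(trials):
    x = rng.normal(0, 1, n)          # Gaussian white noise, unit variance
    X = np.fft.rfft(x)
    vals[t] = np.abs(X[10]) ** 2 / n  # periodogram ordinate at interior bin 10

# For a chi-squared(2) variable scaled to mean 1 (an exponential),
# the variance equals the squared mean, so both should be near 1.
print(vals.mean(), vals.var())
```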
Also, I have found, when fitting a peaky experimental spectrum (force generated by subjects who have tremor), that a Lorentzian function fits the spectral peak better than a Gaussian. The Lorentzian is also known as the Cauchy distribution when used in statistics. In many kinds of spectroscopy, the theoretically expected shape of a spectral peak is Lorentzian (here).
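To make the comparison concrete, here is a small Python/NumPy sketch of the two peak shapes (function names and parameterizations of my own choosing); the key practical difference is the Lorentzian's much heavier tails:

```python
import numpy as np

def gaussian(x, A, x0, sigma):
    # Gaussian peak: amplitude A, centre x0, width sigma
    return A * np.exp(-((x - x0) ** 2) / (2 * sigma ** 2))

def lorentzian(x, A, x0, gamma):
    # Lorentzian (Cauchy) peak: amplitude A, centre x0, half-width gamma
    return A * gamma ** 2 / ((x - x0) ** 2 + gamma ** 2)

x = np.linspace(-10, 10, 2001)
g = gaussian(x, 1.0, 0.0, 1.0)
l = lorentzian(x, 1.0, 0.0, 1.0)
# Far from the centre the Gaussian is essentially zero,
# while the Lorentzian decays only like 1/x^2.
print(g[-1], l[-1])
```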
Thank you so much for your explanation and the attached code! So, if I want MAE, it is the same as SAE except that I should use the mean instead of the sum. Is that correct?
About your second comment actually I am not sure. I can elaborate more on what I have.
I have 15 series of experimental data taken at different frequencies. I am trying to fit each series, taken over a different frequency range, to a function and obtain the values of the fitting parameters (I have 4 fitting parameters), then use those values in another equation to calculate the efficiency of the system and compare it with the experimental efficiency. The problem is that for some of these data series I don't obtain a good fit, so the calculated efficiency is far from the experiment. One problem, I think, is that the fitted curve with the better goodness-of-fit measure (i.e., lower SSE) is not the best fit in terms of what I see visually or in terms of the calculated efficiency. I have also tried the Optimization Toolbox algorithms and solvers, where I likewise minimized the SSE function, but it didn't help. So I was thinking maybe I should minimize something else (e.g., MAE or SAE) instead of SSE. But from your example and explanation, it seems it will not make a huge difference.
I have attached two fitted curves to the same data with different values of SSE. From the obtained efficiency and what I see I think the one with worse SSE is a better fit.
Thanks for your time!
Yes, use mean instead of sum, if you prefer MAE. Changing from SAE to MAE will not affect the fit, because the parameters that minimize SAE will also minimize MAE, and vice versa.
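Since MAE is just SAE divided by the fixed number of data points, the two objectives differ by a constant positive factor and therefore share the same minimizer. A quick numerical check of that claim (a toy Python example of my own, fitting a single location parameter to made-up data):

```python
import numpy as np
from scipy.optimize import minimize_scalar

y = np.array([1.0, 2.0, 2.5, 4.0, 100.0])  # toy data with one outlier

sae = lambda c: np.sum(np.abs(y - c))   # sum of absolute errors
mae = lambda c: np.mean(np.abs(y - c))  # mean absolute error = sae / len(y)

c_sae = minimize_scalar(sae, bounds=(0, 100), method="bounded").x
c_mae = minimize_scalar(mae, bounds=(0, 100), method="bounded").x
# Both objectives are minimized at the same point (the median of y),
# even though their minimum *values* differ by the factor len(y).
print(c_sae, c_mae)
```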
It is interesting and a bit disappointing that the parameters that give the best fit lead to efficiency predictions that are quite different from the experimental observations.
Be careful when evaluating a best fit by eyeballing the curves ("chi-by-eye"). I agree that the plot labelled "lower SSE" looks like a worse fit than the plot labelled "bigger SSE" - which is the opposite of what we expect. But remember that what is being fitted (I assume) is the vertical distance between each yellow point and each corresponding blue point. And it is impossible to tell, by looking at these figures, whether the sum of the squared distances between points is worse in the first or second plot. The slight sideways offset between the curves in the "bigger SSE" plot produces extremely large errors in the vertical direction, but our eye sees the curves as close together, because they are so close horizontally. So we cannot trust chi-by-eye.
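This effect is easy to demonstrate numerically. In the Python/NumPy sketch below (a toy narrow Gaussian peak of my own, not the actual data), a curve shifted slightly sideways "looks" close to the data but has a far larger SSE than a curve at the right location with the wrong height:

```python
import numpy as np

x = np.linspace(0, 10, 500)
# Narrow Gaussian peak (width 0.2) centred at x0
peak = lambda x0: np.exp(-((x - x0) ** 2) / (2 * 0.2 ** 2))

y_data = peak(5.0)
y_shifted = peak(5.3)       # visually "close": shifted sideways by 0.3
y_scaled = 0.8 * peak(5.0)  # visually "worse": right place, wrong height

sse = lambda yhat: np.sum((y_data - yhat) ** 2)
# The sideways shift dominates: vertical errors near the peak are huge.
print(sse(y_shifted), sse(y_scaled))
```

Because the peak is narrow, a small horizontal offset makes the vertical residuals near the peak enormous, even though the eye judges the curves to be nearly on top of each other.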
Thanks for your response!
Indeed, the interesting and weird part is that it works for some of the frequencies, where I get reasonable fits, but it does not work for some other frequencies.
So, even when the fit looks better for the bigger SSE, I should still rely on SSE. Is that correct? Then I should find the reason why the parameters that give the best fit lead to efficiency predictions that are quite different from the experimental observations.
What I have noticed in my fits is that I get inconsistent results (including this SSE discrepancy) for some of the data series. I tried the 'Levenberg-Marquardt' algorithm in the Curve Fitting Toolbox, but the result changes every time I change the start points. Also, I have 4 fitting parameters, and every time it automatically holds two of them fixed at the start point and only fits the other two (this is not too bad, because the two fixed parameters are more or less predictable from the fits). So I thought it would be helpful to see the results of GlobalSearch and MultiStart in the Optimization Toolbox, and I tried them with different solvers and algorithms. However, even with MultiStart and GlobalSearch, I get results close or identical to those of the local solvers. One suggestion was to do a grid search for the global minimum, so first I tried to plot the function I want to optimize versus two of my fitting parameters (with the other two held constant) to get a sense of what is going on. I have attached the resulting surface plot (SSE is the function to be minimized; a, d, and sigma are the parameters I want to obtain by minimization). Do you think a grid search would be helpful in my case?
I understand this is another question in itself, but I thought it would be nice to know your thoughts on it.
I appreciate your time and comments!
[edited 5/23/2022: Corrected a typo in the code: replaced "~sumsqerr" with "@sumsqerr"]
Levenberg-Marquardt is a method for finding the minimum. Ideally, it will not change the minimum that is found; it may just get there in a more or less efficient way. I wrote my own L-M routine in Pascal, based on the Numerical Recipes book, in the 80s. I suspect that MATLAB's fmincon() is better at incorporating constraints of various types; constraints may be hard to implement in standard Levenberg-Marquardt. MATLAB's fmincon() probably has other features that L-M may lack.
You are wise to be concerned about getting stuck in a local minimum. When I do multidimensional fitting, I always start from multiple points in the N-dimensional parameter space, to improve the odds that I find the true global minimum and reduce the odds of ending up in a local minimum that is not the global minimum. I would try 2^N or 3^N starting points, chosen near the corners (if 2^N) or near-corners plus midpoints (if 3^N) of the N-dimensional hypercube of parameter space. With N=4 and 3^N points, this means 81 starting points. Do the fit 81 times and remember the output of each trial. The best fit is the one with the lowest MAE or SSE or whatever you are minimizing.
Example with N=4. I am fitting parameters a, b, c, d. The bounds are [amin,amax], [bmin,bmax], etc. I want to start at points that are at 10%, 50%, and 90% of each range:
N=4;
amin=0; amax=1; bmin=-2; bmax=2; cmin=0; cmax=10; dmin=-5; dmax=5;
a0=amin+[.1,.5,.9]*(amax-amin);
b0=bmin+[.1,.5,.9]*(bmax-bmin);
c0=cmin+[.1,.5,.9]*(cmax-cmin);
d0=dmin+[.1,.5,.9]*(dmax-dmin);
p0=zeros(3^N,N); %array for initial guesses
for i=1:3
    for j=1:3
        for k=1:3
            for m=1:3
                p0(m+3*(k-1)+9*(j-1)+27*(i-1),:)=[a0(i),b0(j),c0(k),d0(m)];
            end
        end
    end
end
The code above creates an 81x4 array. Each of the 81 rows is a different starting point.
Display the first 3 and last 3 starting points:
disp(p0(1:3,:)); disp(p0(79:81,:))
    0.1000   -1.6000    1.0000   -4.0000
    0.1000   -1.6000    1.0000         0
    0.1000   -1.6000    1.0000    4.0000
    0.9000    1.6000    9.0000   -4.0000
    0.9000    1.6000    9.0000         0
    0.9000    1.6000    9.0000    4.0000
In this example, you would call fmincon() 81 times, using a different row of p0 as the start point each time.
p=zeros(3^N,N); %allocate array for best-fit parameters from each trial
sse=zeros(3^N,1); %array for sum squared error from each trial
for i=1:3^N
[p(i,:),sse(i)]=fmincon(@sumsqerr,p0(i,:),[],[],[],[],[amin,bmin,cmin,dmin],[amax,bmax,cmax,dmax]);
end
[ssebest,ibest]=min(sse);
fprintf('Best (lowest) SSE=%.3f\n',ssebest)
fprintf('Best parameters: %.3f, %.3f, %.3f, %.3f\n',p(ibest,:))
where sumsqerr() is a function written by you that computes the quantity which you want to minimize.
Try something like that.
Shaily_T on 11 May 2022
Edited: Shaily_T on 11 May 2022
I will try that!
So, I don't need to use MultiStart or GlobalSearch here; I should just use fmincon() with these several start points. Is that correct?
Thank you so much for your time, explanation and sharing this code! I appreciate it.


More Answers (1)

I think using MSE or SSE would tend to pull the fit toward real outliers, to get closer to them, while MAE (mean absolute error or median absolute error) would tend to fit better overall but may be way off at the outlier points. This is because when you square the differences, outliers far from the fit have much greater influence. But I could be wrong about that. If you don't have any really bad outliers, the two fits may be very close to each other.
