MATLAB Answers

integer parameter in lsqcurvefit

gujax on 13 Jan 2013
Hi, I am perhaps trying something foolish, but let me pose the question anyway. I am trying to fit data whose model is a sum of an unknown number of Gaussians. I would like to start with one Gaussian term and increase the number of terms until lsqcurvefit gives the best fit, i.e., let lsqcurvefit decide the number of terms for me and then fit their parameters too.
I do the following:
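p0 = [n zeros(1,100)];   % p0(1) = number of terms; the rest are the Gaussian parameters
G = lsqcurvefit(@FUN, p0, x, y);
function F = FUN(p0, x)
for i = 1:round(p0(1))
    % build the expression string: f = [f 'f1(i)+f2(i)...']
end
F = eval(f);   % evaluate the assembled string expression
end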
Note that I have to provide a large number of zero initial parameters even though the actual number of parameters may be much smaller, i.e., 20 or so (3 parameters per Gaussian term for, say, 7 Gaussian terms). I realize I cannot vary the number of initial parameters when it is lsqcurvefit that decides the number of terms. So I am helpless here, but it works so far.
I expect lsqcurvefit to search for the most effective 'n', i.e., the number of Gaussian terms that describe my data, and then fit to get the 20 or so parameters even though their starting values are initialized to zero.
However, I see that 'n' always remains at 1, or at whatever lower bound I give that parameter. I realize that lsqcurvefit may not be searching the integer parameter space. Moreover, I am not sure how xtol would behave now. I am quite perplexed.
I could compare the residual after each term and use my own logic to arrive at the number of terms rather than let lsqcurvefit do it for me. But I was looking for an easier route, and I wasn't sure whether my brute-force idea is good or efficient.
Any suggestions here would be welcome. Thank you.


Accepted Answer

Matt J on 13 Jan 2013
Edited: Matt J on 13 Jan 2013
Only solvers in the Global Optimization Toolbox have discrete parameter estimation capabilities.
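One way to see why 'n' never moves: lsqcurvefit estimates derivatives by finite differences with a very small step, and a parameter that is passed through round() does not change the model output under such a step. The sketch below illustrates this with a hypothetical stand-in model (unit Gaussians centered at 1, 2, 3, ...), not the asker's FUN. The step size shown is only an assumed typical value, and the implicit expansion used here needs R2016b or later.

```matlab
% Hypothetical model: sum of round(p(1)) unit-width Gaussians centered at 1, 2, 3, ...
model = @(p, x) sum(exp(-(x - (1:round(p(1)))').^2 / 2), 1);

x = linspace(0, 10, 50);   % row vector of sample points
p = 3;                     % current value of the "number of terms" parameter
h = sqrt(eps) * abs(p);    % assumed typical forward-difference step

% Forward-difference estimate of d(model)/d(p(1))
dFdn = (model(p + h, x) - model(p, x)) / h;

% round(p + h) equals round(p), so both model evaluations are identical
fprintf('largest |dF/dn| = %g\n', max(abs(dFdn)));
```

Because the estimated derivative with respect to 'n' is zero everywhere, a gradient-based solver sees no reason to change it, which matches the behavior described in the question.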
Why not express FUN in terms of a fixed n and repeat the estimation, i.e., the calls to lsqcurvefit, within a loop over n until the estimate satisfies a performance criterion that you like?
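A minimal sketch of that loop follows, assuming each Gaussian has three parameters (amplitude, center, width). The synthetic data, the helper name gaussSum, the initial guesses, the limit nMax and the 5% improvement criterion are all illustrative assumptions, not part of the original thread. It requires the Optimization Toolbox, and local functions in a script need R2016b or later.

```matlab
% Synthetic data for illustration only: two Gaussians plus a little noise
rng(0);
x = linspace(0, 10, 200);
y = 2*exp(-(x-3).^2/(2*0.5^2)) + 1.5*exp(-(x-7).^2/(2*0.8^2)) + 0.02*randn(size(x));

nMax   = 6;      % assumed upper limit on the number of terms
relTol = 0.05;   % assumed criterion: stop when resnorm improves by less than 5%
opts   = optimoptions('lsqcurvefit', 'Display', 'off');

prevRes = Inf;
best = struct('n', 0, 'p', [], 'resnorm', Inf);
for n = 1:nMax
    % Crude initial guess: unit amplitudes, centers spread over x, equal widths
    c  = linspace(min(x), max(x), n + 2);
    p0 = [ones(n,1), c(2:end-1)', repmat((max(x) - min(x))/(2*n), n, 1)];
    % Lower bounds keep amplitudes nonnegative and widths away from zero
    lb = [zeros(n,1), repmat(min(x), n, 1), repmat(1e-3, n, 1)];
    [p, resnorm] = lsqcurvefit(@gaussSum, p0, x, y, lb, [], opts);
    if resnorm > (1 - relTol) * prevRes
        break   % the extra term did not help enough; keep the previous fit
    end
    best = struct('n', n, 'p', p, 'resnorm', resnorm);
    prevRes = resnorm;
end
fprintf('selected n = %d, resnorm = %g\n', best.n, best.resnorm);

function f = gaussSum(p, x)
% p is n-by-3; each row holds [amplitude, center, width] of one Gaussian
f = zeros(size(x));
for k = 1:size(p, 1)
    f = f + p(k,1) * exp(-(x - p(k,2)).^2 / (2 * p(k,3)^2));
end
end
```

Since n is fixed inside each call, the parameter vector has exactly 3n entries and no padding with zeros is needed. The stopping rule is only one possible performance criterion; with noisy data an information criterion or a held-out residual may be a better choice, and a poor initial guess can still land in a local minimum.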

  3 Comments

Matt J on 13 Jan 2013
Also, are there any other unknown parameters? If not, this is a linear estimation problem. Just make a design matrix whose columns are your 20-30 possible Gaussians and solve the linear system. If any of the Gaussians are unneeded, the linear solver will just return zero/negligible coefficients for those Gaussians. Select the non-negligible coefficients by thresholding.
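The design-matrix idea can be sketched as below. This only applies when the centers and widths are fixed in advance, so that the amplitudes are the only unknowns. The synthetic data, the common width sigma, the grid of candidate centers and the 5% threshold are illustrative assumptions. The direct reading of "solve the linear system" would be `A\y`; lsqnonneg (base MATLAB) is substituted here on the assumption that amplitudes cannot be negative. Implicit expansion needs R2016b or later.

```matlab
% Synthetic data for illustration only: two Gaussians of known, common width
x     = linspace(0, 10, 200)';     % column vector of sample points
sigma = 0.5;                       % assumed known width
y     = 2*exp(-(x-3).^2/(2*sigma^2)) + 1.5*exp(-(x-7).^2/(2*sigma^2));

% One column per candidate Gaussian, on an assumed grid of centers
centers = 0:0.5:10;                               % 21 candidates
A = exp(-(x - centers).^2 / (2*sigma^2));         % 200-by-21 design matrix

% Nonnegative least squares for the amplitudes (swap in A\y for plain least squares)
c = lsqnonneg(A, y);

% Keep only coefficients above an assumed threshold of 5% of the largest one
keep = c > 0.05 * max(c);
disp([centers(keep)', c(keep)]);   % columns: retained center, amplitude
```

The retained centers and amplitudes can then serve as a count of terms and as starting values for a nonlinear fit in which centers and widths are also free.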
gujax on 13 Jan 2013
Thanks Matt. I am working on a dataset which has an unknown number of terms; I just gave 20 as an example. The number of parameters per term will be the same, though the parameter values are unknown. I have to fit the data to get the parameters. I agree that, to begin with, I could try a large number of terms, and lsqcurvefit should give zeros for the terms that are not required. The data is also noisy. Therefore, why burden the algorithm with so many terms when there may not be so many for certain experiments. But I will think about your suggestions and see how I can implement them.
My data arises from a distribution of particle sizes, and I am trying to recover this size distribution. It changes with time. Initially there are one or two distributions, and then they break into fragments to form multiple Gaussian distributions of sizes. How many there are is what I am trying to find out at discrete time steps. Time steps could be large, e.g., days. The unknown parameters are size, width and total particle number for each Gaussian.
Matt J on 13 Jan 2013
"Therefore, why burden the algorithm with so many terms when there may not be so many for certain experiments."
I still don't have a clear picture of what other parameters, if any, there are in your system. However, if the Gaussians are all of the same known width, you may as well view ydata as the output of a Gaussian filter. The inverse filter can be implemented in closed form, with something more efficient than lsqcurvefit.
If nothing else, this would at least give you an approximate initial solution, which would help you decide which Gaussians make an important contribution and which can be discarded. It would also give you an initial guess to use in LSQCURVEFIT.
