Optimization using lsqnonlin on very distinct data sets that depend on the same variables

Niels

20 Oct 2016

0 Answers

Updated 24 Oct 2016

I am working with some large data sets (N rows of data, each consisting of M points, with one parameter varied from row to row), and I assume that a single function can accurately describe every one of these rows. This function depends on P fit parameters plus the one parameter that I vary.

Now, M is very large, and I cannot afford to run my fitting routine on all N rows of data. Fortunately, my fitting function can be integrated, so I can instead fit the much smaller single-row data set of just N points, one integrated value per row.

Fitting the integrated quantities is fast and gives physically realistic values for my P fit parameters. However, when I then plug these fit parameters into my original function and compare it to one of the N rows of M points, the result can be way off.

So what I now want is a routine that considers, e.g., 2 of my N rows together with the integrated data. I tried simply concatenating everything, but the values and the numbers of points differ significantly, and in the end I get results similar to those from fitting just a single row of M points, at the cost of a slower routine.

How can I set up this combined fitting routine so that every data set is equally important, regardless of the big difference between N and M?

4 Comments

Niels on 21 Oct 2016

Hi Matt,

My model is basically of the form y( t, k ) = f( t, k, p_0,...,p_n ).

As t is a very large array, it is computationally very expensive to apply lsqcurvefit to all k:

 k = 1;
 p_out = lsqcurvefit( @(p,t) f(p,t,k), p_init, t, y, p_min, p_max, fit_opts);

The above may already take up to 5 minutes to get a decent fit. Solving for all k at once is something my computer does not like at all.

So instead I am using something like

p_out = lsqcurvefit( @F, p_init, k, Y, p_min, p_max, fit_opts);

where Y( k ) is simply y (numerically) integrated over t, modeled as Y( k ) = F( k, p_0,...,p_n ), and F is the integral of f over t from t_0 to infinity.
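The numerical integration of y over t could be done with trapz, for example. This is only a minimal sketch: it assumes y is stored as an M-by-K matrix with one column per k, and the Gaussian signal is a made-up stand-in for the real data. The sampled range of t also stands in for the upper limit at infinity.

```matlab
% Hypothetical stand-in data: one column of y per value of k.
t = linspace(0, 10, 2001).';               % M-by-1 sample points (need not be uniform)
k = 1:3;                                   % 1-by-K values of the varied parameter
y = exp(-bsxfun(@minus, t, k).^2);         % M-by-K toy signal, not the real model

% Integrate every column over t with the trapezoidal rule.
Y = trapz(t, y, 1);                        % 1-by-K, one integrated value per k
```

Because trapz takes the sample points explicitly, the same call should also work for a non-uniform t.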

Now I want to get to some intermediate form where I fit Y( k ) vs F( k ), then plug my p_out into f for e.g. k=1 and k=20 and compare these results to y. I can do both of these separately, but because length(k) << length(t), I am stuck at getting my p values to converge to something that gives equally good results for Y vs F, for y(t,k=1) vs f(t,k=1), and for y(t,k=20) vs f(t,k=20).
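One possible way to balance the three comparisons is to pass lsqnonlin a stacked residual vector in which each block is divided by the norm of its own data, so that each block contributes a relative squared error regardless of its length or magnitude. The sketch below is an illustration only: fModel, FModel, pTrue and the synthetic data are hypothetical stand-ins (a single Gaussian term with t_0 = 0), not the actual model. The normalization is one choice among several; dividing by sqrt(numel) times an estimated noise level would be another.

```matlab
% Toy stand-ins (assumptions, not the real model): a single Gaussian term
% f(t,k,p) and its analytic integral over t from 0 to infinity.
fModel = @(p, t, k) p(2) * exp(-((t - k) ./ p(1)).^2);
FModel = @(p, k) p(2) * p(1) * sqrt(pi) / 2 * (1 + erf(k ./ p(1)));

% Synthetic data generated from known parameters.
rng(0);
pTrue = [1.5, 2.0];
t     = linspace(0, 40, 4001).';           % long axis, M points
kAll  = 1:20;                              % short axis, N points
kSel  = [1, 20];                           % the two rows fitted in full
Y     = FModel(pTrue, kAll);               % 1-by-N integrated data
y1    = fModel(pTrue, t, kSel(1)) + 0.05 * randn(size(t));
y2    = fModel(pTrue, t, kSel(2)) + 0.05 * randn(size(t));

% Stack the three residual blocks, each divided by the norm of its own data.
% Every block then contributes a relative squared error to the objective,
% whatever its length or magnitude.
resid = @(p) [ (FModel(p, kAll).' - Y.') / norm(Y);
               (fModel(p, t, kSel(1)) - y1) / norm(y1);
               (fModel(p, t, kSel(2)) - y2) / norm(y2) ];

opts = optimoptions('lsqnonlin', 'Display', 'off');
pFit = lsqnonlin(resid, [1, 1], [0.1, 0.1], [10, 10], opts);
```

If one block should count for more than the others, an extra factor on that block's residuals acts as the square root of its weight in the summed objective.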

Matt J on 21 Oct 2016

Yes, but this is basically a restatement of your original question. What does f(t,k,p0,..pn) look like, and how large are length(t) and length(k)? Without knowing that, we have no way of making informed recommendations.

Niels on 24 Oct 2016

My apologies. Hopefully the following will give you a better idea:

length(t) can be anything between 5e4 and 5e6, and length(k) varies roughly from 10 to 50. The y(t) I fit against can be quite noisy, but the integrated quantities do not suffer from that and reproduce very well if I do not vary my k.

In f(t,k,p_0,...,p_n), I basically take a summation of N individual contributions in a form similar to the snippet below:

function y = f(p,t,k)
% ... Some input checking to make sure all the inputs
%     are fed with the correct dimensions ...
% size(t) = [M,1];
% size(p) = [1,2*N+1];
% k is a scalar

% Split up the input parameters
N  = (length(p)-1)/2;   % number of individual contributions
N1 = 1:N;
a  = p( 0*N + N1 );     % 1-by-N
b  = p( 1*N + N1 );     % 1-by-N
c  = p( end );          % scalar

% Vectorized calculation of the output using simple matrix multiplications:
% [M,1]*[1,N] gives an [M,N] matrix of terms, which b.' sums into [M,1].
y = exp( -( (t-k) * (1./(a+N1*c)) ).^2 ) * b.';

In reality, f is more complicated with a significantly larger number of input parameters, but it is simple enough to be able to expand and perform summation with just some permutations and an occasional bsxfun. F(k,p_0,...,p_n) is simply the definite integral of f w.r.t. t from g(k) to infinity.
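For the simplified snippet only (not the real, more complicated f), each Gaussian term has a closed-form integral: with width w_n = a_n + n*c, the integral of b_n*exp(-((t-k)/w_n)^2) over t from g to infinity is b_n*w_n*sqrt(pi)/2*erfc((g-k)/w_n). The sketch below checks this against integral(). The parameter values and the lower limit g are hypothetical, since g(k) is not given in the post.

```matlab
% Parameters laid out as in the posted snippet: p = [a(1:N), b(1:N), c].
p = [1.0, 2.0, 0.5, 0.3, 0.1];        % hypothetical values, N = 2
k = 3;                                % scalar k
g = 0;                                % assumed lower limit g(k); not given in the post

N = (numel(p) - 1) / 2;
n = 1:N;
a = p(n);
b = p(N + n);
c = p(end);
w = a + n * c;                        % widths of the N Gaussian terms

% Closed form: each term integrates to b*w*sqrt(pi)/2*erfc((g-k)/w).
F_closed = sum(b .* w .* sqrt(pi) / 2 .* erfc((g - k) ./ w));

% Numerical check; integral() passes t as a row vector and expects
% an output of the same size.
fRow  = @(t) (exp(-((t(:) - k) * (1 ./ w)).^2) * b.').';
F_num = integral(fRow, g, Inf);
```

Where a closed form like this exists, it avoids numerical quadrature inside the fitting loop for the integrated block.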
