Nonlinear fit to multiple data sets with shared parameters

Hello all,
I need to fit a nonlinear model to several data sets simultaneously. The model has the same functional form for all sets, and the values of some model parameters are the same for all sets, but the value of at least one parameter is different for each set. For example, three exponential decay curves might have the same decay constant but a different amplitude for each data set. The global fit to all three curves would produce one decay constant and three amplitudes.
What is the best way to do this in MatLab? (Apologizing in advance for a novice question!)
ELELAB

21 Commenti

Hello all, I have similar problem. So i came across this trend. I tried replicate the two answers for @Tom Lane and @Mikael Afzelius, but i get the same error from implement the answers.
I have tried the functions in the inbuilt matlab single fitting tool and it works. So i am wondering why the functions is a problem here. I will appreciate any help. Thanks
The error is below
"Error using nlinfit>checkFunVals (line 649)
The function you provided as the MODELFUN input has returned Inf or NaN values.
Error in nlinfit>LMfit (line 596)
if funValCheck && ~isfinite(sse), checkFunVals(r); end
Error in nlinfit (line 284)
[beta,J,~,cause,fullr] = LMfit(X,yw, modelw,beta,options,verbose,maxiter);
Error in Untitled4 (line 41)
b = nlinfit(X,Y,@subfun,ones(1,2))"
The snip from one of the codes is below
x1=xlsread('Book1.xlsx', 'Sheet1', 'A1:A3026');
T11_real =xlsread('Book1.xlsx', 'Sheet1', 'B1:B3026'); % y data 1;
T11_imag =xlsread('Book1.xlsx', 'Sheet1', 'C1:C3026');
T12_real=xlsread('Book1.xlsx', 'Sheet1', 'E1:E3026');
T12_imag=xlsread('Book1.xlsx', 'Sheet1', 'F1:F3026');
T = [x1; x1; x1; x1]
Y = [T11_real; T11_imag; T12_real; T12_imag]
dsid = [ones(3026,1); 2*ones(3026,1); 3*ones(3026,1);4*ones(3026,1)];
gscatter(T,Y,dsid)
X = [T dsid];
b = nlinfit(X,Y,@subfun,ones(1,2))
tt = linspace(0,6000,50)';
line(tt,real(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2))),'color','r');
line(tt,imag(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2))),'color','g')
line(tt,real(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2)))),'color','b')
line(tt,imag(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2)))),'color','c')
function yfit = subfun(params,X)
T = X(:,1);
dsid = X(:,2);
A0 = params(1);
A1 = params(2)';
yfit = zeros(size(T));
idx = (dsid==1);
yfit(idx) = F1(X(idx),[A0 A1]);
idx = (dsid==2);
yfit(idx) = F2(X(idx),[A0 A1]);
idx = (dsid==3);
yfit(idx) = F3(X(idx),[A0 A1]);
idx = (dsid==4);
yfit(idx) = F4(X(idx),[A0 A1]);
end
function y = F1(x,b)
y = real(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2)));
end
function y = F2(x,b)
y = imag(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2)));
end
function y = F3(x,b)
y = real(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2))));
end
function y = F4(x,b)
y = imag(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2))));
end
Torsten
Torsten il 10 Ago 2023
Modificato: Torsten il 11 Ago 2023
Since you didn't include your data, we cannot tell what the problem is.
As the error message states, your function "subfun" returns Inf or NaN values.
This might be due to the data vector T or the parameters "nlinfit" supplies to "subfun".
/(1+1i*b(1))
if b(1) were 1i then that would be 1+1i*1i which would be 1+(-1) which would be 0, and you would have a division by 0.
Yeah @Walter Roberson. I thought about that. And then i remove the last two equations and data. However, the error still persisted with just the first two equations
Oluwafemi, the idea behind that error is nlinfit is trying to let you know something is going wrong. You may want to debug and see what parameters are being tried when the problem happens. But you do have the alternative to turn that error check off:
opt = statset('nlinfit');
opt.FunValCheck = 'off';
b = nlinfit(X,Y,@subfun,ones(1,2),opt)
Your functions seem inadequate to capture the behaviour of your data.
x1=xlsread('Book1.xlsx', 'Sheet1', 'A2:A3027');
T11_real =xlsread('Book1.xlsx', 'Sheet1', 'B2:B3027'); % y data 1;
T11_imag =xlsread('Book1.xlsx', 'Sheet1', 'C2:C3027');
T12_real=xlsread('Book1.xlsx', 'Sheet1', 'D2:D3027');
T12_imag=xlsread('Book1.xlsx', 'Sheet1', 'E2:E3027');
T = [x1; x1; x1; x1];
Y = [T11_real; T11_imag; T12_real; T12_imag];
dsid = [ones(3026,1); 2*ones(3026,1); 3*ones(3026,1);4*ones(3026,1)];
hold on
gscatter(T(1:2*3026),Y(1:2*3026),dsid(1:2*3026))
X = [T dsid];
%b = nlinfit(X,Y,@subfun,ones(1,2))
b = lsqcurvefit(@subfun,ones(1,2),X,Y)
Local minimum possible. lsqcurvefit stopped because the size of the current step is less than the value of the step size tolerance.
b = 1×2
-0.3012 -0.0061
tt = linspace(0,6000,50)';
plot(tt,real(cos((1+1i*b(1))*(2*pi*tt*0.002/(1600))*b(2))),'color','r');
plot(tt,imag(cos((1+1i*b(1))*(2*pi*tt*0.002/(1600))*b(2))),'color','g')
%plot(tt,real(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*tt/1600)*0.002*(b(2)))),'color','b')
%plot(tt,imag(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*tt/1600)*0.002*(b(2)))),'color','c')
hold off
function yfit = subfun(params,X)
T = X(:,1);
dsid = X(:,2);
A0 = params(1);
A1 = params(2);
%[A0 A1];
yfit = zeros(size(T));
idx = (dsid==1);
yfit(idx) = F1(X(idx),[A0 A1]);
idx = (dsid==2);
yfit(idx) = F2(X(idx),[A0 A1]);
idx = (dsid==3);
yfit(idx) = F3(X(idx),[A0 A1]);
idx = (dsid==4);
yfit(idx) = F4(X(idx),[A0 A1]);
%indices = find(isinf(yfit));
end
function y = F1(x,b)
y = real(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2)));
end
function y = F2(x,b)
y = imag(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2)));
end
function y = F3(x,b)
y = real(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2))));
end
function y = F4(x,b)
y = imag(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2))));
end
Thanks @Torsten and @Tom Lane. Both fixes worked.
I notice the major difference for the updated code from Torsten is the "lsqcurvefit" function. I would like to know which one is more suitable for the case. Does it mean the nonlinear regression fit method "nlinfit" is more sophisticated. I guess i would have to read more on how matlab implements them.
I intend to relax the number of eqyations in order to get a better fit. I would appreciated suggestion on alternatives to get better fit.
Thank you very much for the help
For me it seems that not the optimizer is the problem, but the functions you want to fit your data with.
Thanks @Torsten. Do you know how i can compute the goodness of fit from the code?
As said, first care that the model function resembles your data. If you get a solution using "lsqcurvefit", you can call "fitnlm" or "nlinfit" with this solution as initial guess to get derived statistical data.
I somehow relax the equation to get some fair fit. Thats why i want to check with the goodness of fit. I tried using "fitnlm" or "nlinfit", but non of them gave me a value representing goodness of fit. Could you be show me an example?
Study the output structure "mdl". It contains every information you can think of.
Yeah. @Torsten. The output structure is what i needed.
However, I only meant "nlinfit" in my lats comment which did not give the those output only.
I tried calling the "fitnlm" using this "nlinfit(X,Y,@subfun,ones(1,2))", and it gave me this error below.
Error in classreg.regr.FitObject/doFit (line 94)
model = fitter(model);
Error in NonLinearModel.fit (line 1446)
model = doFit(model);
Error in fitnlm (line 99)
model = NonLinearModel.fit(X,varargin{:});
Error in nlin_method (line 39)
b = fitnlm(X,Y,@subfun,ones(1,2))
Read here on how to extract all necessary information from the returned "b":
x1=xlsread('Book1.xlsx', 'Sheet1', 'A2:A3027');
T11_real =xlsread('Book1.xlsx', 'Sheet1', 'B2:B3027'); % y data 1;
T11_imag =xlsread('Book1.xlsx', 'Sheet1', 'C2:C3027');
T12_real=xlsread('Book1.xlsx', 'Sheet1', 'D2:D3027');
T12_imag=xlsread('Book1.xlsx', 'Sheet1', 'E2:E3027');
T = [x1; x1; x1; x1];
Y = [T11_real; T11_imag; T12_real; T12_imag];
dsid = [ones(3026,1); 2*ones(3026,1); 3*ones(3026,1);4*ones(3026,1)];
hold on
gscatter(T(1:2*3026),Y(1:2*3026),dsid(1:2*3026))
%X = [T dsid];
X = T;
%b = nlinfit(X,Y,@subfun,ones(1,2))
%b = lsqcurvefit(@subfun,ones(1,2),X,Y)
opt = statset('fitnlm');
opt.FunValCheck = 'off';
b = fitnlm(X,Y,@(b,x)subfun(b,x,dsid),[-0.3 -0.006],'Options',opt)
Warning: Some columns of the Jacobian are effectively zero at the solution, indicating that the model is insensitive to some of its parameters. That may be because those parameters are not present in the model, or otherwise do not affect the predicted values. It may also be due to numerical underflow in the model function, which can sometimes be avoided by choosing better initial parameter values, or by rescaling or recentering. Parameter estimates may be unreliable.
b =
Nonlinear regression model: y ~ F(b,x) Estimated Coefficients: Estimate SE tStat pValue ________ __________ __________ __________ b1 -0.25356 4.5646e-14 -5.555e+12 0 b2 -0.006 0.0015396 -3.8971 9.7865e-05 Number of observations: 12104, Error degrees of freedom: 12103 Root Mean Squared Error: 3.16e+03 R-Squared: -0.0263, Adjusted R-Squared -0.0263 F-statistic vs. zero model: 15.8, p-value = 6.96e-05
beta = b.Coefficients.Estimate;
tt = linspace(0,6000,50)';
plot(tt,real(cos((1+1i*beta(1))*(2*pi*tt*0.002/(1600))*beta(2))),'color','r');
plot(tt,imag(cos((1+1i*beta(1))*(2*pi*tt*0.002/(1600))*beta(2))),'color','g')
%plot(tt,real(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*tt/1600)*0.002*(b(2)))),'color','b')
%plot(tt,imag(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*tt/1600)*0.002*(b(2)))),'color','c')
hold off
function yfit = subfun(params,X,dsid)
T = X;
A0 = params(1);
A1 = params(2);
yfit = zeros(size(T));
idx = (dsid==1);
yfit(idx) = F1(X(idx),[A0 A1]);
idx = (dsid==2);
yfit(idx) = F2(X(idx),[A0 A1]);
idx = (dsid==3);
yfit(idx) = F3(X(idx),[A0 A1]);
idx = (dsid==4);
yfit(idx) = F4(X(idx),[A0 A1]);
end
function y = F1(x,b)
y = real(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2)));
end
function y = F2(x,b)
y = imag(cos((1+1i*b(1))*(2*pi*x*0.002/(1600))*b(2)));
end
function y = F3(x,b)
y = real(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2))));
end
function y = F4(x,b)
y = imag(1i*((840*1600)/(1+1i*b(1)))*sin((1+1i*b(1))*(2*pi*x/1600)*0.002*(b(2))));
end
Hello @Torsten. I was trying out something different with the code but did not work yet. Instead of finding the optimal two parameters b= [b(1) b(2)] from the four equations for the comined "x" values, I was trying to get 3026 "b"s for each 3026 "x" considering using the same set of equation and data assigned to each "x". Hence i will have arrays of 3026 "b"s.
I tried the for loop method, but that does work. An option is find it one by one and recording the b value manual, but that will take time because i have to do it 2000 plus times.
What's the advantage to get as many parameters as you have measurements ? For me, this is no data reduction and thus makes no sense.
the fitting parameters are not constant value but are a function of the independent variable "x". So if i can get the trend of the parameters with respect to x and, then I can easily find the equation that represent the trend
So in the end you want to find the trend of the fitting parameters, thus fit the fitting parameters ? Interesting ...

Accedi per commentare.

 Risposta accettata

Take a look at this example of a function that partitions the parameters into a set sharing values across all datasets and a set taking different values for different datasets. You may be able to modify this for your purposes.
function sharedparams
t = (0:10)';
T = [t; t; t];
Y = 3 + [exp(-t/2); 2*exp(-t/2); 3*exp(-t/2)] + randn(33,1)/10;
dsid = [ones(11,1); 2*ones(11,1); 3*ones(11,1)];
gscatter(T,Y,dsid)
X = [T dsid];
b = nlinfit(X,Y,@subfun,ones(1,5))
line(t,b(1)+b(2)*exp(-t/b(5)),'color','r')
line(t,b(1)+b(3)*exp(-t/b(5)),'color','g')
line(t,b(1)+b(4)*exp(-t/b(5)),'color','b')
function yfit = subfun(params,X)
T = X(:,1); % time
dsid = X(:,2); % dataset id
A0 = params(1); % same A0 for all datasets
A1 = params(2:4)'; % different A1 for each dataset
tau = params(5); % same tau
yfit = A0 + A1(dsid).*exp(-T/tau);
This uses nlinfit from the Statistics Toolbox, but you could use other fitting routines instead.

21 Commenti

I don't have your data, so I cannot try this, but I thought the output t from ode45 is supposed to match the input tspan. If that's the case, you can control its length.
Thanks Tom
I think I found a way to work it out. Regards Matthieu
Hi Tom,
Hoe this code can be written if instead of yfit = A0 + A1(dsid).*exp(-T/tau); ode45 would have been used with another nested function in it defining the system of ODEs?
Dear Tom, Do you mind to show how to do it with lsqcurvefit or fit? Boundaries can be applies using these functions unfortunately not nlinfit. Furthermore fit can be export directly from curve fitting toolbox. I have spent days finding a shared parameter case using lsqcurvefit or fit but no luck.
% Set up some fake data following the desired model
M = [1 3 2 7 3.4]'; % five distinct parameters
dsid = repelem((1:5)',50); % identifier of the five datasets
N = sqrt(2); D = 2; % two common parameters
F = 2*rand(250,1); % X data
% Generate y values, then plot
y = M(dsid).* ((0.0035*0.027*(2*pi*F).^2*N^2)./sqrt(((N^2)-(2*pi*F).^2).^2+(2*D*N*2*pi*F).^2));
h = gscatter(F,y,dsid);
% Define a function that matches the formula used for y
f0 = @(M,N,D,F) M(dsid).* ((0.0035*0.027*(2*pi*F).^2*N^2)./sqrt(((N^2)-(2*pi*F).^2).^2+(2*D*N*2*pi*F).^2));
% Define a function that takes a parameter vector and calls
% f0 with separate parameters
f1 = @(params,F) f0(params(1:5),params(6),params(7),F);
% Define some random starting values
rng(0);
start = rand(7,1);
% Fit
x = lsqcurvefit(f1,start,F,y)
% Overlay fits. Need to write a new function
% expecting just one M value
f2 = @(M,N,D,F) M*((0.0035*0.027*(2*pi*F).^2*N^2)./sqrt(((N^2)-(2*pi*F).^2).^2+(2*D*N*2*pi*F).^2));
xgrid = linspace(0,4);
for j=1:length(h)
hh = line(xgrid,f2(x(j),x(6),x(7),xgrid));
hh.Color = h(j).Color;
end
Tom, I have just worked out my solution based on @Richard Willey's answer below. I think your code is more friendly to be upgraded to GUI and I am going to try it now!
Post my working code for reference:
function sharedparams
ds = dataset('xlsfile', 'XXX.xlsx') ; ds.dataset = nominal(ds.dataset);
% Initial estimates for [N, D, M_C11, M_C12, M_C21 and M_C22] p0 = [120 , .01 , 100, 100, 100, 100];
% Estimate parameters fn = @(p , f ) objFcn(p , f , ds.dataset) ;% Function handle
lb = [100 , 0 , 0, 0, 0, 0]; ub = [180 , 0.3 , 300, 300, 300, 300]; options = optimoptions('lsqcurvefit'); options.TolFun = 1e-10;
pFit = lsqcurvefit(fn , p0 , ds.frequency , ds.magnitude,lb, ub, options);
assignin('base','pFit',pFit);
end
function yfit = objFcn(p , f, dataset)
N = p(1) ; % location-independent parameter 1 D = p(2) ; % location-independent parameter 2
M1 = p(3) ; % Location dependent parameter (C11) M2 = p(4) ; % Location dependent parameter (C12) M3 = p(5) ; % Location dependent parameter (C21) M4 = p(6) ; % Location dependent parameter (C22)
f1 = 2*pi*f(dataset == 'A'); yfit_C11 = (0.0035*0.027*f1.^2*N^2)./(M1*sqrt((N^2-f1.^2).^2+(2*D*N*f1).^2));
f2 = 2*pi*f(dataset == 'B'); yfit_C12 = (0.0035*0.027*f2.^2*N^2)./(M2*sqrt((N^2-f2.^2).^2+(2*D*N*f2).^2));
f3 = 2*pi*f(dataset == 'C'); yfit_C21 = (0.0035*0.027*f3.^2*N^2)./(M3*sqrt((N^2-f3.^2).^2+(2*D*N*f3).^2));
f4 = 2*pi*f(dataset == 'D'); yfit_C22 = (0.0035*0.027*f4.^2*N^2)./(M4*sqrt((N^2-f4.^2).^2+(2*D*N*f4).^2));
% Combine prediction yfit = [yfit_C11; yfit_C12; yfit_C21; yfit_C22] ;
end
Hi Tom, I have a similar problem and was reading your answer. I have an issue in understanding the role of dsid and what A1(dsid) does?
dsid is "data set identifier". Effectively class number.
I really appreciate your answer. It helps my question a lot. Thanks!
Nice, thank you. Would it be possible to also include a weighting of the data in the fit, for instance if each datapoint has a different errorbar?
Hello Tom,
Thanks for both your answers. They are very instructive. However, how can one solve this problem in the case where the datasets are non-uniform in size. Further, is this possible using the fit function (so that I can add weights) because I am trouble passing the parameters as outlined for lsqcurvefit function.
Ajay
Ajay, I will post two additional answers that elaborate on the accepted answer.
LeChat, there is a 'Weight' parameter that nlinfit accepts. I would expect no problem using that parameter with this approach.
Hello,
I tried to adapt this solution to my fitting problem, but it results in an error message concerning the vector dimensions:
Error using nlinfit (line 219)
MODELFUN must be a function that returns a vector of fitted values the same size as Y (20-by-1). The model function you provided returned a result that was 20-by-20.
One common reason for a size mismatch is using matrix operators (*, /, ^) in your function instead of the corresponding
elementwise operators (.*, ./, .^).
I double checked my whole code but wasn't able to find, what's causing this error.
function sharedparams
B = [20 ; 50 ; 100 ; 160];
m = [1 5 9 13 17];
v_i = [-80 -90 -100 -110 -120];
P_500_4 = [6.2232 ; 8.3944 ; -27.211 ; -39.732 ; 8.1175 ; ...
7.8778 ; -49.358 ; -58.215 ; 8.9701 ; -27.335 ; ...
-66.292 ; -73.315 ; -25.996 ; -47.529 ; -81.03 ; ...
-86.631 ; -53.607 ; -70.278 ; -94.482 ; -97.006];
dsid = [];
B_conc = [];
v_i_conc = [];
v_r_conc = [];
for i = 1:length(B)
dsid = [dsid ; i * ones(length(v_i),1)];
B_conc = [B_conc ; B(i) * ones(length(v_i),1)];
v_i_conc = [v_i_conc ; v_i'];
v_r_conc = [v_r_conc ; P_500_4(m-1+i,6)];
end
X = [v_i_conc , dsid];
Y = v_r_conc;
f = nlinfit(X,Y,@subfun,ones(1,6));
figure()
hold on
gscatter(X(:,1),Y,B_conc)
line(v_i,f(1).*(vi.^f(2)-f(3).^f(2)).^(1./f(2)),'color','r')
line(v_i,f(1).*(vi.^f(2)-f(4).^f(2)).^(1./f(2)),'color','g')
line(v_i,f(1).*(vi.^f(2)-f(5).^f(2)).^(1./f(2)),'color','c')
line(v_i,f(1).*(vi.^f(2)-f(6).^f(2)).^(1./f(2)),'color','v')
function vrfit = subfun(params,X)
vi = X(:,1);
dsid = X(:,2);
% Parameters
a = params(1); % same a for all curves
vbl = params(2:2+length(unique(dsid))-1); % different vbl for each curve
p = params(6); % same p for all curves
% Fit-Function
vrfit = a.*(vi.^p - vbl(dsid).^p).^(1/p);
I would appreciate if anyone could give me a hint or even solve this issue. Thanks in advance.
I couldn't run your code. On the last line of your for loop, you seem to be indexing into column 6 of an array that has just a single column.
However, this type of issue can often be the result of operating on two vectors with different orientations, for example:
>> [1;2;3].^[1 2 3]
ans =
1 1 1
2 4 8
3 9 27
If you expect elementwise operations between two vectors, make sure they are both rows or both columns. In your case, try a break point or display to make sure vi and vbl(dsid) match this way.
Hello and thanks so far for your quick response! Indeed the index 6 is obsolet as I already extracted the necessary column of that array for the above code. Indeed the vectors had not the same orientation what results in a matrix. So far I thought that this would only be an issue for operations "*" and "/".
Anyway, thanks for your help! Now it works.
This is very neat, thank you @Tom Lane.
Actually, how should I proceed if I need to constraint the value of my fit parameters? I need that nlinfit does not allow one to impose upper and lower bounds...
Thank you!
Hi Tom, thanks for the very nice solutions and your explanations.
I have a model function that includes an ode45 function (inputs are myode function, tspan, a few parameters shared among all data sets, and a few parameters that change for each data set). However, in your approach, xdata repeats for the number of data sets. In your example,
t = (0:10)';
T = [t; t; t];
T is cycling from 0 to 10 for three times. If I use T as the input for my ode45 function, I see an error "The entries in tspan must strictly increase or decrease," which is understandable. Is there any work around this problem?
Take a look at my entry from 29 May 2020, later edited on Jun 2020. In that comment I have code that calls a different function F1, F2, F3 for each group. I don't completely understand your problem, but I wonder if this technique is what you need. Would it make sense for you to have separate calls for ode45, one for each set of increasing times?
yoonys
yoonys il 17 Set 2022
Modificato: yoonys il 17 Set 2022
@Tom Lane I revisited this problem and your version on Jun 2020 worked for me! I think the problem with this version (Dec 2012) was that yfit is a function of T, which is cycling from 0 to 10 several times in my example (not a monotonic function), which isn't compatible with ode45. With separate functions F1, F2, F3 for each group, it worked. Thank you!

Accedi per commentare.

Più risposte (15)

Hello all,
There is a convenient wrapper code for NLINFIT available on File Exchange that achieves exactly this. Very easy to use, check out the example in the help. I have used it succesfully to fit multiple data sets with models that share some (but not all) parameters. This is a common scenario in experimental physics, and it is surprising that Matlab doesnt support this natively.
Mikael

1 Commento

Hello all,
There's a wrapper code for LSQNONLIN available on the file exchange as well. I haven't modified it for my case yet but the example included in the code worked perfectly. Hopefully I will remember to update on if I had any troubles.
I wish to second Mikael's comment that this is indeed a common scenario in experimental physics and it's dissapointing that Matlab doesn't have this covered.

Accedi per commentare.

Normaly you could identify both parameters in a fit using lsqcurvefit (least square fit) and a function of the type:
function F=myfun(x,xdata)
F=x(1)*exp(-1*x(2)*xdata)
then
[x,resnorm] = lsqcurvefit(@myfun,StartingValues,xdata,ydata,LowerBound,UpperBound,options);
x will contain amplitude and decay rate that best match your data starting at StartingValue between LowerBound and UpperBound.
You can then appreciate that your 3 datasets have roughly the same decay rate and different amplitudes, OR:
If you know in advance which parameter will be shared, You can use lsqcurvefit with a function that uses a global variable as your shared parameter.
function F=myfunFixedAmp(x,xdata)
global A
F=A*exp(-1*x(1)*xdata)
or a different finction for a fixed decay rate
function F=myfunFixedDecay(x,xdata)
global Alpha
F=x(1)*exp(-1*Alpha*xdata)

1 Commento

Thanks, but I don't think this will do what I need. In the first example code you give (below) the amplitude is fixed and the decay constant is variable, but I need the amplitude to be a variable fitting parameter (which has the same value for all data sets) and the decay constant also to be a variable fitting parameter but it has a different value for each data set.
function F = myfunFixedAmp(x,xdata)
global A
F=A*exp(-1*x(1)*xdata)
Thanks,
ELELAB

Accedi per commentare.

(Wasn't sure if this should be a "comment" or "answer" so posting it as both)
Thanks, but I don't think this will do what I need. In the first example code you give (below) the amplitude is fixed and the decay constant is variable, but I need the amplitude to be a variable fitting parameter (which has the same value for all data sets) and the decay constant also to be a variable fitting parameter but it has a different value for each data set.
function F = myfunFixedAmp(x,xdata) global A F=A*exp(-1*x(1)*xdata)
So if there were three data sets to be fit, there would be four variable parameters: A (common to all three sets), and three decay constants, one for each set. One then minimizes the sum of the squared residuals of all three data sets.
Thanks again, ELELAB
j dr
j dr il 6 Apr 2011
Like I said in the top half of my post, if you use lsqcurvefit on each of the 3 curves independently, you would get:
A1,Alpha1 A2,Alpha2 A3,Alpha3
You could then verify if all "A"s are similar or all "Alpha"s are similar (by checking the standard deviation or the standard deviation divided by the mean to have a relative value of which of "A"s of "Alpha"s are more similar).
If this does not solve your problem I think you might need to restate it.
Hi Kenneth
Coincidentially, I did a webinar a couple weeks back that uses lsqcurvefit to solve just this type of problem.
You can download the code from http://www.mathworks.com/matlabcentral/fileexchange/30869-fitting-with-matlab-statistics-optimization-and-curve-fitting
(You want the analysis.m example)
The webinar was titled "Fitting with MATLAB: Statistics, Optimization, and Curve Fitting". The recorded webinar should be up on the Statistics Toolbox product page shortly

1 Commento

Dear Richard, I have gone thro your webinar and download your code. Could not find the tObs function called in Objfn.m and also showing this error "Error using sbioloadproject (line 66) Unable to read file 'Tumor_Growth.sbproj': No such file or folder. "

Accedi per commentare.

DiRa
DiRa il 11 Dic 2012
Dear all, it's been some time now since the last answer was posted, but I'm having almost the same problem now. I thought the last answer from Richard would help me, but it actually didn't. So here's my problem:
I have about 200 2D-datasets, where the x-scale shows the time(t)-dependence.
I need to fit a nonlinear exponetial decay function onto each one of these 200 datasets, that looks in principal like this:
y = y0 + A1*exp(-t/tau1) + A2*exp(-t/tau2) + A3*exp(-t/tau3)
The important thing is now that tau1,tau2 and tau3 must be the same for all 200 datasets and y0, A1, A2 and A3 may vary, but all of them have to be fitted by a lsqcurvefit at once. So the variables tau have to be constant for all datasets in the end, but is an unknown variable during the lsqcurvefit and should be optimized by the fit.
To specify it again: I want to do a lsqcurvefit for the variables y0,A1,A2,A3,tau1,tau2,tau3 for every single 2D dataset and in addition i need to do a global fit for tau1,tau2,tau3.
Is there a nice and fast way to calculate this kind of fit in matlab?
When I tried to fully understand the analysis.m i had the problem that it is not running after downloading all files because the is missing a needed function or script.
Thanks alot!
So if I understand correctly, you want to fit tau1, tau2, tau3 to all the data, but the other parameters are fit on an individual dataset basis? If so, I think this solution of breaking this up into a linear lsq problem embedded in a nonlinear lsq problem would work:
Identify tau1, tau2, tau3 for all of the data sets with lsqcurvefit. The other parameters y0, A1, A2, A3 can be identified on an individual basis inside the lsqcurvefit objective function (and this will be pretty fast because it's a linear least squares problem to solve for these).
Lsqcurvefit will be modifying the values for tau1, tau2, tau3, and then calling your objective function. Your objective function will:
1) Loop over the 200 datasets:
  • solve a linear least-squares problem to identify y0,A1,A2,A3 for each dataset (this can be done with "x = A\b", which returns the least-squares soln for underdetermined systems: http://www.mathworks.com/help/matlab/ref/mldivide.html).
  • calculate the value of your function at each point for each dataset: A*x
2) Combine all of the ydata from the 200 runs to be passed back to lsqcurvefit.
This approach may be kind of time consuming (it has to do 200 matrix solves each time the objective function is calculated), depending on the size of your datasets. But, it should be quicker and give you a better answer than trying to solve this as one large nonlinear curve fitting problem.
DiRa
DiRa il 12 Dic 2012
Dear Seth and Tom,
I want to thank you both for your fast response. I used Toms suggestion since my script was already looking like that in some parts.
Since I needed some lower and upper bounds I then used the lsqcurvefit. After fixing some memory problems it then worked perfectly.
Thanks alot again!
Yi
Yi il 18 Apr 2013
@Kenneth: I understand precisely what your problem is and I am dealing with very similar situation: fitting 2D Gaussian function to hundreds of small pieces of ROI cut out from a large image, with the same global sigma, but different intensities and centroids. In fact in Igor, there is a built-in function called "global fit" that does the job. But honestly I don't know any elegant way to do the same in Matlab.
@Tom: Your solution will work for small set of data, like Kenneth's three exponential functions. But In my problem, I am dealing with greater than 500 sets of data at the same time, which would mean thousands of parameters to optimize at a time. But most of the parameters from different ROI don't "crosstalk", ie the Jacobian matrix is extremely spars. I am not sure if Matlab is smart enough to figure that out.
The workaround solution I can think of for large number of data sets like for my case is the E-M approach, which is to deal the common global parameter and other local parameters separately. But it can be slow too...
Julie
Julie il 14 Mag 2014
Modificato: Walter Roberson il 29 Mag 2020
Dear fellow Matlab users I have a similar problem but can't seem to solve the issue with the above suggestions:
I am trying to use nlinfit to fit 14 sets of data to obtain a common exchange rate while a second scaling variable is also allowed to vary during the fit and should be different for each data set. I load a data file (f1) where the first row contains a 0 followed by my independent variables (24 of them) and each subsequent row has in column 1 a variable corresponding to the data in that row (ligand concentration) followed by the dependant variable (intensity) for the corresponding independent variables in row 1. I am getting the following error:
Error using nlinfit (line 122) Requires a vector second input argument.
Error in fit_lineshape_exchange (line 42) [beta,R]=nlinfit(x,y,@lineshape_exchange, beta0);
my function is below:
function f = lineshape_exchange(a)
%a(1) is k (the rate), a(2) is c (the weighting function allowing each
%spectrum to vary in intensity)
global K L v1 v2 Tp1 Tp2 w n
K=0.0000033;
Tp1=4.96;
Tp2=7.25;
v1=12.212;
v2=13.114;
f=0;
for j=1:n
t=1/(a(1*L*(1+K)));
p1=K*a(1)*L*t;
p2=1-p1;
W=v1-v2;
P=t*((1/Tp1*Tp2)-4*pi^2*w^2+pi^2*W^2)+(p1/Tp1)+(p2/Tp2);
Q=t*(2*pi*w-pi*W*(p1-p2));
R=2*pi*w*(1+t*((1/Tp1)+(1/Tp2)))+pi*W*t*((1/Tp1)-(1/Tp2)+pi*W*(p1-p2));
f=a(2)*(P*(1+t*(p2/Tp1 + p1/Tp2))+Q*R)/(P^2+R^2);
end
Silvina Pugliese
Silvina Pugliese il 11 Giu 2018
Modificato: Silvina Pugliese il 11 Giu 2018
Hi everyone,
I have used the idea from Tom Lane here (thank you very much, Tom!) to make a code that is general to any number of datasets. The code as it is now, generates some datasets for testing, you will need to load your data and change the fitting function to your fitting model. Best regards, Silvina. Link: https://github.com/SilvinaPugliese/Global-fitting-in-MATLAB
Hi everyone!
I used the code from @Richard, thanks a lot!
My question is: i have 10 datasets at 10 different known temperatures, so i don't need to fit the temperature. Then I have an equation with 3 unkown parameters and i have to fit the 10 curves with a different temperature and the same unknown parameters. How can i modify the code?
Thanks a lot
Leonardo
My posted answer from 11-Dec-2012 shows how to deal with multiple datasets that are roughly similar. That isn't necessary, though. Here is a variation of that answer where the X values are not common across groups and aren't even of the same size.
function sharedparams
% Set up data so that Y is a function of T with a specific functional form,
% but there are three groups and one parameter varies across groups.
t1 = (0:10)'; t2 = (0:2:10)'; t3 = (2:9)';
T = [t1; t2; t3];
Y = 3 + [exp(-t1/2); 2*exp(-t2/2); 3*exp(-t3/2)] + randn(size(T))/10;
dsid = [ones(size(t1)); 2*ones(size(t2)); 3*ones(size(t3))];
gscatter(T,Y,dsid)
% Pack up the time and dataset id variables into X for later unpacking
X = [T dsid];
b = nlinfit(X,Y,@subfun,ones(1,5))
tt = linspace(0,10)';
line(tt,b(1)+b(2)*exp(-tt/b(5)),'color','r')
line(tt,b(1)+b(3)*exp(-tt/b(5)),'color','g')
line(tt,b(1)+b(4)*exp(-tt/b(5)),'color','b')
function yfit = subfun(params,X)
T = X(:,1); % unpack time from X
dsid = X(:,2); % unpack dataset id from X
A0 = params(1); % same A0 for all datasets
A1 = params(2:4)'; % different A1 for each dataset
tau = params(5); % same tau
yfit = A0 + A1(dsid).*exp(-T/tau);
Tom Lane
Tom Lane il 29 Mag 2020
Modificato: Tom Lane il 5 Giu 2020
My posted answer from 11-Dec-2012 shows how to deal with multiple datasets following roughly the same functional form with parameters that vary from one dataset to the other. That isn't necessary, though. Here is a variation of that answer where the functional forms differ across the datasets. They share some parameters, but others are distinct to each dataset.
function sharedparams2
% Set up data so that Y is a function of T with a different functional
% form across three groups. The groups share a common constant parameter
% and a constant time scale parameter, but otherwise follow a different
% functional relationship.
rng default
t1 = linspace(0,10,40)'; t2 = linspace(0,10,25)'; t3 = sort(2+7*rand(15,1));
T = [t1; t2; t3];
Y = [F1(t1,[3 2 1]); F2(t2,[3 1 1]); F3(t3,[3 2 1])];
Y = Y + randn(size(Y))/5;
dsid = [ones(size(t1)); 2*ones(size(t2)); 3*ones(size(t3))];
gscatter(T,Y,dsid)
% Pack up the time and dataset id variables into X for later unpacking
X = [T dsid];
b = nlinfit(X,Y,@subfun,ones(1,5))
tt = linspace(0,10)';
line(tt,F1(tt,b([1 2 5])),'color','r')
line(tt,F2(tt,b([1 3 5])),'color','g')
line(tt,F3(tt,b([1 4 5])),'color','b')
end
function yfit = subfun(params,X)
T = X(:,1); % unpack time from X
dsid = X(:,2); % unpack dataset id from X
A0 = params(1); % same A0 for all datasets
A1 = params(2:4)'; % different A1 for each dataset
tau = params(5); % same tau
yfit = zeros(size(T));
% Find separate groups and call the F function for each group
idx = (dsid==1);
yfit(idx) = F1(X(idx),[A0 A1(1) tau]);
idx = (dsid==2);
yfit(idx) = F2(X(idx),[A0 A1(2) tau]);
idx = (dsid==3);
yfit(idx) = F3(X(idx),[A0 A1(3) tau]);
end
% Below are the three functions for the three groups
function y = F1(x,b)
y = b(1) + b(2)*exp(-x/b(3));
end
function y = F2(x,b)
y = b(1) + b(2)*sin(x/b(3));
end
function y = F3(x,b)
y = b(1) + b(2)*cos(x/(2*b(3)));
end

5 Commenti

@Tom Lane, thanks for the nice answer. I tried you method to my own data but encoutner some difficulty. Would you provide some thought how I might be able to solve it?
I have obtained 2 data curves (y1 and y2) from my experiment. Both curves have the same x.
For each curve, they should be described by their own base data set.
y1 = a* bd1_y1 + b* bd1_y2 + (1-a-b)* bd1_y3
y2 = a* bd2_y1 + b* bd2_y2 + (1-a-b)* bd2_y3
, where
bd1_y1, bd1_y2, and bd1_y3 are the base set data for y1
bd2_y1, bd2_y2, and bd2_y3 are the base set data for y2
a and b are the relative contributions (0<a,b<1) of base data set and are shared.
But after working on it for days, I still don't know how to incorporate the base set into the function (f0), and thus no luck extracting a and b.
How can I fit the y1 and y2 to obtain a, b, and their confidence interval? Your help is greatly appreciated.
Data:
x = [1 2 3 4 5 6 7 8 9 ];
y1 = [0.4304 0.2249 0.1283 0.0794 0.0484 0.0326 0.0203 0.0125 0.0072];
y2 = [0.2179 0.1699 0.1410 0.1101 0.0871 0.0679 0.0515 0.0385 0.0296];
bd1_y1 = [0.5587 0.2244 0.1023 0.0520 0.0276 0.0155 0.0089 0.0053 0.0033];
bd1_y2 = [0.4580 0.2788 0.1198 0.0642 0.0342 0.0197 0.0115 0.0069 0.0043];
bd1_y3 = [0.3584 0.3102 0.1540 0.0755 0.0440 0.0248 0.0148 0.0091 0.0056];
bd2_y1 = [0.3266 0.1778 0.1255 0.0975 0.0777 0.0612 0.0478 0.0367 0.0281];
bd2_y2 = [0.2985 0.2086 0.1268 0.0939 0.0722 0.0580 0.0470 0.0383 0.0313];
bd2_y3 = [0.2451 0.2221 0.1434 0.0999 0.0775 0.0609 0.0494 0.0406 0.0335];
Is it not possible to concatenate the two datasets together and fit this as one problem?
X = [x x]; Y = [y1 y2]; B1 = [bd1_y1 bd2_y1]; % and so on
Furthermore, I would think you could fit this as a linear problem
y1 = a* bd1_y1 + b* bd1_y2 + (1-a-b)* bd1_y3
Same as
y1-bd1_y3 = a*(bd1_y1-bd1_y3) + b*(bd1_y2-bd1_y3)
Y3 = a*X13 + b*X23
This looks like a regular linear model with no intercept , response Y3=y1-bd1_y3, and two predictors also defined from your original variables. Sorry if I am missing the reason this won't work. One reason might be the need to constrain a and b to be between 0 and 1.
Dear @Tom Lane, thanks for the fast reply.
Yes, I can concatenate the two datasets togehter so that
X = [x x]; Y = [y1 y2]; B1 = [bd1_y1 bd2_y1]; B2 = [bd1_y2 bd2_y2]; B3 =[bd1_y3 bd2_y3];
Do you mean I will need to modify the F1, F2 F3 in your sharedparams2 function?
On the other hand, if I take the linear model route, what function would you suggest to perform the fit and obtain the ci with constrain (0<a,b<1)?
Hi, Tai-Yen Chen, refer to the resuluts below:
Root of Mean Square Error (RMSE): 0.0120651314576154
Sum of Squared Residual: 0.00262021314761172
Correlation Coef. (R): 0.997204978347107
R-Square: 0.994417768840254
Parameter Best Estimate
-------------------- -------------
a 1.97124402185668
b -3.16325772339774
@Alex Sha, thanks for the reply. but the a and b need to be a value between 0 and 1 though.

Accedi per commentare.

@Tai-Yen Chen, if you concatenate, then the sharing is built in. You don't have to account for different subsets of the data following different models.
I don't think anything I wrote easily extends to imposing constraints on the parameters. I don't have a good idea of how you might use nlinfit to do this. If it turns out the estimates satisfy the constraints, then you are all set. If not, then the constraints would place one or both of a and b on the boundary. You could perhaps regard that parameter as fixed and solve for the other. For example, if you get a=-0.1 try setting a=0 (omitting the term with a) and solve only for b. That could possibly give you an estimate and confidence interval for b under the assumption that a=0.
I know Optimization Toolbox has functions like fminunc that can impose constraints on the parameters. That would not produce confidence intervals, though. Perhaps you could use bootstrapping.

1 Commento

I am trying to wright a MATLAB program to perfrom nonlinear regression accross multiple datasets (in a similiar manner). But in my question I have 3 unknown shared parameters and 1 known parameters whose value will be different for each dataset.
Unfortuanately I am getting an error (something to do with deviations between matrix dimiension of Y and the model function I provided). See below error. I don't see where I'm going wrong as I have followed your methodology exactly and have debugged each line without finding a solution. If you could point out my mistake I would realy appreciate it. Thanks.
ERROR:
Error using nlinfit (line 219)
MODELFUN must be a function that returns a vector of fitted values the same size as Y (160-by-1). The model function you provided returned a result that was 160-by-160.
One common reason for a size mismatch is using matrix operators (*, /, ^) in your function instead of the corresponding elementwise operators (.*, ./, .^).
Error in untitled (line 18)
b = nlinfit(X,Y,@subfun,ones(1,3))
CODE:
clc
clear all
close all
t = xlsread('data.xlsx', 'Sheet1', 'F4:F43'); % x data
y1 = xlsread('data.xlsx', 'Sheet1', 'G4:G43'); % y data 1
y2 = xlsread('data.xlsx', 'Sheet1', 'M4:M43'); % y data 1
y3 = xlsread('data.xlsx', 'Sheet1', 'AG4:AG43'); % y data 1
y4 = xlsread('data.xlsx', 'Sheet1', 'AM4:AM43'); % y data 1
T = [t; t; t; t] % vector of x data sets
Y = [y1; y2; y3; y4] % vector of y data sets
dsid = [ones(40,1); 2*ones(40,1); 3*ones(40,1);4*ones(40,1)] % ones(40,1) is a 40 x 1 array of 1's
gscatter(T,Y,dsid)
X = [T dsid] %%%%%%%%
tau = [63085.05; 1525.3; 1601.8; 62465.3]
b = nlinfit(X,Y,@subfun,ones(1,3))
line(t,b(1) * log(1 + t/tau(1)) + b(2) * log(1 + (t/tau(1))*(1/b(3))), 'color', 'r');
line(t,b(1) * log(1 + t/tau(2)) + b(2) * log(1 + (t/tau(2))*(1/b(3))), 'color', 'g');
line(t,b(1) * log(1 + t/tau(3)) + b(2) * log(1 + (t/tau(3))*(1/b(3))), 'color', 'b');
line(t,b(1) * log(1 + t/tau(4)) + b(2) * log(1 + (t/tau(4))*(1/b(3))), 'color', 'c');
function yfit = subfun(param,X)
T = X(:,1) % time
dsid = X(:,2); % dataset id
A1 = param(1); % unkown paramater (common to each dataset)
A2 = param(2); % unkown paramater (common to each dataset)
gamma = param(3); % unkown paramater (common to each dataset)
tau = [63085.05; 1525.3; 1601.8; 62465.28756]; %known paramter (unique to each dataset)
yfit = A1 * log(1 + T/tau(dsid)) + A2 * log(1 + (T/tau(dsid))*(1/gamma));
end
I also posted the question here:

Accedi per commentare.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by