Linear regression on training set
I have some data that I want to divide into a training set and a validation set, in order to do linear regression on the training set to find y0 and r. The training set should contain at least 50% of the data. My code so far is below:
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300]';
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A))
subSet1=A(idx(1:5)) %Trainingset
subSet2=A(idx(6:end)) %Validationset
If I can assume the function is exponential, y(t) = y0*e^(r*t), how do I continue in order to plot the training set and find y0 and r?
Thankful for all help!
9 Comments
J. Alex Lee
on 10 Sep 2020
you already identified that your regression can be made into linear form, so that's already a big hint for you...
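For the record, the linear form hinted at here comes from taking the natural log of both sides of the model:

```latex
y(t) = y_0 e^{rt}
\quad\Longrightarrow\quad
\ln y(t) = \ln y_0 + r\,t
```

so a degree-1 polyfit of ln(A) against t gives r as the slope and ln(y0) as the intercept.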
katara
on 10 Sep 2020
Johannes Hougaard
on 10 Sep 2020
The five t values that correspond to the randomly chosen A values are picked out by indexing t with the same idx vector you used for A.
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300]';
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A));
subSet1=A(idx(1:5)); %Trainingset
subSet2=A(idx(6:end)); %Validationset
t1 = t(idx(1:5)); %t values for Trainingset
y=log(subSet1);
c=polyfit(t1,y, 1)
r=c(1);
lny0=c(2);
y0=exp(c(2));
y2 = y0*exp(r*t);
plot(t,y2,'*')
And to apply your polyfit result you could just use polyval.
% Or you could use
y2 = exp(polyval(c,t));
plot(t,y2);
Johannes has the right approach (maybe it can be written as an answer). It can be generalized to any size dataset using
idx = randperm(numel(A));
nTrain = ceil(numel(A)/2);
% nTest = numel(A)-nTrain; % if needed
trainIdx = idx(1:nTrain);       % index through the permutation, not 1:nTrain
testIdx = idx(nTrain+1:end);
trainSet = [A(trainIdx); t(trainIdx)]; % assuming A and t are row vectors
testSet = [A(testIdx); t(testIdx)]; % same assumption
% Then proceed with fitting on the trainSet and measuring
% error on the testSet
Also note that if you're planning on using a more rigorous cross validation, use cvpartition to partition your data.
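A minimal sketch of what that cvpartition split could look like for this data, assuming the Statistics and Machine Learning Toolbox is available (the 0.5 holdout fraction is just one choice that satisfies the "at least 50% training" requirement):

```matlab
% Holdout split via cvpartition; training(cv)/test(cv) return logical masks.
A = [130, 300, 400, 500, 650, 1075, 2222, 2550, 3300];
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
cv = cvpartition(numel(A), 'HoldOut', 0.5);  % random 50/50 holdout
trainA = A(training(cv));  traint = t(training(cv));
testA  = A(test(cv));      testt  = t(test(cv));
c = polyfit(traint, log(trainA), 1);         % fit on the training half
rmse = sqrt(mean((testA - exp(polyval(c, testt))).^2)) % validation error
```

The same cv object also supports 'KFold' partitions if you later want proper cross validation instead of a single holdout.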
katara
on 10 Sep 2020
J. Alex Lee
on 10 Sep 2020
You just need to exponentiate the result of polyval (remember you took the log), and I would wager the plot you really want is
plot(t,A,'*',t,exp(polyval(c,t)))
Or if I may:
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300];
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A));
subSet1=A(idx(1:5)); %Trainingset
subSet2=A(idx(6:end)); %Validationset
t1=t(idx(1:5)); %t values for Trainingset
t2=t(idx(6:end)); %t values for Validationset
y=log(subSet1);
c=polyfit(t1,y, 1)
p=polyval(c,t);
r=c(1);
y0=exp(c(2));
yMdlFn = @(t)(y0*exp(r*t));
% to evaluate on test set
yMdlTest = yMdlFn(t2)
% more comprehensive plot
figure(1); cla; hold on
plot(t1,subSet1,'*')
plot(t2,subSet2,'o')
fplot(yMdlFn,[1929,2009])
But I'd also recommend implementing Adam's generalization to arbitrarily large data sets partitioned into arbitrarily sized training and test sets (just check that the split actually uses the random permutation idx).
Image Analyst
on 10 Sep 2020
If you want a log fit, use fitnlm() rather than polyfit().
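A sketch of what the fitnlm() route could look like (Statistics and Machine Learning Toolbox). The reference year t0, the starting guesses in beta0, and fitting on the full data rather than a training subset are all assumptions for illustration; centering t at t0 keeps exp() well scaled for year-valued inputs:

```matlab
% Direct nonlinear fit of y = b(1)*exp(b(2)*(t - t0)); b(2) plays the
% role of r, and y0 referenced to t = 0 would be b(1)*exp(-b(2)*t0).
A = [130, 300, 400, 500, 650, 1075, 2222, 2550, 3300]';
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008]';
t0 = 1930;                                    % assumed reference year
modelfun = @(b, t) b(1) .* exp(b(2) .* (t - t0));
beta0 = [130, 0.04];                          % rough starting guesses
mdl = fitnlm(t, A, modelfun, beta0)
```

Unlike the polyfit-on-log approach, this minimizes residuals in the original y units, which weights the large values more heavily.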
J. Alex Lee
on 10 Sep 2020
I would take linear least squares anywhere I can get it, including this situation. Linear fitting doesn't require initial guesses, is guaranteed to give a "result", and is faster. You could then use the result of the polyfit to seed a nonlinear fit, if you want to define the least squares differently. But you're still left with a choice of how to define your residual anyway, so you have a lot more things to worry about if you care to that level with nonlinear fitting.
Answers (1)
Johannes Hougaard
on 11 Sep 2020
The five t values that correspond to the randomly chosen A values are picked out by indexing t with the same idx vector you used for A.
A=[130, 300, 400, 500, 650, 1075, 2222, 2550, 3300]';
t = [1930, 1943, 1966, 1976, 1991, 1994, 2000, 2005, 2008];
idx=randperm(numel(A));
subSet1=A(idx(1:5)); %Trainingset
subSet2=A(idx(6:end)); %Validationset
t1 = t(idx(1:5)); %t values for Trainingset
y=log(subSet1);
c=polyfit(t1,y, 1)
r=c(1);
lny0=c(2);
y0=exp(c(2));
y2 = y0*exp(r*t);
plot(t,y2,'*')
And to apply your polyfit result you could just use polyval.
% Or you could use
y2 = exp(polyval(c,t));
plot(t,y2);