Piecewise Polynomial fitting for data
Mostra commenti meno recenti
I need to use curve fitting on a time series data, as the data is large (a week), fitting a single polynomial curve will not represent the true data. Therefore, solution i can come up with (i don't know if there exists such a solution) I have to fit the data for first 24 hours (a reading at every half hour so 48 data points during a day) and keep the loop running for all 7 days (the data is in a single file). I have tried to code it but am getting error and secondly i cannot understand how to save results for all loops as i will need to reconstruct the fitted curve so i need the data and i will need the RMSE for each curve. Also is there a way to determine which polynomial fits the data best ie.e. minimum RMSE without applying all of polynomial fittings programatically.
The code i could think of is given below, I will appreciate if some one can help me with it.
i=7
j=48
for i=1:7
for j=1:48:48
[Fit5, gof5] = fit( x([1:j]), y([1:j]), 'poly5' );
coeff5=coeffvalues(Fit5);
end
end
Risposte (3)
dpb
il 12 Mag 2017
Fitting a polynomial is probably not going to work all that well and certainly a fifth-order one is likely quite unstable.
W/o data to see what it is the curve looks like, hard to give any real conclusive answer, but if the purpose is interpolation, consider piecewise splines instead.
That aside, your code above has a problem with j=1:48:48 as a loop count expression; it is just j=1
Altho it is not my recommendation to do this (see above note), a more concise way to operate over your set of equi-spaced data would be to reshape the vectors to 2D array and operate by column...
x=reshape(x,48,[]).'; % arrange x, y by day (48 observations/day)
y=reshape(y,48,[]).'; % as column arrays
nc=size(x,2) % number columns --> days
nDeg=5; % the poly degree (your value, see above notes)
b=zeros(nDeg+1,nc)) % allocate room for the coefficients
for i=1:nc
b(:,i)=polyfit(x(:,i),y(:,i),nDeg); % and do the fitting
end
Undoubtedly the above will give some numerical issues warnings; (see notes above) but if you're adamant about trying, use the optional output variables as documented for polyfit to at least standardize the design matrix before solving. Then you'll need to add to the saved results the output structure returned to use for the evaluation similarly as to the coefficients array above.
Did I say I don't recommend this, yet? :) Look at splines and give us a sample (smallish) dataset...
13 Commenti
zafar khan
il 12 Mag 2017
dpb
il 12 Mag 2017
It ain't necessarily the ends alone...it'll have wiggles all along depending upon the shape of the input. It is also totally unconstrained as to how big or where those inflections are.
zafar khan
il 12 Mag 2017
dpb
il 12 Mag 2017
I'd feel much more at ease here if had some data to observe... :)
If you have the Curve Fitting TB, look at
doc fit % more general fit object has more outputs
If not, the optional second output of polyval includes the norm of the residuals which is dependent on the quality of fit. Of course, compute Rsq is pretty-much trivial
Rsq = 1 - SSE/SST
and
SST=dot(y-meany);
SSE=S.normr^2;
As you add terms, Rsq becomes more biased as don't account for the number of terms in the fit, so probably you'll want to look at Radj. Once you've got Rsq, this too is trivial
N=length(y);
Radj = 1 - (1-Rsq)*(N-1)/(N-k-1);
where k is degree of the polynomial.
Did I say I'd be a lot more at ease with providing this if could see some sample data, yet...???
John D'Errico
il 12 Mag 2017
Note that I very deliberately choose not to wade into this question, even though I know a moderate amount about piecewise modeling. :)
A big reason for that is because I expect the OP will not (SHOULD not) be happy with the result of such a piecewise fit, since each segment is being computed with no relation to its neighbors. Since higher order polynomial segments will have their largest uncertainty at the ends of the support, the fit will be crappy across each arbitrarily chosen break point.
That will surely be a problem down the line. Effectively, the OP is trying to use a home-brewed spline fit, but without any kind of apparent understanding of what splines are or how to fit them.
The reason I'm commenting now is because of the request for automatic selection of model order here. Note that it is very easy to overfit polynomial models, especially if you leave the task to be done automatically.
My recommendation would be to use a spline of some ilk, so smoothing or a least squares spline are options. A Savitsky-Golay smoother is another good choice, depending on the goals for the result. If you are just looking to smooth some data for a better plot, then just use a Savitsky-Golay, or some other simple filter. If the goal is predictive ability, then a spline is arguably correct.
zafar khan
il 13 Mag 2017
Image Analyst
il 13 Mag 2017
"i will need the fitted values for each sample or column and will need to select a fit which suits the data best." <=== What is the difference between those two needs? A fit will be the best fit for the assumptions you are making, so it will already be the best. Those two needs are one and the same. You could make other fits for each column, like use different window widths or different polynomial orders. But you have to define what the overall best means to you. Like I said, each one if the best for the assumptions/parameters you used. Just selecting the parameter set with the lowest mean absolute deviation or RMS does not mean that is necessarily the best (because the highest order fit that hugs the noisy curve almost exactly will, of course, have the lowest MAD).
John D'Errico
il 13 Mag 2017
Ugh.

Crappy data. An apparent discontinuity in the blue curve. A clear outlier in the green curve. Note that outliers will blow away any polynomial fit. Even a straight line fit will be hurt badly. A robust fitting tool can be a good idea, like robustfit from the stats toolbox. But robustfit is not designed to fit a polynomial like polyfit does, so you would need to feed it the proper matrix.
As well, robustfit cannot be of much help in deciding which polynomial order to use, were you to use a polynomial. Worse, you may be lacking any good statistical methods to decide a reasonable polynomial order, in the presence of significant outliers.
Worse yet. The presence of significant outliers in a short series of points will cause even tools like robustfit to have problems trying to fit a polynomial model.
UGH.
zafar khan
il 14 Mag 2017
dpb
il 15 Mag 2017
Ewww...there's a typo in the title--that's for 1830 hr, not 1630...
zafar khan
il 15 Mag 2017
dpb
il 16 Mag 2017
Well, whatever, it is your problem. I've removed the annoying info.
Image Analyst
il 13 Mag 2017
0 voti
See my canned Savitzky-Golay filter demos, attached. Don't use a filter order more than about 2 or 3 or you won't see much smoothing. A 5th order polynomial will hug the curve fairly closely and not provide much smoothing.
Categorie
Scopri di più su Linear and Nonlinear Regression in Centro assistenza e File Exchange
Prodotti
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!