Interpolation of in-between values in a list of different groups
Clemens Gersch
on 30 Mar 2020
Edited: Guillaume
on 31 Mar 2020
Hi,
I have a dataset zerocd, of which I have uploaded an extract. The variable zerocd.days can be understood as the number of days into the future from the date in the same row. What I want to do is interpolate the rates for days = 1:100 on each date. The interpolation should be based only on the rate vector of each zerocd.date, so that for a specific date the rates of all other dates are irrelevant. The interpolation should be done for every date in the dataset.
As there are 3 different dates in my extract, the goal is a table with the same structure as zerocd, but containing 300 rates, one for every combination of date (the three dates given) and days (1:100).
Please keep in mind that the values of zerocd.days are not the same for every zerocd.date!
Right now, my code looks like this and is very slow. Do you have suggestions for improvement?
% Get unique zerocd dates.
date = unique(zerocd.date);
n_date = numel(date);
% Get query vector for interpolation.
days_queried = (1:100)';
% Construct new table for interpolated values.
rate = nan(n_date, numel(days_queried));
Interpolated = table(date, rate);
Interpolated = splitvars(Interpolated, 'rate');
% Do rate-interpolation for all days on each date i
for i = 1:numel(date)
daily_zerocd = zerocd(zerocd.date == Interpolated.date(i),:);
daily_Interpolant = griddedInterpolant(daily_zerocd.days, ...
daily_zerocd.rate, 'linear', 'linear');
Interpolated{i,2:end} = daily_Interpolant(days_queried)';
end
% Stack to list format again.
Interpolated = stack(Interpolated,2:size(Interpolated,2),...
'NewDataVariableName','rate',...
'IndexVariableName','days');
Interpolated.days = repmat(days_queried,n_date,1);
EDIT: Here is a picture of the general principle.
Accepted Answer
Guillaume
on 31 Mar 2020
Edited: Guillaume
on 31 Mar 2020
I've not tried to understand your code to see where the slow processing is (edit: it's probably the stack). There's no reason for this processing to be slow.
Here is how I'd do it. The splitapply could be replaced by an explicit loop which may even be faster:
% Get query vector for interpolation.
days_queried = (1:100)';
%find unique date and assign unique ID to rows of the table
[rowid, date] = findgroups(zerocd.date); %or [date, ~, rowid] = unique(zerocd.date); %if using unique
%interpolate relevant rows for each unique day
%with splitapply, output a scalar cell array containing a column vector of interpolated rate for each day.
daily_zerocd = splitapply(@(inday, inrate) {interp1(inday, inrate, days_queried, 'linear', 'extrap')}, zerocd.days, zerocd.rate, rowid);
%stuff it all in a table
Interpolated = table(repelem(date, numel(days_queried)), repmat(days_queried, numel(date), 1), vertcat(daily_zerocd{:}), ...
'VariableNames', {'date', 'days', 'rate'})
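The explicit-loop variant mentioned above might look roughly like the sketch below. The small zerocd table here is made-up toy data standing in for the real extract, and the names udate, rate_interp and in_group are introduced only for illustration. Whether this actually runs faster than splitapply has not been measured here and would need to be timed on the real data.

```matlab
% Hypothetical toy data standing in for zerocd (all values are made up).
zerocd = table(datetime(2020, 3, [1 1 1 2 2 3 3 3])', ...
    [2 30 90 5 60 1 45 120]', ...
    [0.010 0.012 0.015 0.011 0.014 0.009 0.013 0.016]', ...
    'VariableNames', {'date', 'days', 'rate'});

% Query vector for interpolation.
days_queried = (1:100)';
n_query = numel(days_queried);

% Assign a group number to every row; udate holds one entry per group.
[rowid, udate] = findgroups(zerocd.date);
n_date = numel(udate);

% Preallocate one column of interpolated rates per date.
rate_interp = nan(n_query, n_date);
for i = 1:n_date
    in_group = rowid == i;   % rows belonging to the i-th date only
    rate_interp(:, i) = interp1(zerocd.days(in_group), zerocd.rate(in_group), ...
        days_queried, 'linear', 'extrap');
end

% rate_interp(:) stacks the columns, so the rates of each date stay contiguous
% and line up with the repelem/repmat ordering of date and days.
Interpolated = table(repelem(udate, n_query), repmat(days_queried, n_date, 1), ...
    rate_interp(:), 'VariableNames', {'date', 'days', 'rate'});
```

Because the result is built directly in long format, no splitvars or stack step is needed afterwards.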