How to arrange time series data into yearly groups?
Mostra commenti meno recenti
Hi,
I have daily temperature time series from 01/03/1961 to 28/07/2018. I want to arrange this data as per water year convention (01 June to 31 May) in following form:
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Temperature Temperature Temperature
Date (1960-1961) Date (1961-1962) . . . . . . . . . . . (2017-2018)
01/06/1960 01/06/1961
. .
. .
. .
. .
. .
31/05/1961 31/05/1962
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Finally, I need this data table in following form
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
= = = = = = = = = = = = Temperature = = = = = = = = = = = =
Day 1960-1961 1961-1962 . . . . . . . 2017-2018
1
2
3
.
.
.
.
.
.
365
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Please tell me how to do this.
Thanks in advance.
Risposta accettata
Più risposte (1)
Joe Vinciguerra
il 17 Nov 2019
Modificato: Joe Vinciguerra
il 18 Nov 2019
Start by importing your data with either readtable or readtimetable (timetable is more powerful when working with dates, but can be more complicated).
The simplest next step is to create a second table formatted how you want your final data to look: first column is 1:366 (leap years), headers are the year labels. Header labels can be populated using something like
for i=1:57
h(i) = join([string(i+1959),'-',string(i+1960)])
end
Assign h to your table variable names.
Then use a for loop to populate the final table.
Here's one way to do it:
%% The following script assumes dataset is complete, contiguous, and contains more than 2 years of data or you will get errors
% Import the data:
dataInitial = readtimetable("Data.xlsx"); % import the data into a timetable
% Setup time-related variables
% YearFirst = min(year(dataInitial.Datetime)); % the first year in the dataset
YearFirst = 1960; % use this, OR the line above instead
% YearLast = max(year(dataInitial.Datetime)); % the last year in the dataset
YearLast = 2018; % use this, OR the line above instead
dateStart = datetime([YearFirst 6 1]); % the first date in the dataset
dateEnd = datetime([YearLast 5 31]); % the last date in the dataset
oneYear = calendarDuration([1 0 0]); % calendar duration signifying 1 year
bins = dateStart:oneYear:dateEnd; % array containing the start date of each column for the final table
nbins = length(bins); % the number of data columns in the final table
YearRange = YearFirst:YearLast; % array of all years in the dataset
header = strings(1,nbins); % preallocate variables that will be created in a loop
for i = 1:nbins
header(i) = join([string(YearRange(i)),'-',string(YearRange(i+1))]); % generate an array of strings for the headers
end
% setup the output table
tableSize = [366, nbins+1]; % the size of the final table (+1 for column containing day numbers)
varTypes = cell(1, nbins+1); % create an empty cell array for the variable types
varTypes(1) = {'int8'}; % use 'int8' for the day number column
varTypes(2:end) = {'double'}; % use 'double' for all other data elements
dataFinal = table('Size',tableSize,'VariableTypes',varTypes,'VariableNames',['day',header]); % create an empty table to store the results
% arrange the imported dataset into the formatted table
dataFinal.day = (1:366)'; % insert the day number into the first column
for i = 1:length(bins) % for each column of data...
dateRange = timerange(bins(i),bins(i) + oneYear); % identify the date range of interest for this column
dataTemporary = dataInitial(dateRange,:); % extra step for clarity
dataFinal(1:height(dataTemporary),i+1) = dataTemporary; % insert the data extracted into the final table where it belongs
end
11 Commenti
Parthu P
il 17 Nov 2019
Joe Vinciguerra
il 17 Nov 2019
Sure. How do you want to account for leap years? For example, do you want some rows to be 365 elements and others to be 366, or should they all be 366 with some having a zero or NaN on February 29th (day 60), etc?
Parthu P
il 17 Nov 2019
Joe Vinciguerra
il 18 Nov 2019
Modificato: Joe Vinciguerra
il 18 Nov 2019
I edited my original answer to reflect your request for more details.
Let me know if you have additional questions, else please accept my answer.
Cheers.
Parthu P
il 19 Nov 2019
Joe Vinciguerra
il 19 Nov 2019
I don't have that version installed, but it seems that the error describes that the table function doesn't like the string '1960 - 1961' as a variable name. Matlab generally prefers variables that start with letters and don't use spaces, so try this instead:
header(i) = join(['Y_',string(YearRange(i)),'-',string(YearRange(i+1))],'');
The ability to use any name for variables in a table was introduced in R2019b. Indeed, in previous versions table variables must follow the same rules as any variable in matlab, so must start with a letter and can't contain spaces or anything other than letters, numbers or _.
So that join is still not going to work because of the '-'.
Note that using compose or sprintf would make the string construction more readable:
header(i) = sprintf("Y%d_%d", YearRange(i), YearRange(i+1));
Joe Vinciguerra
il 19 Nov 2019
Guillaume, thanks for catching that "-" issue too.
Guillaume
il 20 Nov 2019
"but final table only shows appropriate headers but in data cells all 'zero'."
I'm not entirely sure what this mean. Anyway, with which code do you get that?
"I get following error"
This would indicate that for some reason i is a timetable, not the loop index. Anyway, I've given you much simpler code (just 4 lines) which will do the same job.
An easier way to get the timetable in pre-R2019a is with:
dataInitial = table2timetable(readtable('Data.xlsx'));
Parthu P
il 21 Nov 2019
Categorie
Scopri di più su Tables in Centro assistenza e File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!