Is it possible to train a NARX model using multiple data sets from the same time series?
For example:
I am training a NARX model in MATLAB to predict wind speed.
The NARX model is to be trained using exogenous training data (100 time steps x 3 inputs) and the corresponding wind speed values from 40 different locations, i.e. (100 x 3) x 40.
My questions are:
1) Is it possible to tell the NARX model that the training data comes from 40 different locations covering the same time steps (1, 2, 3, ..., 100)?
2) Assuming question 1 is achievable, once the NARX model is trained, is it possible to predict the wind speed at a 41st location over the same time steps (1, 2, 3, ..., 100) as the training data?
3) If so, is it possible for the NARX model to update the weights and biases specifically for the 41st location?
Any help appreciated.
If NARX modelling is not the answer to my problem, do you know of any other modelling technique that may be of interest to me?
Answers (2)
Greg Heath
on 2 Jul 2017
Theoretically, it can be done in a quadruple loop:
for m = 1:Ntrials               % repeated trials
    for n = 1:40                % 40 locations per trial
        for i = 3*n-2 : 3*n     % 3 inputs per location
            for j = 1:100-d     % 100 - d timesteps left after d delays
            end
        end
    end
end
Looks nasty. Look for a way to attack it via subproblems.
Hope this helps.
Good Luck!
Greg
2 Comments
Greg Heath
on 13 Jul 2017
Edited: Greg Heath
on 13 Jul 2017
After rereading your question I realize that I do not really understand your problem. Please take the time to explain it in more detail.
What are your inputs, and how many are there?
[ I N ] = size(inputmatrix) = [ ? ? ]
What are your targets, and how many are there?
[ O N ] = size(targetmatrix) = [ ? ? ]
Greg
christttttttophe
on 28 Jan 2020
Hi,
I don't know whether anybody has answered your question, but it does appear that MATLAB can train a NARX net with multiple trials (or sequences) recorded at the same time, with a few caveats. The main point is that you have to format your data correctly as cell arrays. It's a shame that the MATLAB documentation doesn't really cover this; it doesn't seem to be clear to anybody, and the examples do not address it.
Step 1: Each cell of the cell array represents one timestep of your data. For example, a 1x10 cell array holds 10 timesteps (cells 1 through 10). If you wanted 20 timesteps you would have a 1x20 cell array, and so on. This is the case for both your inputs and your targets.
Step 2: Now look inside each cell of the 1x10 cell array (still using 10 timesteps). Say I have 10 inputs corresponding to 1 target, and 20 trials, i.e. 20 sets of data recorded at the same time. Each cell in the input array will then be a 10x20 double, and each cell in the target array a 1x20 double. The data doesn't have to be double; that is just for the example.
Step 3: So you have two 1x10 cell arrays. Each cell contains 10x20 double data in the input array and 1x20 double data in the target array.
Step 4: Every cell must have the same size: 10x20 for the inputs and 1x20 for the targets. If your inputs or targets change in size from cell to cell, you will need to pad with NaNs to make every cell the same size.
To pad the targetSeries and inputSeries arrays you can use the code below:
% Pad targetSeries and inputSeries with NaNs so that all cells have the same width
maxSize = max(cellfun(@numel, targetSeries));                    % widest target row vector
fcn = @(x) [x nan(1, maxSize - numel(x))];                       % pad a row vector on the right
targetSeriesNN = cellfun(fcn, targetSeries, 'UniformOutput', false);
SizeofInputArray = size(inputSeries{1,1}, 1);                    % number of input rows
% size(x,2) counts columns; length(x) would return the larger dimension instead
maxSize = max(cellfun(@(x) size(x, 2), inputSeries));            % widest input cell
fcn = @(x) [x nan(SizeofInputArray, maxSize - size(x, 2))];      % pad columns on the right
inputSeriesNN = cellfun(fcn, inputSeries, 'UniformOutput', false);
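The cell layout described in Steps 1 to 3 can be sketched as follows. This is only an illustration under stated assumptions: the 3-D array rawX, the matrix rawT and the random placeholder data are hypothetical, and the sizes (10 timesteps, 10 inputs, 20 trials) are simply the ones used in the example above. Only base MATLAB functions are used.

```matlab
% Hypothetical sizes taken from the example in Steps 1 to 3.
nSteps  = 10;   % timesteps        -> number of cells
nInputs = 10;   % input variables  -> rows in each input cell
nTrials = 20;   % parallel trials  -> columns in each cell

% Placeholder data (random numbers stand in for real measurements).
rawX = rand(nSteps, nInputs, nTrials);   % timestep x input x trial
rawT = rand(nSteps, nTrials);            % timestep x trial

inputSeries  = cell(1, nSteps);
targetSeries = cell(1, nSteps);
for t = 1:nSteps
    % One cell per timestep: rows are variables, columns are trials.
    inputSeries{t}  = reshape(rawX(t, :, :), nInputs, nTrials);  % 10x20
    targetSeries{t} = rawT(t, :);                                 % 1x20
end

disp(size(inputSeries));       % 1 10
disp(size(inputSeries{1}));    % 10 20
disp(size(targetSeries{1}));   % 1 20
```

Column k of every cell then belongs to trial k, which is why all cells need the same number of columns.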
Step 5: Now you are ready to train. The intuitive approach is to apply the usual 70/15/15 split to the input/target data both across trials at each timestep and along time. Take the first cell, which holds the data for the first timestep: it contains 20 trials, so the first 14 would be used for training, the next 3 for test and the last 3 for validation, and every subsequent timestep would be split in the same way. The split also needs to be made in time, which makes this tricky: ideally the first 70% of the timesteps would be used for training and the last 30% for test and validation. So it seems divideblock needs to be combined with a sample-wise divide mode. The best I have come up with is net.divideMode = 'sampletime', but I am not sure that is right. Could somebody confirm or correct this? I would appreciate any feedback.
Step 6: Train away.
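For the split across trials mentioned in Step 5, here is a minimal sketch of how contiguous index blocks could be computed. The block order (training, validation, test) is an assumption made for this sketch. The commented lines at the end show one way the indices might be handed to the network through divideind with a sample-wise divide mode; that part is untested here and does not address the split along time.

```matlab
nTrials = 20;                          % trials (columns in each cell)
nTrain  = round(0.70 * nTrials);       % 14 trials for training
nVal    = round(0.15 * nTrials);       % 3 trials for validation
nTest   = nTrials - nTrain - nVal;     % remaining 3 trials for test

trainInd = 1:nTrain;                   % 1..14
valInd   = nTrain + (1:nVal);          % 15..17
testInd  = nTrain + nVal + (1:nTest);  % 18..20

% Assumed usage with the toolbox (not run here):
% net.divideFcn  = 'divideind';
% net.divideMode = 'sample';           % divide by trial, not by timestep
% net.divideParam.trainInd = trainInd;
% net.divideParam.valInd   = valInd;
% net.divideParam.testInd  = testInd;
```

Dividing by whole trials keeps every sequence intact within a single subset.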
One issue I have is that my inputs are not always the same size, so I get NaNs in the training or prediction output. The problem is that the delayed values are taken from exactly the same index in the earlier cells, so a NaN in one cell carries through to later outputs at that index. I naively thought the training used all the inputs at every index.
So if my first input is 1x15, padded to 1x20 with NaNs,
my second input is 1x20,
and my third input is 1x20,
then my output will be NaN in columns 16 to 20 because of the NaN padding in the first input.
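The effect can be illustrated with a small bookkeeping sketch. It does not reproduce what the toolbox does internally; it only marks the columns that contain a NaN somewhere in a window of three consecutive timesteps, assuming the delays reach back over all three. The variable names are hypothetical.

```matlab
% Three consecutive timesteps, one input row, 20 trials (columns).
step1 = [1:15, nan(1, 5)];   % originally 1x15, padded to 1x20 with NaNs
step2 = 1:20;
step3 = 1:20;
window = {step1, step2, step3};

% A column is affected if any timestep in the window holds a NaN there.
affected = false(1, 20);
for t = 1:numel(window)
    affected = affected | isnan(window{t});
end
affectedCols = find(affected);
disp(affectedCols);          % 16 17 18 19 20
```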
1 Comment
Torsten K
on 12 Sep 2020
Edited: Torsten K
on 12 Sep 2020
Hello Christophe,
thank you for your instructions for data preparation and training of a NARX network with several input sequences.
I transferred my data (3 input variables, 1 target, 120 time series of different lengths, maximum length 2591) into two cell arrays, X_mul and T_mul. X_mul has dimension 1x2591 and contains a 3x120 double array in each cell. T_mul has dimension 1x2591 and contains a 1x120 double array in each cell.
X_mul = 1×2591 cell array
Columns 1 through 9
{3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double}
[...]
Columns 2584 through 2591
{3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double} {3×120 double}
My open-loop training looks like this:
net = narxnet(1:10,1:10,5);
net.name = ['NARX-Net (Open Loop)'];
% Initialise weight and bias values (see '>> doc init')
net.initFcn = 'initlay';
net.layers{1}.initFcn = 'initwb';
net.inputWeights{1,1}.initFcn = 'rands';
net.biases{1,1}.initFcn = 'rands';
net.layers{2}.initFcn = 'initwb';
net.layerWeights{2,1}.initFcn = 'rands';
net.biases{2,1}.initFcn = 'rands';
net.divideFcn = '';
net.trainFcn = 'trainlm';
net.layers{1}.transferFcn = 'tansig';
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.inputs{2}.processFcns = {'removeconstantrows','mapminmax'};
[Xs1,Xi1,Ai1,Ts1] = preparets(net,X_mul,{},T_mul);
[net,tr] = train(net,Xs1,Ts1,Xi1,Ai1);
Ys1 = net(Xs1,Xi1,Ai1);
Now the following is still unclear to me:
1) How can I achieve that the training is divided into 70% training, 15% validation and 15% test data, i.e. time series 1..84 for training, time series 85..102 for validation and the time series 103..120 for the test without the time series being torn apart?
2) How can you graphically display the result for all 120 time series with 'plotresponse'? To do this, I somehow have to extract the time series individually and feed them into the trained network, but I don't know how.
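As a starting point for question 2, here is a minimal sketch of how a single time series could be pulled out of such cell arrays. It assumes the shorter series were padded with NaN at the end, which the post does not state, and it uses a tiny stand-in for X_mul and T_mul (6 timesteps, 4 series) so that it runs with base MATLAB only. The commented lines at the end are an untested guess at how the extracted series might be passed to the trained network and to plotresponse.

```matlab
% Small stand-in for X_mul / T_mul: 6 timesteps, 3 inputs, 4 series.
nSteps = 6; nInputs = 3; nSeq = 4;
X_mul = cell(1, nSteps);
T_mul = cell(1, nSteps);
for t = 1:nSteps
    X_mul{t} = t * ones(nInputs, nSeq);
    T_mul{t} = t * ones(1, nSeq);
end
% Assumption: series 2 is only 4 steps long and NaN-padded afterwards.
for t = 5:nSteps
    X_mul{t}(:, 2) = NaN;
    T_mul{t}(2)    = NaN;
end

k = 2;                                                       % series to extract
Xk = cellfun(@(c) c(:, k), X_mul, 'UniformOutput', false);   % 1x6 cell of 3x1
Tk = cellfun(@(c) c(:, k), T_mul, 'UniformOutput', false);   % 1x6 cell of 1x1

% Drop the NaN padding at the end of the series.
valid    = cellfun(@(c) ~any(isnan(c)), Tk);
lastStep = find(valid, 1, 'last');
Xk = Xk(1:lastStep);
Tk = Tk(1:lastStep);

% Assumed usage with the trained network (not run here):
% [Xs, Xi, Ai, Ts] = preparets(net, Xk, {}, Tk);
% Yk = net(Xs, Xi, Ai);
% plotresponse(Ts, Yk);
```

Looping k over 1:120 would then give one response plot per time series.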
I would be very grateful for an idea or a hint!
Best wishes
Torsten