How can I improve the performance of a closed-loop NARX neural network?
I've created an open-loop NARX network for system identification with the ntstool toolbox. My open-loop network's performance (MSE) is 1e-07, but when I close the loop the error increases sharply, to around 1.15. Is there anything I can do to improve the closed-loop performance?
0 Comments
Accepted Answer
Greg Heath on 13 Feb 2013
        Try narxnet. Use the autocorrelation of the target and the crosscorrelation of input and target to find the lags that are statistically significant for input and feedback delays.
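For concreteness, here is a minimal sketch of that lag-selection step. It assumes the Signal Processing Toolbox (xcorr) and the Statistics Toolbox (zscore); the dataset, maxlag = 20, and H = 10 are illustrative choices only, not values from this thread:

 % Sketch: choose ID/FD from statistically significant correlation lags
 [x, t] = simplenarx_dataset;
 X = zscore(cell2mat(x));                  % standardize: zero mean, unit variance
 T = zscore(cell2mat(t));
 N = length(T);
 [cxt, lagsx] = xcorr(T, X, 20, 'coeff');  % input-target crosscorrelation
 [att, lagst] = xcorr(T, 20, 'coeff');     % target autocorrelation
 cxt = cxt(:); att = att(:); lagsx = lagsx(:); lagst = lagst(:);
 sig = 1.96/sqrt(N);                       % rough 95% significance level
 ID = lagsx(lagsx >= 1 & abs(cxt) > sig)'  % candidate input delays
 FD = lagst(lagst >= 1 & abs(att) > sig)'  % candidate feedback delays
 % inspect ID and FD (and trim them if empty or too long) before building:
 net = narxnet(ID, FD, 10);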
Hope this helps.
Thank you for formally accepting my answer
Greg
4 Comments
Lucas Ferreira-Correia on 18 Jul 2019
				I realise this thread is quite old, but by original data do you mean the same data used to train the open loop?
Juan Hynek on 26 Aug 2019 (edited 17 Oct 2019)
			Hi Lucas,
I have received good results from using the same data but using the weights determined by open-loop training as the starting point for closed-loop training, as in the sketch below. Also, when working with large datasets, make sure to initialise open-loop training with predetermined weights. This will help avoid local minima.
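A minimal sketch of that warm-start procedure (the dataset, delays, and H below are illustrative, not from this thread):

 % Sketch: train open loop, then close the loop keeping the trained weights
 [x, t] = simplenarx_dataset;             % illustrative data
 net = narxnet(1:2, 1:2, 10);             % illustrative ID, FD, H
 net.divideFcn = 'divideblock';
 [Xo, Xio, Aio, To] = preparets(net, x, {}, t);
 net = train(net, Xo, To, Xio, Aio);      % open-loop training
 netc = closeloop(net);                   % closed-loop net starts from open-loop weights
 [Xc, Xic, Aic, Tc] = preparets(netc, x, {}, t);
 netc = train(netc, Xc, Tc, Xic, Aic);    % closed-loop retraining from that starting point
 yc = netc(Xc, Xic, Aic);
 mseclosed = mse(netc, Tc, yc)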
More Answers (6)
Greg Heath on 21 Feb 2013 (edited 21 Feb 2013)
I have run ramin's 11 Feb 2013 at 13:13 timedelaynet code, with modifications, on the simpleseries_dataset using the same input parameters (except for series length and 'divideblock'):
 size(input)  = [1 100]
 size(target) = [1 100]
 ID = 0:9
 H  = 65
 trn/val/tst ratios = 0.65/0.20/0.15
 divideFcn = 'dividerand'   % default, but I used 'divideblock'
 net.trainParam.goal = 0    % default, but I used 0.01*MSEtrn00
 Ntrials = 20               % multiple random weight initializations
To begin with, there are two obvious suspects.
1. 'dividerand' is used, which destroys all of the correlations on which good timeseries performance is based.
2. H is HUGE. Therefore there is a good chance the open-loop design is overfit, with too many weights. If the overfit net is also overtrained, it will fit the training data almost perfectly but may fit nontraining data very, very badly.
Recall that if there are only Ntrneq = Ntrn*O = 60 training equations to estimate Nw = (10+1)*65 + (65+1)*1 = 781 unknown weights, there are infinitely many solutions that yield approximately zero training error. However, most of these solutions will not yield acceptable nontraining error.
Unfortunately, the documentation examples and GUI-based codes only yield the combined performance on trn/val/tst data without looking at each one separately. The total performance measure looked great, with R^2 close to 1. However, when I separated the performances I found that the training (65%) data R^2 were very close to unity, but the validation (20%) and test (15%) data performances were so bad they were NEGATIVE! That means that just using the mean target value as the output yields better nontraining performance (R^2 = 0).
So, remember, when evaluating a net, look at the nontraining data performances!
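One way to do that separation, sketched here with illustrative data and parameters (not ramin's actual setup), is to use the index sets in the training record returned by train:

 % Sketch: compute separate trn/val/tst R^2 from the training record tr
 [x, t] = simplenarx_dataset;                   % illustrative data
 net = narxnet(1:2, 1:2, 8);                    % illustrative ID, FD, H
 net.divideFcn = 'divideblock';
 [Xo, Xio, Aio, To] = preparets(net, x, {}, t);
 [net, tr] = train(net, Xo, To, Xio, Aio);
 y = cell2mat(net(Xo, Xio, Aio));
 T = cell2mat(To);
 sets = {tr.trainInd, tr.valInd, tr.testInd};
 R2 = zeros(1, 3);
 for k = 1:3
     e = T(sets{k}) - y(sets{k});
     R2(k) = 1 - mean(e.^2)/var(T(sets{k}), 1); % negative => worse than mean(T)
 end
 R2                                             % [ R2trn R2val R2tst ]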
Bottom line: Your open loop performance was really terrible and your closed loop performance just followed suit.
result =
     Trial   Epochs    R2trn       R2val        R2tst
       1       2      0.99908    -1.6637      -0.76142
       2       2      0.99934    -0.55658     -2.136
       3       2      0.99985    -1.5766      -1.24
       4       2      0.99956    -0.93949     -0.60495
       5       2      0.99396    -1.3556      -1.038
       6       2      0.99935    -3.1616      -2.636
       7       2      0.9994     -0.080548    -1.7186
       8       2      0.99993    -0.18275     -0.27919
       9       2      0.99858     0.043727    -0.83779
      10       2      0.99989    -3.2046      -2.1909
      11       2      0.99934    -1.2554      -1.434
      12       2      0.99894    -0.56353     -0.40946
      13       2      0.99986    -1.3566      -2.1323
      14       2      0.99971    -1.1912      -0.012865
      15       2      0.99992    -0.69673     -1.9198
      16       2      0.99984    -1.7842      -0.58905
      17       2      0.99969    -0.71866     -1.226
      18       2      0.99945    -1.0188      -5.066
      19       2      0.99993    -3.1073      -1.3749
      20       2      0.99994    -3.5529      -2.9
Hope this helps,
Greg
P.S. The largest value for H that will keep Nw < Ntrneq is Hub = 8!
0 Comments
Greg Heath on 21 Feb 2013
FRANCISCO on 17 Feb 2013 at 17:09 wrote:
% I've been testing my code in pollution_dataset with indications dao me and I have several questions that I would like to comment:
What does "dao" mean?
% 1 - VARIABLES ON DIFFERENT SCALES. SHOULD NORMALIZE. Did you normalize the pollution_dataset variables?
No. I standardized the variables (help/doc zscore) to have zero-mean and unit-variance.
% 3 - NARXNET INPUTS ARE NOT OPTIMAL. FIND SIGNIFICANT AUTO AND CROSSCORRELATIONS. VARY H IF NEEDED. How do you do this step? Train the network and observe the autocorrelation and crosscorrelation charts, checking whether the errors are within the confidence limits (dotted red line)?
No. Calculate the auto and cross correlation functions of the data. Use xcorr or crosscorr if you have the appropriate toolboxes. Otherwise use nncorr, which has bugs that are corrected in some of my recent code (search NEWSGROUP and ANSWERS using greg nncorr). You can also use ifft(conj(fft(a)).*fft(b))/N.
If you can't find the significance levels in the program or manuals, then calculate the crosscorrelation of two randn(1,N) sequences. Sort the 2*N-1 values; the one that is 95% of the way from the beginning approximates the 95% significance threshold. I typically run 100 trials and average the results, as in the sketch below.
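A sketch of that Monte Carlo estimate, following the description above (xcorr from the Signal Processing Toolbox; N = 100 and Ntrials = 100 match the values mentioned):

 % Sketch: estimate the 95% significance level of the crosscorrelation
 % of two length-N white-noise sequences, averaged over 100 trials
 N = 100;  Ntrials = 100;
 thresh = zeros(1, Ntrials);
 for k = 1:Ntrials
     c = xcorr(randn(1, N), randn(1, N), 'coeff');  % 2*N-1 values
     s = sort(c);
     thresh(k) = s(round(0.95*(2*N - 1)));          % value 95% from the beginning
 end
 sig95 = mean(thresh)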
% 4 - DO NOT USE THE DEFAULT DIVIDERAND. I've also tried dividetrain, but the data will not let me divide into train, validation, and test because, I guess, as in the help, it trains with all targets. Still, I think it makes a lot of error. I'd like you to tell me how you use dividetrain.
There is no division in dividetrain. Use divideblock.
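For reference, a sketch of switching a net to 'divideblock' (the delays, H, and ratios here are illustrative):

 % Sketch: 'divideblock' splits the series into contiguous blocks,
 % preserving the temporal order
 net = narxnet(1:2, 1:2, 8);
 net.divideFcn = 'divideblock';
 net.divideParam.trainRatio = 0.70;
 net.divideParam.valRatio   = 0.15;
 net.divideParam.testRatio  = 0.15;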
% 5 - INITIALIZE THE RNG SO RUNS CAN BE DUPLICATED. How do I do it, and when? After open-loop training and simulation? Do I use rng(sprev)?
Before the double loop over H and random initial weights. I have posted many, many examples.
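A sketch of that ordering (the delay sets, H range, and Ntrials are illustrative):

 % Sketch: seed the RNG once, before the double loop over H and weight trials,
 % so the whole sequence of random initializations can be reproduced
 ID = 1:2;  FD = 1:2;                       % illustrative delay sets
 Hmin = 1;  dH = 1;  Hmax = 8;  Ntrials = 10;
 rng(0)                                     % fixed seed; rerunning duplicates every design
 for h = Hmin:dH:Hmax
     for trial = 1:Ntrials
         net = narxnet(ID, FD, h);          % train() draws new random weights each trial
         % design and evaluate the candidate net here
     end
 end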
% 6 - NORMALIZE THE MSE BY THE TARGET VARIANCE TO OBTAIN THE COEFFICIENT OF DETERMINATION R^2. I have read the wikipedia article and do not know how to do this part. Is there some code where I could see it? I find it hard to understand.
Search greg R2 R2a
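The computation itself is short; a sketch with made-up target and output (the Ndof adjustment is the one defined in the 16 Feb answer below):

 % Sketch: R^2 is one minus the MSE normalized by the (biased) target variance
 t = randn(1, 100);                  % illustrative target
 y = t + 0.1*randn(1, 100);          % illustrative model output
 e = t - y;
 R2 = 1 - mean(e.^2)/var(t, 1)       % R2 = 0 means "no better than mean(t)"
 % Adjusted version R2a: replace mean(e.^2) with sum(e.^2)/Ndof, Ndof = Ntrneq - Nw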
% 7 - H ~ Hub/10. I do not understand this indication.
Ntrneq > Nw when H <= Hub
Search greg Nw Hub
% If you know of a place where I could see the complete code concerning these questions, applied to pollution_dataset or other data, I'd appreciate it, in order to understand it better.
I had to revise the one I was working on when I found ramin's overfitting problem. It is a hobby, not a priority, so I cannot promise when I will finish.
2 Comments
FRANCISCO on 21 Feb 2013
I think I understand the idea:
1 - In the autocorrelation and crosscorrelation I find the highest peaks, and these lags indicate the delays I should use.
2 - The balance between equations and weights gives a rough estimate of H.
3 - In the error evaluation it is recommended to study the errors separately (train, validation, test); how would you do this?
4 - I believe that implementing cross-validation would improve the accuracy of the prediction; of course I would have to try it.
Thank you very much for your time Greg
Greg Heath on 30 Jul 2014
				How would 10-fold crossvalidation be implemented on a timeseries where the order of surrounding data needs to be maintained???
I don't see it.
Greg
Shashank Prasanna on 8 Feb 2013
Honestly, I wish there were one quick answer to this question, but in reality there isn't. Make sure your training set includes all the dynamics you wish to see in the model. In addition, how does your data look? Is there a trend? If the data is not stationary, NARX may not do a great job.
Greg Heath on 9 Feb 2013
See the answer I just posted to FRANCISCO re his implementation. Like him, don't expect to get a good answer to your question without posting your code.
Running your code on MATLAB data
help nndata
will probably help us help you.
Greg
3 Comments
Greg Heath on 16 Feb 2013 (edited 16 Feb 2013)
      NARXNET is the most general timeseries design function. TIMEDELAYNET and NARNET are special cases.
narxnet(ID,[],H) should yield the same results as timedelaynet.
narxnet([],FD,H) should yield the same results as narnet.
DO NOT USE the default 'dividerand'. Random sampling to create the trn/val/tst data division destroys correlations between the current output and the delayed inputs and outputs.
'divideblock' is probably the smartest choice. Although 'divideint' also yields uniform timesteps, it increases the timestep by a factor of 3.
Choose ID from the significant lags of the crosscorrelation function.
Choose FD from the significant lags of the autocorrelation function.
Do not use a very large value of H. That is analogous to fitting a noisy straight line with a high order polynomial. It will have erroneous wiggles between training points as well as before and after the domain of the training points. It neither interpolates well nor extrapolates well. So, even though the openloop results are acceptable, or even great, closed loop performance may be disastrously poor!
I use the estimated number of degrees of freedom as a guide:
 Ndof = Ntrneq - Nw
where
 Ntrneq = No. of training equations = prod(size(ttrn)) = Ntrn*O
and
 Nw = No. of weights to be estimated = net.numWeightElements.
If MXID = max(ID) and MXFD = max(FD), then for I-dimensional inputs and O-dimensional outputs,
 Nw = (MXID*I + MXFD*O + 1)*H + (H+1)*O
and the requirement Ntrneq > Nw yields the upper bound
 Hub = -1 + ceil( (Ntrneq - O) / (MXID*I + MXFD*O + O + 1) )
This can be exceeded using validation stopping and/or regularization. However, without them H << Hub is the best bet to mitigate noise and measurement error. The optimal value of the ratio r = Ntrneq/Nw depends on the data. However, I feel relatively safe with
 Hmax = floor(Hub/10)
but will try values up to floor(Hub/2) if necessary. Typically, I choose a reasonable range and spacing for candidate values of H (Hmin: dH : Hmax), and design numH*Ntrials candidate nets where Ntrials is the number of random initial weight configurations for each design. If the training error is etrn = ttrn-ytrn, the lowest training set error is estimated using
 MSEtrna = sse(etrn)/Ndof.
The DOF "a"djustment is used to decrease the bias in MSEtrn = sse(etrn)/Ntrneq caused by using the same data to estimate weights and evaluate the resulting performance.
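A worked sketch of these bounds and the adjusted MSE (all numbers below are illustrative, not from a real design):

 % Sketch: upper bound Hub on hidden nodes, conservative Hmax, and MSEtrna
 I = 1;  O = 1;                     % illustrative input/output dimensions
 MXID = 2;  MXFD = 2;               % illustrative max input/feedback delays
 Ntrn = 350;  Ntrneq = Ntrn*O;      % illustrative no. of training equations
 Hub  = -1 + ceil((Ntrneq - O)/(MXID*I + MXFD*O + O + 1));
 Hmax = floor(Hub/10)               % the conservative default above
 H    = Hmax;
 Nw   = (MXID*I + MXFD*O + 1)*H + (H + 1)*O;
 Ndof = Ntrneq - Nw;
 etrn = randn(1, Ntrneq);           % illustrative training errors
 MSEtrna = sum(etrn.^2)/Ndof        % DOF-adjusted training MSE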
Your poor closed-loop performance could have been caused by the combination of using 'dividerand' with a very large H.
Hope this helps.
Thank you for formally accepting my answer.
Greg
3 Comments
Greg Heath on 16 Feb 2013
You should be able to get better closed-loop results without resorting to crossval, for which the NNTBX has no function. I fixed your code and ran it on the pollution_dataset with interesting results: R2a = 0.92 for open loop and 0.88 for closed loop, using ID = 1:2, FD = 1:2, H = 16, and divideFcn = 'dividetrain' (ID and FD are the defaults, and H ~ Hub/10); the setup is sketched below.
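A sketch of that configuration (the parameter values are the ones reported above, but this is the standard open/closed-loop toolbox pattern, not Greg's actual fixed code):

 % Sketch: the setup described above, on pollution_dataset
 [x, t] = pollution_dataset;
 net = narxnet(1:2, 1:2, 16);            % default ID, FD; H ~ Hub/10
 net.divideFcn = 'dividetrain';
 [Xo, Xio, Aio, To] = preparets(net, x, {}, t);
 net = train(net, Xo, To, Xio, Aio);     % open loop
 netc = closeloop(net);
 [Xc, Xic, Aic, Tc] = preparets(netc, x, {}, t);
 yc = netc(Xc, Xic, Aic);                % closed-loop prediction
 mseclosed = mse(netc, Tc, yc)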
FRANCISCO on 17 Feb 2013
I've been testing my code in pollution_dataset with indications dao me and I have several questions that I would like to comment:
1 - VARIABLES ON DIFFERENT SCALES. SHOULD NORMALIZE. Did you normalize the pollution_dataset variables?
3 - NARXNET INPUTS ARE NOT OPTIMAL. FIND SIGNIFICANT AUTO AND CROSSCORRELATIONS. VARY H IF NEEDED. How do you do this step? Train the network and observe the autocorrelation and crosscorrelation charts, checking whether the errors are within the confidence limits (dotted red line)?
4 - DO NOT USE THE DEFAULT DIVIDERAND. I've also tried dividetrain, but the data will not let me divide into train, validation, and test because, I guess, as in the help, it trains with all targets. Still, I think it makes a lot of error. I'd like you to tell me how you use dividetrain.
5 - INITIALIZE THE RNG SO RUNS CAN BE DUPLICATED. How do I do it, and when? After open-loop training and simulation? Do I use rng(sprev)?
6 - NORMALIZE THE MSE BY THE TARGET VARIANCE TO OBTAIN THE COEFFICIENT OF DETERMINATION R^2. I have read the wikipedia article and do not know how to do this part. Is there some code where I could see it? I find it hard to understand.
7 - H ~ Hub/10. I do not understand this indication.
If you know of a place where I could see the complete code concerning these questions, applied to pollution_dataset or other data, I'd appreciate it, in order to understand it better.
Thank you very much, Greg.