train timedelaynet with large data set
First, I am using MATLAB R2018a on an Ubuntu 18.04 machine with a 2.7 GHz Intel Core i7 CPU (dual-core with hyper-threading, 4 threads), 16 GB RAM, and 32 GB swap.
I want to train a timedelaynet classifier (output transferFcn = logsig) on signal input data. I have 30 concurrent input-target data pairs (X, T); X's dimensions are approximately 2 x 1,080,000 (the second dimension varies slightly across the concurrent data sets), and T's dimensions are 3 x size(X,2).
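For reference, the setup looks roughly like this (the delay range and hidden layer size below are placeholders, not my actual values):

```matlab
% Rough sketch of the network setup; 1:2 delays and 10 hidden
% neurons are placeholder values for illustration only
net = timedelaynet(1:2, 10);
net.layers{end}.transferFcn = 'logsig';  % logsig on the output layer
```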
I did a manual analysis to identify similar behaviours as best I could, and reduced from 30 to 11 concurrent data sets. But then I need to check how well it generalizes on unseen data, and I will probably have to add a few more.
Being time series, X and T are cell arrays, and this is what I did to concatenate the concurrent sets.
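Since the individual sequences have slightly different lengths, combining them as concurrent samples needs padding; a sketch of what I mean, where allSets is a hypothetical cell array holding the individual recordings:

```matlab
% allSets{k} is the k-th concurrent set: a 1-by-Tk cell array of
% 2-by-1 input vectors (allSets is a placeholder name for illustration)
X = catsamples(allSets{:}, 'pad');  % pad shorter sequences so lengths match
```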
I know I can do feature extraction such as wave band powers (alpha, beta, etc.), which I did, but I want to compare the results of nets trained on the extracted features against nets trained on the raw input signal.
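For the band-power features, I mean something along these lines (Signal Processing Toolbox; the sampling rate and band edges below are just example values):

```matlab
fs = 250;                              % example sampling rate in Hz
x  = randn(fs*10, 2);                  % 10 s of a 2-channel signal, columns = channels
alphaPow = bandpower(x, fs, [8 12]);   % alpha band power, one value per channel
betaPow  = bandpower(x, fs, [13 30]);  % beta band power, one value per channel
```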
I've seen it is usually suggested to divide the input dataset into subsets, train a net on each, and then create a new net whose inputs are the outputs of those nets (in my case, that would mean one net per concurrent set). However, this is not an option for me (at least at first thought), since the objective is to use this for real-time reading of the input signal, and any detour would delay the output (even a feature-extraction net would add delay to its output).
Now, if I only had to train one network it wouldn't be much of an issue (just wait), but I'm applying a genetic algorithm to find the best network, so I would like this to run in as little time as possible.
I am using the Parallel Computing Toolbox in two ways:
- parallelizing the for loop that trains the networks of each generation, training with 'useParallel','no'
- a normal for loop, but training with 'useParallel','yes'.
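In code, the two setups look roughly like this (popSize and candidates are placeholder names standing in for my GA machinery; Xs, Ts, Xi, Ai are the prepared time series):

```matlab
% Option 1: parallelize across the GA population, serial training
parfor i = 1:popSize
    nets{i} = train(candidates{i}, Xs, Ts, Xi, Ai, 'useParallel', 'no');
end

% Option 2: plain loop, parallel training inside each train call
for i = 1:popSize
    nets{i} = train(candidates{i}, Xs, Ts, Xi, Ai, 'useParallel', 'yes');
end
```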
With small data sets, the first option turned out to be faster, since apparently the second option doesn't use all available threads, despite explicitly setting maxNumCompThreads(4). With the large dataset, the first option ran for two days without finishing the training of a single network. Now I will test the second option with the large dataset.
Another note: after executing the train command, it takes a long time before training actually starts...
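I'm not certain this is where that startup time goes, but one thing I can try: for a timedelaynet the data has to be time-shifted with preparets to fill the tapped delay lines, and if all candidate networks in a generation use the same input delays, that step can be done once outside the loop and the results reused (a sketch, assuming identical delay settings across candidates):

```matlab
% Shift the data once for the common delay configuration...
[Xs, Xi, Ai, Ts] = preparets(net, X, T);
% ...then reuse Xs, Ts, Xi, Ai for every candidate with the same delays
net = train(net, Xs, Ts, Xi, Ai);
```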
So, is there anything else I can do to speed things up (other than getting a good server; I've been doing everything on my personal computer so far)?