
This question is closed. Reopen it to edit or answer.

Neural network: train() behavior with earlier results

Akshay Joshi on 28 Jan 2018
Closed: MATLAB Answer Bot on 20 Aug 2021
I have a very large dataset (around 150 GB) that I need to process using neural networks. Because the data is too big to load at once, I have to break it into chunks: for example, 5000 elements are sent as 20 batches of 250 elements each. The following dummy code illustrates this:
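batch_size = 250;   % elements per batch (5000 elements / 20 batches)
for count = 1:num_batches
    % Column range of this batch; assumes samples are stored as columns
    idx = (1 + (count-1)*batch_size) : (count*batch_size);
    inputs = entire_input(:, idx);
    targets = entire_targets(:, idx);
    net = train(net, inputs, targets);
end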
Will the net start training from scratch with each fresh batch, or will it retain the weights calculated for the previous batch? From some of my discussions and findings, with each new batch the weights start adapting to the current data and may overwrite the previous weights.
Please advise whether this method works well, or whether we should use some other method instead of train().
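A minimal sketch for checking this empirically, under stated assumptions. As far as I understand, train() continues from the network's current weights, and only init or configure resets them. So the practical question is how much each new batch degrades the fit on earlier batches. The data below are synthetic stand-ins, samples are assumed to be stored as columns, and names such as err_on_first are hypothetical.

```matlab
rng(0);                                  % reproducible stand-in data
X = rand(4, 1000);                       % assumption: 4 features, 1000 samples (columns)
T = sum(X, 1);                           % assumption: simple regression target
batch_size  = 250;
num_batches = size(X, 2) / batch_size;   % 4 batches

net = feedforwardnet(10);                % one hidden layer with 10 neurons
net.trainParam.showWindow = false;       % suppress the training GUI
net.trainParam.epochs = 20;              % epochs per call to train()

first_idx    = 1:batch_size;             % columns belonging to batch 1
err_on_first = zeros(1, num_batches);
for count = 1:num_batches
    idx = (1 + (count-1)*batch_size) : (count*batch_size);
    net = train(net, X(:, idx), T(:, idx));   % no init() call between batches
    % Error on batch 1 after training on batch `count`
    err_on_first(count) = perform(net, T(:, first_idx), net(X(:, first_idx)));
end
disp(err_on_first)
```

If err_on_first grows as later batches are trained, the later batches are overwriting part of what was learned from batch 1.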

Answers (1)

Greg Heath on 29 Jan 2018
"Need to process" doesn't provide useful information.
What are you trying to design? Curvefitter/Regressor? PatternRecognizer/Unsupervised-Classifier/Supervised-Classifier? Timeseries??
In all cases, the training, validation and test data should have similar summary statistics in every batch. Otherwise, training on batch n will erase some of what was learned in batches 1 to n-1.
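One way to check the summary statistics, as a rough sketch rather than a prescribed procedure: compare the per-batch feature means before and after shuffling the sample order. The data are synthetic stand-ins, deliberately sorted so that consecutive batches differ. Samples are assumed to be columns, and all variable names are hypothetical.

```matlab
rng(0);                              % reproducible stand-in data
X = sort(rand(3, 1000), 2);          % assumption: 3 features, sorted so batches differ
T = sum(X, 1);                       % assumption: stand-in targets
batch_size  = 250;
num_batches = size(X, 2) / batch_size;

perm = randperm(size(X, 2));         % one permutation, applied to inputs AND targets
Xs = X(:, perm);
Ts = T(:, perm);                     % targets must follow the same reordering

means_sorted   = zeros(size(X, 1), num_batches);
means_shuffled = zeros(size(X, 1), num_batches);
for count = 1:num_batches
    idx = (1 + (count-1)*batch_size) : (count*batch_size);
    means_sorted(:, count)   = mean(X(:, idx), 2);    % per-feature mean, original order
    means_shuffled(:, count) = mean(Xs(:, idx), 2);   % per-feature mean, shuffled order
end

% Spread of the per-batch means for each feature (smaller = more similar batches)
spread_sorted   = max(means_sorted, [], 2)   - min(means_sorted, [], 2);
spread_shuffled = max(means_shuffled, [], 2) - min(means_shuffled, [], 2);
disp([spread_sorted, spread_shuffled])
```

For data that do not fit in memory, the same permutation idea would have to be applied to sample indices on disk instead of to the matrices themselves.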
Your response should be far less vague than your original explanation.
Hope this helps.
Greg
Thank you for formally accepting my answer
1 Comment
Akshay Joshi on 29 Jan 2018
Edited: Akshay Joshi on 29 Jan 2018
Hi Greg,
I'm trying to design a supervised classifier using a multilayer perceptron (feedforwardnet). The input matrix has dimensions 500,000 x 25, and the output matrix 5,000 x 25.
Initially, I tried to train my network using nntool, but I was unable to feed a dataset this large (150 GB) into it because of memory constraints, so I decided to break the data into chunks. For this purpose, I'm writing a MATLAB script that creates the neural network and provides the input in chunks.
"In all cases, training, validation and test data should have similar summary statistics in all run batches. Otherwise training batch n will erase some of what is learned in batches 1 to n-1."
Can you suggest a method by which we can retain what was learned from batches 1 to n-1 and, based on that, calculate the result for, say, batches n to n+k?
Thanks for the earlier response. Hope I'm clear this time.
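One possibility, sketched here under assumptions and not taken from any reply in this thread, is to make several passes over the chunks in random order with only a few epochs per chunk, so that no single chunk dominates the weights. The data are synthetic stand-ins with samples as columns. The pass count, the epoch count, and names such as perf_all are hypothetical choices.

```matlab
rng(0);                                  % reproducible stand-in data
X = rand(4, 1000);                       % assumption: 4 features, 1000 samples (columns)
T = sum(X, 1);                           % assumption: simple regression target
batch_size  = 250;
num_batches = size(X, 2) / batch_size;
num_passes  = 3;                         % hypothetical number of passes over all chunks

net = feedforwardnet(10);
net.trainParam.showWindow = false;       % suppress the training GUI
net.trainParam.epochs = 5;               % few epochs per chunk to limit drift

perf_all = zeros(1, num_passes);
for pass = 1:num_passes
    for count = randperm(num_batches)    % visit chunks in a new random order each pass
        idx = (1 + (count-1)*batch_size) : (count*batch_size);
        net = train(net, X(:, idx), T(:, idx));   % weights carry over between calls
    end
    % Error on the full stand-in data set after each pass
    perf_all(pass) = perform(net, T, net(X));
end
disp(perf_all)
```

With the real 150 GB dataset, each chunk would be loaded from disk inside the inner loop, and perf_all would be computed on a held-out subset small enough to fit in memory.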
