Dealing with large training datasets saved in a number of .mat files
Hello all,
I have run into a problem where I need to train an LSTM signal classifier with a huge amount of data.
Each 1-D signal is around 100k samples long, and every 48 signals are saved in one .mat file. There are around 2000 .mat files in total.
The labels are saved the same way, in corresponding .mat files in a different folder.
I would like to know if there is a way to train the network without loading the whole dataset into memory. (With 64 GB of RAM I can only load about 1300 files at once.)
Your help will be very much appreciated.
More Answers (1)
Frantz Bouchereau
on 20 Aug 2020
Edited: Frantz Bouchereau
on 20 Aug 2020
Ruohao,
You can use two signalDatastore objects: one to read your signal files and another to read your labels. You can then combine them using combine(), split the combined datastore into training and test sets using subset(), and feed the resulting datastores into the training function of the LSTM network. The datastores read the files a few at a time as the training function asks for data, so the full dataset never has to be in memory at once.
With signalDatastore you do not need to write a load function. You specify the names of the variables you want read from each MAT-file, and those are returned at every read.
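To make this concrete, here is a rough sketch of that workflow. It is not tested against your data and makes several assumptions you will need to adjust: the folder names, the file names, and the variable names `signals` and `labels` are made up; each signal file is assumed to hold a cell array of column vectors and each label file a categorical vector of the same length; and signal and label files are assumed to share names so both datastores list them in the same order. The first block only writes a few tiny stand-in files so the sketch runs on its own.

```matlab
% Stand-in data (hypothetical layout): 4 small files instead of ~2000,
% 3 short signals per file instead of 48 signals of ~100k samples.
sigDir = fullfile(tempdir, 'sds_demo', 'signals');
lblDir = fullfile(tempdir, 'sds_demo', 'labels');
if ~exist(sigDir, 'dir'), mkdir(sigDir); end
if ~exist(lblDir, 'dir'), mkdir(lblDir); end
for k = 1:4
    signals = {randn(100,1); randn(100,1); randn(100,1)};
    labels  = categorical({'a'; 'b'; 'a'}, {'a', 'b'});
    save(fullfile(sigDir, sprintf('rec_%04d.mat', k)), 'signals');
    save(fullfile(lblDir, sprintf('rec_%04d.mat', k)), 'labels');
end

% One datastore per folder; each read returns the named variable of one file.
sdsSig = signalDatastore(sigDir, 'SignalVariableNames', 'signals');
sdsLbl = signalDatastore(lblDir, 'SignalVariableNames', 'labels');

% Split by file index, using the same indices for signals and labels.
numFiles = numel(sdsSig.Files);
idx      = randperm(numFiles);
numTrain = round(0.75 * numFiles);
trainIdx = idx(1:numTrain);
testIdx  = idx(numTrain+1:end);

dsTrain = combine(subset(sdsSig, trainIdx), subset(sdsLbl, trainIdx));
dsTest  = combine(subset(sdsSig, testIdx),  subset(sdsLbl, testIdx));

% Each combined read is {signalsCell, labelsVector}. Expand it into one row
% per signal, {1-by-T sequence, label}, the two-column layout that
% sequence-classification training expects.
toRows  = @(c) [cellfun(@(x) x(:).', c{1}(:), 'UniformOutput', false), ...
                num2cell(c{2}(:))];
dsTrain = transform(dsTrain, toRows);
dsTest  = transform(dsTest,  toRows);

% Reading pulls in a single file's worth of data, not the whole dataset.
batch = read(dsTrain);
reset(dsTrain);

% With layers and options defined (not shown), training would look like:
% net = trainNetwork(dsTrain, layers, options);
```

Here the split is done by calling subset() on each signalDatastore with the same indices before combining. That gives the same training and test sets as subsetting the combined datastore and only relies on subset() being available for signalDatastore.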
HTH
1 Comment
Ruohao Zhang
on 21 Aug 2020