
How to use tall arrays after using "write" to train machine learning models?

I'm generating some data as tables, converting them to a tall array, and then saving the tall array on my disk using the "write" function. I want to be able to use the tall array to train some machine learning models without loading the array into memory. Is that possible? What is the typical workflow for working with a tall array after writing it to the disk?
(I've looked through the documentation on tall arrays and haven't found anything on this question yet. All the examples I've found use a tall array that is already in the workspace.)

Answers (1)

Gayatri Rathod on 27 Apr 2023
Hi Rachel,
The typical workflow for working with a tall array after writing it to disk is as follows:
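1. Generate the data as tables and convert them to a tall array:
% Generate some data as a table
data = table(randn(10000,1), randn(10000,1), randn(10000,1));
% Convert the table to a tall array
tall_data = tall(data);
2. Save the tall array to disk using the write function. The first argument is the output folder, and write stores the data there as a set of files:
% Write the tall array to a folder on disk ('datafolder' is a placeholder name)
write('datafolder', tall_data);
3. Create a datastore that reads the written files from disk in chunks:
% Create a datastore that points at the folder of written files
tall_data_source = datastore('datafolder','Type','tall');
% Rebuild a tall array backed by the files on disk
tall_data = tall(tall_data_source);
4. Use the resulting tall array to train your machine learning models. Functions that support tall arrays, such as fitlm, process the data in blocks instead of loading it all into memory.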
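As a rough end-to-end sketch of the steps above, the following writes a small synthetic table to disk, reads it back as a tall table, and fits a linear model. The folder name, the variable names (x1, x2, y), and the choice of fitlm are assumptions made for illustration; any function that supports tall arrays could take its place, and fitlm requires Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical output folder used only for this illustration.
outDir = fullfile(tempdir, 'tall_demo_output');
if isfolder(outDir)
    rmdir(outDir, 's');   % start from a clean folder
end

% Synthetic data: the response y depends linearly on x1 and x2.
rng(0);
n  = 10000;
x1 = randn(n,1);
x2 = randn(n,1);
y  = 2*x1 - 3*x2 + 0.1*randn(n,1);

% Write the tall table to disk as a set of files in outDir.
write(outDir, tall(table(x1, x2, y)));

% Point a datastore at the written files and rebuild a tall table.
ds = datastore(outDir, 'Type', 'tall');
tt = tall(ds);

% Fit a linear model on the tall table; the last variable (y) is the
% response by default. The data is processed in blocks from disk.
mdl = fitlm(tt);
disp(mdl.Coefficients.Estimate)
```

In a later session only the datastore and tall calls are needed, since the files written by write stay on disk.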
In this example, we generate some data as tables, convert them to a tall array, save the tall array to disk, and then create a datastore from the saved output. Wrapping that datastore in tall gives a tall array that can be used to train machine learning models without loading all of the data into memory.
The datastore function creates a datastore, which is a repository for collections of data that are too large to fit in memory. A datastore allows you to read and process data stored in multiple files on a disk, a remote location, or a database as a single entity.
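To make the "single entity" idea concrete, here is a minimal sketch that writes a small tall table and then walks through the written files one chunk at a time with hasdata and read. The folder name and the id variable are hypothetical, and the chunk size is left at whatever the datastore chooses by default.

```matlab
% Hypothetical folder holding the files produced by write.
outDir = fullfile(tempdir, 'tall_chunk_demo');
if isfolder(outDir)
    rmdir(outDir, 's');   % start from a clean folder
end

% Write a 1000-row tall table to disk.
write(outDir, tall(table((1:1000)', 'VariableNames', {'id'})));

% Treat all written files as one data source.
ds = datastore(outDir, 'Type', 'tall');

% Read chunk by chunk; only one chunk is in memory at a time.
total = 0;
while hasdata(ds)
    chunk = read(ds);
    total = total + size(chunk, 1);   % count the rows seen so far
end
disp(total)
```

For model training, the manual loop is usually unnecessary: converting the datastore with tall lets functions that support tall arrays do the chunked reads themselves.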
You can read more about the table, randn, tall, write, and datastore functions on their documentation pages.
Hope it helps!

Release

R2022b
