Large Files and Big Data
Large data sets can take the form of files too large to fit in available memory, files that take a long time to process, or collections of numerous small files. There is no single approach to working with large data sets, so MATLAB® includes a number of tools for accessing and processing large data.
Begin by creating a datastore that can access small portions of the data at a time. You can use the datastore to manage incremental import of the data. To analyze the data using common MATLAB functions, such as
histogram, create a tall array on top of the datastore. For more complex problems, you can write a MapReduce algorithm that defines the chunking and reduction of the data.
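The datastore-plus-tall-array workflow described above can be sketched as follows. This is a minimal example, assuming a delimited text file named "mydata.csv" containing a numeric variable "ArrDelay"; both names are placeholders for your own data.

```matlab
% Create a datastore that reads the file in manageable chunks
% rather than loading it all at once.
ds = tabularTextDatastore("mydata.csv");

% Wrap the datastore in a tall array so that common MATLAB
% functions can operate on data that does not fit in memory.
t = tall(ds);

% Operations on tall arrays are deferred; histogram triggers
% evaluation and reads through the data in chunks.
h = histogram(t.ArrDelay);
```

Because tall arrays evaluate lazily, chained operations are combined and the underlying file is read as few times as possible; call `gather` when you need an in-memory result.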
- Datastore
Read large collections of data
- Tall Arrays
Arrays with more rows than fit in memory
- MapReduce
Programming technique for analyzing data sets that do not fit in memory
- Large MAT-Files
Access and change variables without loading into memory
- Parquet Files
Read and write Parquet files
- Memory Mapping
Map file data to memory for faster access
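Two of the topics above, large MAT-files and memory mapping, can be illustrated with a short sketch. The file names "results.mat" and "records.bin" and the variable "X" are placeholders, assuming a MAT-file saved in version 7.3 format and a binary file of doubles.

```matlab
% Large MAT-files: access one variable without loading the whole file.
m = matfile("results.mat", "Writable", true);
firstRows = m.X(1:100, :);   % reads only the requested rows from disk
m.X(1, 1) = 0;               % modifies the variable in place on disk

% Memory mapping: expose the bytes of a binary file as an array.
mm = memmapfile("records.bin", "Format", "double");
chunk = mm.Data(1:1000);     % accessed pages are mapped on demand
```

Indexing into a `matfile` object reads only the selected elements, so partial reads and writes stay within memory limits; `memmapfile` is faster for repeated random access to binary data because the operating system pages the file in as needed.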