The datastore function creates a datastore, which is a repository for collections of data that are too large to fit in memory. A datastore allows you to read and process data stored in multiple files on a disk, a remote location, or a database as a single entity. If the data is too large to fit in memory, you can manage the incremental import of data, create a tall array to work with the data, or use the datastore as an input to mapreduce for further processing. For more information, see Getting Started with Datastore.
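The incremental import described above can be sketched as a read loop. This is a minimal illustration, not a recommended workflow: it assumes a release that provides arrayDatastore (R2020b or later) and uses a small in-memory vector as a stand-in for data that would normally be too large to load at once. The variable names and the ReadSize value are illustrative choices.

```matlab
% Stand-in for a large data set: a column of 10 numbers held in memory.
data = (1:10)';

% Wrap the data in a datastore that returns up to 4 rows per read.
% 'OutputType','same' makes read return a numeric array instead of a cell.
ds = arrayDatastore(data, 'ReadSize', 4, 'OutputType', 'same');

% Process the data one chunk at a time, keeping only a running total.
total = 0;
while hasdata(ds)
    chunk = read(ds);            % next chunk of at most 4 rows
    total = total + sum(chunk);  % accumulate without holding all chunks
end

disp(total)
```

The same hasdata/read loop applies to file-based datastores; only the constructor changes.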
|Create datastore for large collections of data|
|Datastore for tabular text files|
|Datastore for spreadsheet files|
|Datastore for image data|
|Datastore for collection of Parquet files|
|Datastore with custom file reader|
|Datastore for in-memory data|
Read and Write from Datastore
Subset, Partition, or Shuffle Datastore
|Create subset of datastore or FileSet|
|Determine whether datastore is subsettable|
|Shuffle all data in datastore|
|Determine whether datastore is shuffleable|
|Number of datastore partitions|
|Partition a datastore|
|Determine whether datastore is partitionable|
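As a rough sketch of the operations listed in this table, the following uses the subset, shuffle, numpartitions, and partition functions on a small in-memory datastore. The function names are assumed from current MATLAB releases (the table here lists only descriptions), and arrayDatastore is assumed to be available (R2020b or later).

```matlab
% Small in-memory datastore used only for illustration.
data = (1:10)';
ds = arrayDatastore(data, 'OutputType', 'same');

% Subset: a new datastore containing only the first three rows.
firstThree = readall(subset(ds, 1:3));

% Shuffle: a new datastore with the same rows in random order.
shuffled = readall(shuffle(ds));

% Partition: split the datastore into n parts and read each part.
n = 2;
parts = cell(n, 1);
for k = 1:n
    parts{k} = readall(partition(ds, n, k));
end
rejoined = vertcat(parts{:});

% Default number of partitions suggested for this datastore.
disp(numpartitions(ds))
```

Each call returns a new datastore and leaves ds unchanged, so the operations can be chained or applied independently.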
Combine or Transform Datastores
|Datastore to combine data read from multiple underlying datastores|
|Sequentially read data from multiple underlying datastores|
|Datastore to transform underlying datastore|
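Combining and transforming can be sketched with the combine and transform functions. The function names are assumed from current MATLAB releases, and the two small in-memory datastores below are hypothetical stand-ins for real data sources.

```matlab
% Two datastores with the same number of rows.
dsX = arrayDatastore((1:3)', 'OutputType', 'same');
dsY = arrayDatastore((10:10:30)', 'OutputType', 'same');

% Combine: each read concatenates the data from dsX and dsY horizontally.
dsXY = combine(dsX, dsY);
pairs = readall(dsXY);

% Transform: apply a function to the data returned by each read.
dsZ = arrayDatastore((1:3)', 'OutputType', 'same');
dsSquared = transform(dsZ, @(x) x.^2);
squares = readall(dsSquared);

disp(pairs)
disp(squares)
```

Both functions return a new datastore that applies its work when data is read, so they remain usable when the underlying data does not fit in memory.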
Integrate with MapReduce and Tall Arrays
|Datastore for key-value pair data for use with mapreduce|
|Datastore for checkpointing tall arrays|
Develop Custom Datastore
|Base datastore class|
|Add parallelization support to datastore|
|Add subset and fine-grained parallelization support to datastore|
|Add Hadoop support to datastore|
|Add shuffling support to datastore|
|File-set object for collection of files in datastore|
|File-reader object for files in a datastore|
|Add file writing support to datastore|
|Add Folder property support to datastore|
|File-set for collection of files in datastore|
|Blocked file-set for collection of blocks within file|
- Getting Started with Datastore
A datastore is an object for reading a single file or a collection of files or data.
- Select Datastore for File Format or Application
Choose the right datastore based on the file format of your data or application.
- Read and Analyze Large Tabular Text File
This example shows how to create a datastore for a large text file containing tabular data, and then read and process the data one block at a time or one file at a time.
- Read and Analyze Image Files
This example shows how to create a datastore for a collection of images, read the image files, and find the images with the maximum average hue, saturation, and brightness (HSV).
- Read and Analyze MAT-File with Key-Value Data
This example shows how to create a datastore for key-value pair data in a MAT-file that is the output of mapreduce.
- Read and Analyze Hadoop Sequence File
This example shows how to create a datastore for a Sequence file containing key-value data.
- Work with Remote Data
Work with remote data in Amazon S3™, Azure® Blob Storage, or HDFS™.
- Set Up Datastore for Processing on Different Machines or Clusters
Set up a datastore on your machine that can be loaded and processed on another machine or cluster.
- Develop Custom Datastore
Create a fully customized datastore for your custom or proprietary data.
- Develop Custom Datastore for DICOM Data
This example shows how to develop a custom datastore that supports writing operations.
- Testing Guidelines for Custom Datastores
After implementing your custom datastore, follow this test procedure to qualify your custom datastore.