

Class: matlab.compiler.mlspark.SparkContext
Package: matlab.compiler.mlspark

Convert MATLAB datastore to a Spark RDD


Syntax

rdd = datastoreToRDD(sc,ds)


Description

rdd = datastoreToRDD(sc,ds) converts a MATLAB® datastore object ds to a Spark™ RDD.

Input Arguments

sc: SparkContext to use, specified as a SparkContext object.

ds: Datastore to convert to an RDD, specified as a MATLAB datastore object.

Output Arguments

rdd: Output RDD representing the converted datastore object, returned as an RDD object.


Examples

Convert a MATLAB datastore object to a Spark RDD.

% Set up Spark properties as a containers.Map object
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});

% Create SparkConf object
conf = matlab.compiler.mlspark.SparkConf(...
    'AppName'        , 'myApp', ...
    'Master'         , 'local[1]', ...
    'SparkProperties', sparkProp);

% Create a SparkContext
sc = matlab.compiler.mlspark.SparkContext(conf);

% Create a MATLAB datastore 
ds = datastore('airlinesmall.csv','TreatAsMissing','NA');

% Convert MATLAB datastore to Spark RDD 
rdd = datastoreToRDD(sc,ds);

% Alternative: call datastoreToRDD as a method of sc using dot notation
rdd = sc.datastoreToRDD(ds);
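
As a follow-up to the conversion above, the sketch below restricts the datastore to a few columns before converting it, counts the elements of the resulting RDD, and then shuts down the Spark connection. It is a minimal illustration rather than a recommended workflow. It assumes MATLAB Compiler and a Spark-enabled environment are available, and that the RDD count method and the SparkContext delete method behave as described in their own reference pages. The choice of the UniqueCarrier and ArrDelay columns is arbitrary. The layout of individual RDD elements is not described here, so the sketch only counts them.

```matlab
% Sketch only: requires MATLAB Compiler and a Spark-enabled environment.
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf( ...
    'AppName', 'myApp', ...
    'Master', 'local[1]', ...
    'SparkProperties', sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

% Read only the columns of interest (an arbitrary choice for illustration).
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA');
ds.SelectedVariableNames = {'UniqueCarrier', 'ArrDelay'};

% Convert the datastore to a Spark RDD.
rdd = sc.datastoreToRDD(ds);

% Count the elements in the RDD. The value depends on how the
% datastore contents are split into RDD elements.
numElements = rdd.count();
disp(numElements)

% Shut down the connection to Spark when finished.
delete(sc)
```

Selecting variables before the conversion is intended to keep unneeded columns out of the RDD; whether that matters in practice depends on the size of the data.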

Introduced in R2016b