parallelize

Class: matlab.compiler.mlspark.SparkContext
Package: matlab.compiler.mlspark

Create an RDD from a collection of local MATLAB values

Syntax

rdd = parallelize(sc,cellArray)
rdd = parallelize(sc,cellArray,numSlices)

Description

rdd = parallelize(sc,cellArray) creates an RDD from a collection of local MATLAB® values grouped as a cell array.

rdd = parallelize(sc,cellArray,numSlices) creates an RDD from cellArray and distributes its values across the number of partitions specified by numSlices.

Input Arguments

sc - The SparkContext to use, specified as a SparkContext object.

cellArray - A collection of values, specified as a MATLAB cell array.

Data Types: cell

numSlices - Number of partitions to create, specified as a scalar.

Data Types: double

Output Arguments

rdd - An output RDD, created from the collection of values and returned as an RDD object.

Examples

Create an RDD from local MATLAB values.

%% Connect to Spark
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

%% Create an RDD from a local cell array and count its elements
x = sc.parallelize({1, 2, 3, 4, 5});
y = x.count()
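
The three-argument syntax can be sketched in the same way. This is a minimal illustration that assumes the SparkContext sc created above is still active; the variable names xParts, n, and p are hypothetical. It also assumes the RDD class provides a getNumPartitions method, which is not described on this page.

```matlab
%% parallelize with an explicit partition count
% Assumes sc is the SparkContext created in the example above.
% Request that the five values be split across 2 partitions.
xParts = sc.parallelize({1, 2, 3, 4, 5}, 2);

% The partition count changes how the data is distributed,
% not how many elements the RDD holds.
n = xParts.count()

% Assumption: RDD objects expose getNumPartitions for inspecting
% how many partitions were actually created.
p = xParts.getNumPartitions()
```

Choosing numSlices is a trade-off: more partitions allow more parallel tasks, but each partition adds scheduling overhead, so a very small collection like this one gains little from being split.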

Introduced in R2016b