reduceByKey

Class: matlab.compiler.mlspark.RDD
Namespace: matlab.compiler.mlspark

Merge the values for each key using an associative reduce function

Syntax

result = reduceByKey(obj,func,numPartitions)

Description

result = reduceByKey(obj,func,numPartitions) merges the values for each key in obj using an associative reduce function func. numPartitions specifies the number of partitions to create in the resulting RDD.
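The merge described above can be illustrated locally, without a Spark connection. The following is a minimal sketch of the per-key merging semantics only, assuming the input is a cell array of {key, value} pairs; the names pairs, func, and merged are hypothetical and not part of the RDD API, and numPartitions has no counterpart in this local version.

```matlab
% Local sketch of reduceByKey semantics (no Spark required).
% pairs, func, and merged are illustrative names, not part of the RDD API.
pairs = {{'A',1}, {'B',1}, {'C',1}, {'A',1}, {'B',1}};
func  = @(x,y) x + y;                 % associative reduce function

merged = containers.Map('KeyType','char','ValueType','any');
for i = 1:numel(pairs)
    key   = pairs{i}{1};
    value = pairs{i}{2};
    if isKey(merged, key)
        % Key seen before: fold the new value into the running result.
        merged(key) = func(merged(key), value);
    else
        % First value for this key is stored as-is.
        merged(key) = value;
    end
end

% Display each key with its merged value.
k = keys(merged);
for i = 1:numel(k)
    fprintf('%s: %d\n', k{i}, merged(k{i}));
end
```

With these inputs the sketch prints A: 2, B: 2, and C: 1. On a real RDD the same merging is distributed across partitions, so the order in which values for a key are combined is not fixed.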

Input Arguments

obj: An input RDD, specified as an RDD object.

func: Associative function used to merge the values that share a key in the input RDD, specified as a function handle. The function takes two values and returns a single merged value.

Data Types: function_handle
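Because values for a key can be combined in different groupings (for example, within each partition first and then across partitions), the result is only well defined when func is associative. The sketch below compares an associative function with a non-associative one under two groupings; the names add and sub are hypothetical and chosen for illustration only.

```matlab
% Compare two groupings of the same values under each function.
add = @(x,y) x + y;   % associative
sub = @(x,y) x - y;   % not associative

% Grouping 1: func(func(1,2),3)   Grouping 2: func(1,func(2,3))
addLeft  = add(add(1,2),3);
addRight = add(1,add(2,3));
subLeft  = sub(sub(1,2),3);
subRight = sub(1,sub(2,3));

fprintf('add: %d vs %d\n', addLeft, addRight);
fprintf('sub: %d vs %d\n', subLeft, subRight);
```

Addition gives 6 for both groupings, while subtraction gives -4 for one and 2 for the other, so a function like sub would produce results that depend on how the data happens to be partitioned.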

numPartitions: Number of partitions to create in the resulting RDD, specified as a scalar value.

Data Types: double

Output Arguments

result: A pipelined RDD containing the values reduced by key, returned as an RDD object.

Examples

%% Connect to Spark
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
                        'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

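%% reduceByKey
% Distribute the five keys across 2 partitions.
inputRDD = sc.parallelize({'A','B','C','A','B'},2);
% Pair each key with a count of 1, then sum the counts per key into 3 partitions.
redRDD = inputRDD.map(@(x)({x,1})).reduceByKey(@(x,y)(x+y),3);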
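
To inspect the result, the reduced RDD can be brought back to the MATLAB workspace with the collect method of the RDD class. This is a hedged continuation of the example, not a standalone script: it assumes that sc and redRDD already exist and that each collected element is a {key, value} cell, as produced by the map step. The variable name viewRes is illustrative.

```matlab
% Continues from the example above; requires the existing redRDD.
viewRes = redRDD.collect();   % assumed: cell array of {key, value} cells

% Print each pair; the order of the pairs is not guaranteed.
for i = 1:numel(viewRes)
    fprintf('%s: %d\n', viewRes{i}{1}, viewRes{i}{2});
end
```

Each key should appear once with its summed count: 2 for 'A', 2 for 'B', and 1 for 'C'.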

Version History

Introduced in R2016b