MATLAB Hadoop and Spark

MATLAB® provides capabilities for processing big data that scale from a single workstation to compute clusters. These include accessing data in the Hadoop Distributed File System (HDFS) and running algorithms on Apache Spark.

With MATLAB, you can:

  • Access data from HDFS to explore, visualize, and prototype analytics on your local workstation
  • Analyze data, create accurate predictive models, and run MATLAB algorithms where your data lives using Hadoop and Spark
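As a rough illustration of the first point, the sketch below previews a few rows of data stored in HDFS from a local MATLAB session. The install path, host name, port, and file pattern are placeholders invented for this example, not values from this page, and the code assumes a reachable Hadoop cluster.

```matlab
% Assumption: Hadoop is installed locally at this placeholder path.
setenv('HADOOP_HOME', '/path/to/hadoop/install');

% Hypothetical HDFS location; replace host, port, and path with your own.
location = 'hdfs://myserver:7867/data/airline/*.csv';

% Create a datastore over the files; no data is loaded yet.
ds = datastore(location);

% Read only the first few rows to explore the data locally.
sample = preview(ds);
disp(sample)
```

Because a datastore reads data in chunks, the same object can serve both quick exploration on a workstation and larger analyses later.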

Use MATLAB with data in HDFS and on Spark

Tall arrays let you apply MATLAB algorithms to big data, both on your local workstation and on Hadoop with Spark, using the familiar MATLAB language.

You can manipulate and clean your data and perform machine learning, regression, and various statistical analyses.
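A minimal sketch of a tall array workflow follows. It writes a tiny CSV file to a temporary folder so that it is self-contained; the file name, variable names, and values are made up for illustration. In practice the datastore would point at a much larger data set, possibly in HDFS.

```matlab
% Create a small example file (hypothetical data, for illustration only).
T = table([1; 2; NaN; 4], [10; 20; 30; 40], 'VariableNames', {'x', 'y'});
file = fullfile(tempdir, 'tall_demo.csv');
writetable(T, file);

% Run tall array calculations in the local MATLAB session.
mapreducer(0);

% Wrap the file in a datastore and convert it to a tall table.
ds = tabularTextDatastore(file);
tt = tall(ds);

% Clean the data: drop rows that contain missing values.
tt = rmmissing(tt);

% Operations on tall arrays are deferred until gather is called.
meanY = mean(tt.y);
meanY = gather(meanY);   % triggers the actual computation

% To run the same code on Spark instead, the execution environment could be
% switched along these lines (assumes Parallel Computing Toolbox, MATLAB
% Parallel Server, and a configured Hadoop cluster):
%   cluster = parallel.cluster.Hadoop;
%   mapreducer(cluster);
```

The analysis code itself does not change between the local session and the cluster; only the `mapreducer` setting does. Statistics and machine learning functions that accept tall inputs, such as `fitlm` in Statistics and Machine Learning Toolbox, can be used in the same way.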

MATLAB is certified for use with the Cloudera platform.

MATLAB Tall Arrays in Action