Regression Learner

Train regression models to predict data using supervised machine learning

Description

The Regression Learner app trains regression models to predict data. Using this app, you can explore your data, select features, specify validation schemes, train models and optimize hyperparameters, assess results, and investigate how specific predictors contribute to model predictions. Perform automated training to search for the best regression model type, including linear regression models, regression trees, Gaussian process regression models, support vector machines, efficiently trained linear regression models, kernel approximation models, ensembles of regression trees, and neural network regression models. To compare models, use the metric results table and view results plots in the app.

Perform supervised machine learning by supplying a known set of observations of input data (predictors) and known responses. Use the observations to train a model that generates predicted responses for new input data. You can then check model performance using a test data set. To understand how the model uses predictors to make predictions, use global and local interpretability tools, such as partial dependence plots, LIME values, and Shapley values.
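The app performs these steps interactively. As a rough command-line analogue, the following sketch trains a model on known observations and checks it against a held-out test set. The carsmall sample data set, the choice of a regression tree (fitrtree), and the 20% test fraction are illustrative assumptions, not choices the app prescribes.

```matlab
% Minimal sketch of the supervised workflow: train on known responses,
% then evaluate on data the model has not seen.
load carsmall                                   % sample data shipped with the toolbox
tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);
tbl = rmmissing(tbl);                           % drop observations with missing values

rng("default")                                  % make the random split reproducible
c = cvpartition(height(tbl),"Holdout",0.2);     % reserve 20% of the rows for testing
trainTbl = tbl(training(c),:);
testTbl  = tbl(test(c),:);

mdl = fitrtree(trainTbl,"MPG");                 % predictors are all variables except MPG
predictedMPG = predict(mdl,testTbl);            % predicted responses for the test rows

% Root mean squared error on the test set
testRMSE = sqrt(mean((testTbl.MPG - predictedMPG).^2));
fprintf("Test RMSE: %.2f\n",testRMSE)
```

A lower test RMSE indicates predictions closer to the known responses. The app reports comparable metrics for each trained model.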

To use the trained model with new data, you can export the model to the workspace, Simulink®, and MATLAB® Production Server™. You can generate MATLAB code to recreate the trained model outside of the app and explore programmatic regression and further customization of the model training workflow. Export the model training code to Experiment Manager to perform additional tasks, such as changing the training data, adjusting hyperparameter search ranges, and running custom training experiments.

Tip

To get started, in the Models section of the Learn tab, try All Quick-To-Train to train a selection of models. See Automated Regression Model Training.

Required Products

  • MATLAB

  • Statistics and Machine Learning Toolbox™

Open the Regression Learner App

  • MATLAB Toolstrip: On the Apps tab, under Machine Learning and Deep Learning, click the app icon.

  • MATLAB command prompt: Enter regressionLearner.

Programmatic Use

regressionLearner opens the Regression Learner app or brings focus to the app if it is already open.

regressionLearner(Tbl,ResponseVarName) opens the Regression Learner app and populates the New Session from Arguments dialog box with the data contained in the table Tbl. The ResponseVarName argument, specified as a character vector or string scalar, is the name of the variable in Tbl that contains the response values. The remaining variables in Tbl are the predictor variables.
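For example, the following sketch builds a table from the carsmall sample data set and passes it to the app with MPG as the response. The data set and the variable name cartable are assumptions chosen for illustration.

```matlab
% Build a table whose variables are the predictors plus the response MPG.
load carsmall
cartable = table(Acceleration,Cylinders,Displacement, ...
    Horsepower,Model_Year,Weight,Origin,MPG);

% Open the app with MPG preselected as the response;
% the remaining table variables are treated as predictors.
regressionLearner(cartable,"MPG")
```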

regressionLearner(Tbl,Y) opens the Regression Learner app and populates the New Session from Arguments dialog box with the predictor variables in the table Tbl and the response values in the numeric vector Y.

regressionLearner(X,Y) opens the Regression Learner app and populates the New Session from Arguments dialog box with the n-by-p predictor matrix X and the n response values in the vector Y. Each row of X corresponds to one observation, and each column corresponds to one variable. The length of Y and the number of rows of X must be equal.
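As a sketch of the two syntaxes that take a separate response vector Y, the following code passes the same predictors either as a numeric matrix or as a table. The carsmall sample data and the useTable switch are illustrative assumptions; the switch only keeps the example from opening two session dialogs in a row.

```matlab
load carsmall
Y = MPG;                                          % n-by-1 response vector

% Numeric predictors as an n-by-p matrix: one row per observation,
% one column per variable.
X = [Acceleration Displacement Horsepower Weight];

% The number of rows of X must equal the length of Y.
assert(size(X,1) == numel(Y),"X and Y must have the same number of observations.")

useTable = false;                                 % hypothetical switch for this example
if useTable
    % Same predictors in a table; the response stays in the separate vector Y.
    predictorTbl = table(Acceleration,Displacement,Horsepower,Weight);
    regressionLearner(predictorTbl,Y)
else
    regressionLearner(X,Y)
end
```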

regressionLearner(___,Name,Value) specifies cross-validation options using one or more of the following name-value arguments in addition to any of the input argument combinations in the previous syntaxes. For example, you can specify KFold=10 to use a 10-fold cross-validation scheme.

  • CrossVal, specified as "on" (default) or "off", is the cross-validation flag. If you specify "on", the app uses 5-fold cross-validation. If you specify "off", the app uses resubstitution validation.

    You can override the CrossVal cross-validation setting by using the Holdout or KFold name-value argument. You can specify only one of these three arguments at a time.

  • Holdout, specified as a numeric scalar in the range [0.05,0.5], is the fraction of the training data set used for holdout validation. The training data set is the data in Tbl or X that is not set aside for testing.

  • KFold, specified as a positive integer in the range [2,50], is the number of folds to use for cross-validation.

  • TestDataFraction, specified as a numeric scalar in the range [0,0.5], is the fraction of the data in Tbl or X that is set aside for testing.

  • ValidationPartition, specified as a cvpartition object, defines the validation scheme and the indexing for the validation sets. The indices in the object correspond to the rows in Tbl or X. When you specify ValidationPartition, you cannot also specify CrossVal, Holdout, or KFold.

    For more information about validation schemes, see Select Validation Scheme in Classification Learner or Regression Learner.

  • TestPartition, specified as a cvpartition object, defines the indexing for the rows in Tbl or X to be set aside for testing. When you specify TestPartition:

    • The Type property of the cvpartition object must be 'holdout'.

    • You cannot specify TestDataFraction.

    • You can specify only one of these four arguments: CrossVal, Holdout, KFold, or ValidationPartition.

    For more information about test data sets, see Test Trained Models in Classification Learner or Regression Learner.
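The following sketch shows one way to pass each kind of validation or test option described above. The carsmall sample data, the specific fractions and fold counts, and the scheme selector variable are assumptions for illustration; scheme is not an app option and exists only so that a single call runs at a time.

```matlab
load carsmall
cartable = table(Acceleration,Displacement,Horsepower,Weight,MPG);
n = height(cartable);                 % number of observations (rows)

scheme = "kfold";                     % hypothetical selector for this example
switch scheme
    case "kfold"
        % 10-fold cross-validation, with 20% of the data set aside for testing
        regressionLearner(cartable,"MPG",KFold=10,TestDataFraction=0.2)
    case "holdout"
        % Holdout validation on 25% of the training data
        regressionLearner(cartable,"MPG",Holdout=0.25)
    case "testpartition"
        % Explicit test rows; the partition type must be 'holdout'
        rng("default")                % reproducible partition
        testPart = cvpartition(n,"Holdout",0.2);
        regressionLearner(cartable,"MPG",TestPartition=testPart)
    case "validationpartition"
        % Explicit validation folds, indexed against the rows of cartable
        rng("default")
        valPart = cvpartition(n,"KFold",5);
        regressionLearner(cartable,"MPG",ValidationPartition=valPart)
end
```

Passing a cvpartition object is useful when you want the app to reuse exactly the same split as other code, because the partition fixes which rows fall into each set.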

regressionLearner(filename) opens the Regression Learner app with the previously saved session in filename. The filename argument, specified as a character vector or string scalar, must include the name of a Regression Learner session file and the path to the file, if it is not in the current folder. The file must have the extension .mat.
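As a small sketch, the following code reopens a saved session only if the file exists. The file name myRegressionSession.mat is hypothetical; replace it with the name and path of a session file you saved from the app.

```matlab
% Hypothetical session file in the current folder; include the full path
% if the file is stored elsewhere.
sessionFile = fullfile(pwd,"myRegressionSession.mat");

if isfile(sessionFile)
    regressionLearner(sessionFile)    % reopen the previously saved session
else
    warning("Session file not found: %s",sessionFile)
end
```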

Limitations

  • Regression Learner does not support model deployment to MATLAB Production Server in MATLAB Online™.

Version History

Introduced in R2017a
