
crossval

Class: RegressionSVM

Cross-validated support vector machine regression model

Syntax

CVMdl = crossval(mdl)
CVMdl = crossval(mdl,Name,Value)

Description

CVMdl = crossval(mdl) returns a cross-validated (partitioned) support vector machine regression model, CVMdl, from a trained SVM regression model, mdl.

CVMdl = crossval(mdl,Name,Value) returns a cross-validated model with additional options specified by one or more Name,Value pair arguments.

Input Arguments


mdl — Full, trained SVM regression model
RegressionSVM model

Full, trained SVM regression model, specified as a RegressionSVM model returned by fitrsvm.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

CVPartition — Cross-validation partition
cvpartition object

Cross-validation partition, specified as the comma-separated pair consisting of 'CVPartition' and a cvpartition partition object created by cvpartition. The partition object specifies the type of cross-validation and the indexing for the training and validation sets.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,'KFold',5). Then, you can specify the cross-validated model by using 'CVPartition',cvp.
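
The following minimal sketch shows how the partition object flows into crossval. The synthetic predictors X, response y, and model mdl here are hypothetical, not taken from the examples on this page.

rng(1)                                   % for reproducibility
X = randn(500,3);                        % 500 observations, 3 predictors
y = X*[1;-2;0.5] + 0.1*randn(500,1);     % synthetic response
mdl = fitrsvm(X,y);                      % trained RegressionSVM model
cvp = cvpartition(500,'KFold',5);        % random 5-fold partition
CVMdl = crossval(mdl,'CVPartition',cvp); % cross-validate using cvp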

Holdout — Fraction of data for holdout validation
scalar value in the range (0,1)

Fraction of the data used for holdout validation, specified as the comma-separated pair consisting of 'Holdout' and a scalar value in the range (0,1). If you specify 'Holdout',p, then the software completes these steps:

  1. Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.

  2. Store the compact, trained model in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Holdout',0.1

Data Types: double | single
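
Continuing the synthetic-data sketch above (mdl is the hypothetical trained RegressionSVM model), holdout validation leaves a single compact model in the Trained property:

CVMdl = crossval(mdl,'Holdout',0.1); % reserve 10% of the data for validation
compactMdl = CVMdl.Trained{1};       % the one compact model, trained on the other 90%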

KFold — Number of folds
positive integer greater than 1

Number of folds to use in a cross-validated model, specified as the comma-separated pair consisting of 'KFold' and a positive integer value greater than 1. If you specify 'KFold',k, then the software completes these steps:

  1. Randomly partition the data into k sets.

  2. For each set, reserve the set as validation data, and train the model using the other k – 1 sets.

  3. Store the k compact, trained models in the cells of a k-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'KFold',5

Data Types: single | double
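
Continuing the same sketch, k-fold cross-validation stores one compact model per fold:

CVMdl = crossval(mdl,'KFold',5); % 5-fold cross-validation
numel(CVMdl.Trained)             % returns 5, one compact model per fold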

Leaveout — Leave-one-out cross-validation flag
'off' | 'on'

Leave-one-out cross-validation flag, specified as the comma-separated pair consisting of 'Leaveout' and 'on' or 'off'. If you specify 'Leaveout','on', then, for each of the n observations (where n is the number of observations, excluding missing observations, stored in the NumObservations property of the model), the software completes these steps:

  1. Reserve the observation as validation data, and train the model using the other n – 1 observations.

  2. Store the n compact, trained models in the cells of an n-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Leaveout','on'
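
Continuing the same sketch, leave-one-out cross-validation trains one compact model per observation, so it can be slow for large data sets:

CVMdl = crossval(mdl,'Leaveout','on'); % leave-one-out cross-validation
numel(CVMdl.Trained)                   % returns mdl.NumObservations (500 here)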

Output Arguments


CVMdl — Cross-validated SVM regression model
RegressionPartitionedSVM model

Cross-validated SVM regression model, returned as a RegressionPartitionedSVM model.

Examples


This example shows how to train a cross-validated SVM regression model using crossval.

This example uses the abalone data from the UCI Machine Learning Repository. Download the data and save it in your current folder with the name 'abalone.data'. Read the data into a table.

tbl = readtable('abalone.data','Filetype','text','ReadVariableNames',false);
rng default  % for reproducibility

The sample data contains 4177 observations. All the predictor variables are continuous except for sex, which is a categorical variable with the possible values 'M' (male), 'F' (female), and 'I' (infant). The goal is to predict the number of rings on the abalone shell, and thereby estimate its age, from physical measurements.

Train an SVM regression model using a Gaussian kernel function and a kernel scale of 2.2. Standardize the data.

mdl = fitrsvm(tbl,'Var9','KernelFunction','gaussian','KernelScale',2.2,'Standardize',true);

mdl is a trained RegressionSVM regression model.

Cross-validate the model using 10-fold cross-validation.

CVMdl = crossval(mdl)
CVMdl = 

  classreg.learning.partition.RegressionPartitionedSVM
      CrossValidatedModel: 'SVM'
           PredictorNames: {1x8 cell}
    CategoricalPredictors: 1
             ResponseName: 'Var9'
          NumObservations: 4177
                    KFold: 10
                Partition: [1x1 cvpartition]
        ResponseTransform: 'none'


  Properties, Methods

CVMdl is a RegressionPartitionedSVM cross-validated regression model. The software:

1. Randomly partitions the data into 10 equally sized sets.

2. Trains an SVM regression model on nine of the 10 sets.

3. Repeats step 2 a total of k = 10 times, leaving out a different set each time and training on the other nine.

4. Combines the generalization statistics from the 10 folds.

Calculate the cross-validation loss, that is, the average mean squared error over the validation folds.

loss = kfoldLoss(CVMdl)
loss =

    4.5712
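
To inspect the out-of-fold predictions behind this loss, you can call kfoldPredict, the companion method of the cross-validated model. This follow-on sketch recomputes the mean squared error by hand; the result should match kfoldLoss, assuming no observations were excluded and default weights.

yhat = kfoldPredict(CVMdl);       % out-of-fold predicted responses
mse = mean((tbl.Var9 - yhat).^2)  % should match the kfoldLoss value above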

This example shows how to specify a holdout proportion for training a cross-validated SVM regression model.

This example uses the abalone data from the UCI Machine Learning Repository. Download the data and save it in your current folder with the name 'abalone.data'. Read the data into a table.

tbl = readtable('abalone.data','Filetype','text','ReadVariableNames',false);
rng default  % for reproducibility

The sample data contains 4177 observations. All the predictor variables are continuous except for sex, which is a categorical variable with the possible values 'M' (male), 'F' (female), and 'I' (infant). The goal is to predict the number of rings on the abalone shell, and thereby estimate its age, from physical measurements.

Train an SVM regression model using a Gaussian kernel function and an automatic kernel scale. Standardize the data.

mdl = fitrsvm(tbl,'Var9','KernelFunction','gaussian','KernelScale','auto','Standardize',true);

mdl is a trained RegressionSVM regression model.

Cross-validate the regression model by specifying a 10% holdout sample.

CVMdl = crossval(mdl,'Holdout',0.1)
CVMdl = 

  classreg.learning.partition.RegressionPartitionedSVM
      CrossValidatedModel: 'SVM'
           PredictorNames: {1x8 cell}
    CategoricalPredictors: 1
             ResponseName: 'Var9'
          NumObservations: 4177
                    KFold: 1
                Partition: [1x1 cvpartition]
        ResponseTransform: 'none'


  Properties, Methods

CVMdl is a RegressionPartitionedSVM model object.

Calculate the validation loss, that is, the mean squared error on the 10% holdout sample.

loss = kfoldLoss(CVMdl)
loss =

    5.2499

Alternatives

Instead of training an SVM regression model and then cross-validating it, you can create a cross-validated model directly by using fitrsvm and specifying any of these name-value pair arguments: 'CrossVal', 'CVPartition', 'Holdout', 'Leaveout', or 'KFold'.
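
For instance, this sketch mirrors the first example on this page in a single call; the exact loss depends on the random partition.

CVMdl = fitrsvm(tbl,'Var9','KernelFunction','gaussian','KernelScale',2.2, ...
    'Standardize',true,'KFold',10);
loss = kfoldLoss(CVMdl);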

References

[1] Nash, W.J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait, Sea Fisheries Division, Technical Report No. 48, 1994.

[2] Waugh, S. Extending and benchmarking Cascade-Correlation, Ph.D. thesis, Computer Science Department, University of Tasmania, 1995.

[3] Clark, D., Z. Schreter, and A. Adams. A Quantitative Comparison of Dystal and Backpropagation. Submitted to the Australian Conference on Neural Networks, 1996.

[4] Lichman, M. UCI Machine Learning Repository, [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Introduced in R2015b