crossvalidate
Syntax
[cvloss,output1,...,outputN,foldPipelines] = crossvalidate(pipeline,input1,...,inputN,ReturnFoldPipelines=returnFoldPipelines)
[___] = crossvalidate(___,Name=Value)
Description
[cvloss,output1,...,outputN,foldPipelines] = crossvalidate(pipeline,input1,...,inputN,ReturnFoldPipelines=returnFoldPipelines) also returns the pipelines learned during k-fold cross-validation.
[___] = crossvalidate(___,Name=Value) returns any of the output combinations in the previous syntaxes with additional options specified by one or more name-value arguments. For example, Holdout=0.2 specifies to perform holdout validation with a test set fraction of 0.2.
Examples
Create a pipeline with four components to impute missing observations, normalize data, perform principal component analysis, and perform ECOC classification.
impute = observationImputerComponent;
normalize = normalizerComponent;
pca = pcaComponent(NumComponents=3);
ecoc = classificationECOCComponent;
pipe = series(impute,normalize,pca,ecoc);
Load the carbig data set. Store the acceleration, displacement,
and horsepower data as predictor data in the table X. Update the
response variable Origin to categorize the cars based on whether they
were made in the USA, and store this variable in the table Y.
load carbig
X = table(Acceleration,Displacement,Horsepower);
Origin = categorical(cellstr(Origin));
Origin = mergecats(Origin,["France","Japan","Germany", ...
    "Sweden","Italy","England"],"NotUSA");
Y = table(Origin);
Cross-validate the pipeline using ten-fold cross-validation.
rng("default")
cvLoss = crossvalidate(pipe,X,Y,KFold=10)
cvLoss = 0.1232
Input Arguments
Pipeline to cross-validate, specified as a LearningPipeline. pipeline must contain one of the
following supervised learning components.
Classification Model Components
| Component | Purpose |
|---|---|
| classificationDiscriminantComponent | Discriminant analysis classification |
| classificationECOCComponent | Multiclass classification using error-correcting output codes (ECOC) model |
| classificationEnsembleComponent | Ensemble classification |
| classificationGAMComponent | Binary classification using generalized additive model (GAM) |
| classificationKernelComponent | Classification using Gaussian kernel with random feature expansion |
| classificationKNNComponent | Classification using k-nearest neighbor model |
| classificationLinearComponent | Binary classification of high-dimensional data using a linear model |
| classificationNaiveBayesComponent | Multiclass classification using a naive Bayes model |
| classificationNeuralNetworkComponent | Classification using a neural network model |
| classificationSVMComponent | One-class and binary classification using a support vector machine (SVM) classifier |
| classificationTreeComponent | Decision tree classifier |
Regression Model Components
| Component | Purpose |
|---|---|
| regressionEnsembleComponent | Ensemble regression |
| regressionGAMComponent | Regression using generalized additive model (GAM) |
| regressionGPComponent | Gaussian process regression |
| regressionKernelComponent | Kernel regression using explicit feature expansion |
| regressionLinearComponent | Linear regression |
| regressionNeuralNetworkComponent | Neural network regression |
| regressionSVMComponent | Regression using a support vector machine (SVM) |
| regressionTreeComponent | Decision tree regression |
Input data required by pipeline,
specified as a table. Input data can be predictor data, response values, observation
weights, and so on. The order of the inputs 1, …, N must match the
order of the pipeline inputs, as listed in the Inputs property. You
can see the identity and the order of the inputs using
pipeline.Inputs.
Data Types: table
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN, where Name is
the argument name and Value is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Example: crossvalidate(pipe,X,Y,KFold=10) specifies to cross-validate
the pipeline using 10-fold cross-validation.
Number of folds for k-fold cross-validation, specified as a positive integer scalar.
When you specify a KFold value of k,
crossvalidate randomly partitions the data into
k sets. For each set, the function reserves that set as test
data and learns the pipeline using the other
k-1 sets. crossvalidate then
runs the k learned pipelines with the corresponding test sets, and uses the
results to compute cvloss.
You can specify only one of these three name-value arguments: Holdout,
KFold, and Partition.
Example: KFold=10
Data Types: single | double
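For example, a sketch reusing the pipe, X, and Y variables created in the example above:

```matlab
% Five-fold cross-validation of the example pipeline.
% pipe, X, and Y are as created in the example above.
rng("default")  % for reproducibility of the random partition
cvLoss = crossvalidate(pipe,X,Y,KFold=5);
```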
Cross-validation partition, specified as a cvpartition object. The partition object specifies the type of
cross-validation and the indexing for the training and test sets.
You cannot specify both Partition and Stratify.
Instead, directly specify a stratified partition when you create the partition
object.
You can specify only one of these three name-value arguments: Holdout,
KFold, and
Partition.
Example: Partition=cvpartition(X,Holdout=0.2)
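As a sketch (reusing pipe, X, and Y from the example above), you can create the partition object first and then pass it to crossvalidate:

```matlab
% Create a holdout partition over the observations, then
% cross-validate with it. To stratify, create a stratified
% cvpartition here instead; Partition cannot be combined
% with Stratify.
c = cvpartition(height(X),Holdout=0.2);
cvLoss = crossvalidate(pipe,X,Y,Partition=c);
```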
Fraction of data for holdout validation, specified as a scalar value in the range
(0,1).
crossvalidate randomly selects and reserves the proportion of
observations specified by Holdout as test data, then learns the
pipeline using the remaining data. Finally, the function uses the test data along with
the learned pipeline to compute cvloss.
You can specify only one of these three name-value arguments:
Holdout, KFold, and
Partition.
Example: Holdout=0.1
Data Types: single | double
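For example (assuming pipe, X, and Y as in the example above), the following reserves 20% of the observations for testing:

```matlab
% Holdout validation: learn the pipeline on 80% of the data
% and compute the loss on the remaining 20%.
rng("default")  % for reproducibility of the random split
cvLoss = crossvalidate(pipe,X,Y,Holdout=0.2);
```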
Loss function, specified as a function handle or one of the values in this table.
| Value | Description |
|---|---|
| "binodeviance" | Binomial deviance |
| "classifcost" | Observed misclassification cost |
| "classiferror" | Misclassified rate in decimal |
| "crossentropy" | Cross-entropy loss |
| "epsiloninsensitive" | Epsilon-insensitive loss |
| "exponential" | Exponential loss |
| "hinge" | Hinge loss |
| "logit" | Logistic loss |
| "mincost" | Minimal expected misclassification cost (for classification scores that are posterior probabilities) |
| "mse" | Mean squared error |
| "quadratic" | Quadratic loss |
To specify a custom loss function, use function handle notation.
If pipeline
contains a classification component and you specify a value for the
Prior property of that component,
crossvalidate normalizes the observation weights used to
compute loss so that they sum to the corresponding prior class probability. Otherwise,
crossvalidate does not normalize observation weights.
LossFun must be a value accepted by the
LossFun property of the supervised learning component in
pipeline. By default, crossvalidate uses
the loss function specified in the LossFun property of the
supervised learning component in pipeline.
Example: LossFun="classiferror"
Data Types: char | string | function_handle
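For example (a sketch assuming pipe, X, and Y from the example above), you can override the component's default loss with one of the built-in values:

```matlab
% Cross-validate and report the misclassification rate rather
% than the loss function stored in the component's LossFun
% property.
rng("default")
cvLoss = crossvalidate(pipe,X,Y,KFold=10,LossFun="classiferror");
```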
Indicator for stratification, specified as 1
(true) or 0 (false).
Stratification is supported only when pipeline
contains a classification learning component. When Stratify is
true, each cross-validation set maintains the same proportion of
classes as the original data set.
The default value is true if pipeline
contains a classification component, and false if
pipeline contains a regression component.
Example: Stratify=false
Data Types: logical
Input tag of the data used for stratification, specified as a positive numeric
scalar. crossvalidate uses the data specified by
StratificationInput to divide the data into a stratified
partition.
Stratification is supported only when pipeline
contains a classification learning component.
Example: StratificationInput=3
Data Types: single | double
Indicator to return the pipelines learned during k-fold cross-validation,
specified as 0 (false) or 1
(true).
If ReturnFoldPipelines is true,
crossvalidate returns the learned pipelines as foldPipelines.
Example: ReturnFoldPipelines=true
Data Types: logical
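For example, a sketch assuming pipe, X, and Y from the example above, and that the pipeline returns a single output of predicted labels:

```matlab
% Return the fold pipelines along with the loss and output data.
rng("default")
[cvLoss,labels,foldPipelines] = crossvalidate(pipe,X,Y, ...
    KFold=10,ReturnFoldPipelines=true);
% foldPipelines is a cell array of LearningPipeline objects,
% one per fold; each can be run on new data, for example:
% newLabels = run(foldPipelines{1},Xnew);
```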
Output Arguments
Cross-validation loss, returned as a numeric scalar. The function determines
cvloss by computing the aggregate loss from all test data in the
partition.
For k-fold cross-validation, crossvalidate combines the test
data from each fold to compute cvloss. For holdout
cross-validation, crossvalidate uses the test set to compute
cvloss.
Output data computed by pipeline
based on the input data, returned as separate variables.
Pipelines learned during k-fold cross-validation, returned as a cell array of
LearningPipeline objects.
Version History
Introduced in R2026a
See Also
LearningPipeline | learn | run