kfoldMargin

Classification margins for cross-validated kernel classification model

Description

margin = kfoldMargin(CVMdl) returns the classification margins obtained by the cross-validated, binary kernel model (ClassificationPartitionedKernel) CVMdl. For every fold, kfoldMargin computes the classification margins for validation-fold observations using a model trained on training-fold observations.

Examples

Load the ionosphere data set. This data set has 34 predictors and 351 binary responses for radar returns, which are labeled as either bad ('b') or good ('g').

load ionosphere

Cross-validate a binary kernel classification model using the data.

CVMdl = fitckernel(X,Y,'Crossval','on')
CVMdl = 
  ClassificationPartitionedKernel
    CrossValidatedModel: 'Kernel'
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 10
              Partition: [1×1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'


CVMdl is a ClassificationPartitionedKernel model. By default, the software implements 10-fold cross-validation. To specify a different number of folds, use the 'KFold' name-value pair argument instead of 'Crossval'.
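As a minimal sketch of the 'KFold' option mentioned above, the following code cross-validates the same data with five folds instead of the default ten. The variable names CVMdl5, numFolds, and m5, and the choice of five folds, are illustrative assumptions and not part of the original example.

```matlab
% Illustrative sketch: request 5-fold instead of the default 10-fold cross-validation.
load ionosphere                        % provides X (351-by-34) and Y (351-by-1)
rng(1)                                 % for reproducibility
CVMdl5 = fitckernel(X,Y,'KFold',5);    % CVMdl5 is a hypothetical variable name
numFolds = CVMdl5.KFold                % number of folds recorded in the model
m5 = kfoldMargin(CVMdl5);              % one margin per observation
```

Changing the number of folds changes how the observations are split, not the size of the output: kfoldMargin still returns one margin per observation.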

Estimate the classification margins for validation-fold observations.

m = kfoldMargin(CVMdl);
size(m)
ans = 1×2

   351     1

m is a 351-by-1 vector. m(j) is the classification margin for observation j.
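To make the meaning of m(j) concrete: for a binary model, the classification margin of an observation is the classification score for its true class minus the score for the false class. The sketch below recomputes the margins from the cross-validated scores returned by kfoldPredict and compares them with the output of kfoldMargin. It assumes that the columns of the score matrix follow the order of the ClassNames property. The names CVMdlChk, mChk, trueCol, trueScore, falseScore, mManual, and maxDiff are illustrative.

```matlab
% Illustrative check: margin = true-class score minus false-class score.
load ionosphere
rng(1)                                           % for reproducibility
CVMdlChk = fitckernel(X,Y,'CrossVal','on');      % hypothetical variable name
mChk = kfoldMargin(CVMdlChk);
[~,score] = kfoldPredict(CVMdlChk);              % n-by-2 matrix of validation-fold scores

% Column of score that corresponds to each observation's true class
% (assumes the columns follow the order of CVMdlChk.ClassNames)
[~,trueCol] = ismember(Y,CVMdlChk.ClassNames);
rows = (1:numel(Y))';
trueScore  = score(sub2ind(size(score),rows,trueCol));
falseScore = score(sub2ind(size(score),rows,3 - trueCol));   % the other column

mManual = trueScore - falseScore;
maxDiff = max(abs(mChk - mManual))               % expected to be at or near zero
```

A positive margin means that the true class received the higher score, so the observation was classified correctly; a negative margin indicates a misclassification.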

Plot the k-fold margins using a box plot.

boxplot(m,'Labels','All Observations')
title('Distribution of Margins')

[Figure: box plot titled "Distribution of Margins" for all observations.]

Perform feature selection by comparing k-fold margins from multiple models. Based solely on this criterion, the classifier with the greatest margins is the best classifier.

Load the ionosphere data set. This data set has 34 predictors and 351 binary responses for radar returns, which are labeled as either bad ('b') or good ('g').

load ionosphere

Randomly choose 10% of the predictor variables.

rng(1); % For reproducibility
p = size(X,2); % Number of predictors
idxPart = randsample(p,ceil(0.1*p));

Cross-validate two binary kernel classification models: one that uses all of the predictors, and one that uses 10% of the predictors.

CVMdl = fitckernel(X,Y,'CrossVal','on');
PCVMdl = fitckernel(X(:,idxPart),Y,'CrossVal','on');

CVMdl and PCVMdl are ClassificationPartitionedKernel models. By default, the software implements 10-fold cross-validation. To specify a different number of folds, use the 'KFold' name-value pair argument instead of 'Crossval'.

Estimate the k-fold margins for each classifier.

fullMargins = kfoldMargin(CVMdl);
partMargins = kfoldMargin(PCVMdl);

Plot the distribution of the margin sets using box plots.

boxplot([fullMargins partMargins], ...
    'Labels',{'All Predictors','10% of the Predictors'});
title('Distribution of Margins')

[Figure: side-by-side box plots titled "Distribution of Margins", one for each of the two models.]

The quartiles of the PCVMdl margin distribution are situated higher than the quartiles of the CVMdl margin distribution, which indicates that, by this criterion, the PCVMdl model is the better classifier.
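The visual comparison can also be checked numerically. The following sketch repeats the setup of this example so that it runs on its own, and then computes the quartiles of both margin sets with quantile. The variable q is an illustrative name. Because the fitted models depend on the random number generator state, the exact values can vary, so no output is shown.

```matlab
% Illustrative sketch: compare the margin quartiles of the two models numerically.
load ionosphere
rng(1)                                               % for reproducibility
p = size(X,2);                                       % number of predictors
idxPart = randsample(p,ceil(0.1*p));                 % random 10% of the predictors
CVMdl = fitckernel(X,Y,'CrossVal','on');
PCVMdl = fitckernel(X(:,idxPart),Y,'CrossVal','on');
fullMargins = kfoldMargin(CVMdl);
partMargins = kfoldMargin(PCVMdl);

% Rows: 25th, 50th, and 75th percentiles
% Columns: all predictors, 10% of the predictors
q = quantile([fullMargins partMargins],[0.25 0.5 0.75])
```

If every entry in the second column of q exceeds the corresponding entry in the first column, the quartiles agree with the reading of the box plots.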

Input Arguments

CVMdl is the cross-validated, binary kernel classification model, specified as a ClassificationPartitionedKernel model object. You can create a ClassificationPartitionedKernel model by using fitckernel and specifying any one of the cross-validation name-value pair arguments.

To obtain estimates, kfoldMargin uses the same data that was used to cross-validate the kernel classification model (X and Y).
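For instance, one of the cross-validation name-value pair arguments of fitckernel is 'CVPartition', which accepts a cvpartition object. The following is a minimal sketch, assuming the ionosphere data from the examples and an illustrative stratified 5-fold partition. The names c, CVMdlPart, marginPart, and isColumnOfN are hypothetical.

```matlab
% Illustrative sketch: create the input model from an explicit partition.
load ionosphere
rng(1)                                          % for reproducibility
c = cvpartition(Y,'KFold',5);                   % stratified 5-fold partition of the labels
CVMdlPart = fitckernel(X,Y,'CVPartition',c);    % returns a ClassificationPartitionedKernel model
marginPart = kfoldMargin(CVMdlPart);            % no data arguments are passed here

% The output has one row per observation in the cross-validated data
isColumnOfN = isequal(size(marginPart),[CVMdlPart.NumObservations 1])
```

Note that kfoldMargin takes only the model as input in this sketch; the data is not passed again.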

Output Arguments

Classification margins, returned as a numeric vector. margin is an n-by-1 vector, where each row is the margin of the corresponding observation and n is the number of observations (size(CVMdl.Y,1)).

More About

Extended Capabilities

Version History

Introduced in R2018b
