MATLAB Answers

MATLAB: Error using classreg.learning.FitTemplate/fit with hyperparameter optimization of SVM

Nick Zadeh on 25 Oct 2018
Commented: Nick Zadeh on 26 Oct 2018
I am using Bayesian optimization (the bayesopt function) in MATLAB for hyperparameter optimization of an SVM classifier. The optimization goal is to minimize the 10-fold cross-validation error. Here is the code that I use:
KernelFlag = 1;
c = cvpartition(size(XTrain,1),'KFold',10);
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');
box = optimizableVariable('box',[1e-5,1e5],'Transform','log');
polyOrder = optimizableVariable('polyOrder',[2,4]);
fun = @(z)mysvmfunTest(z,XTrain,yTrain,c,classNames,KernelFlag);
results = bayesopt(fun,[sigma,box,polyOrder],'IsObjectiveDeterministic',true,...
    'PlotFcn',{@plotMinObjective},...
    'AcquisitionFunctionName','expected-improvement-plus');
and mysvmfunTest:
function [objective] = mysvmfunTest(z,X,Y,c,classNames,KernelFlag)
if KernelFlag == 1
    t = templateSVM('Standardize',1,'KernelFunction','RBF',...
        'BoxConstraint',z.box,'KernelScale',z.sigma,'RemoveDuplicates',true);
elseif KernelFlag == 2
    t = templateSVM('Standardize',1,'KernelFunction','polynomial',...
        'BoxConstraint',z.box,'KernelScale',z.sigma,'PolynomialOrder',z.polyOrder,...
        'RemoveDuplicates',true);
else
    t = templateSVM('Standardize',1,'KernelFunction','linear',...
        'BoxConstraint',z.box,'KernelScale',z.sigma,...
        'RemoveDuplicates',true);
end
SVMModel = fitcecoc(X,Y,'Learners',t,'ClassNames',classNames);
cvModel = crossval(SVMModel,'CVPartition',c);
objective = kfoldLoss(cvModel);
I have used this code before with different datasets. Lately, however, when I try to use it on a new dataset, it throws an error:
Error using classreg.learning.FitTemplate/fit (line 249)
You passed a cvpartition object for 27152 observations, but the input data have only 10395 observations. Some observations may have been removed because they have NaN values for all predictors, missing response values or zero weights. When cross-validating an existing object, consider using the RowsUsed property to determine what size partition is required.
I checked all the data; there are no NaN or missing values. I even removed all samples that have any feature between 0 and 0.01 (all my features are positive). I still get the same error. My guess is that some samples are too close together, which results in many observations being removed, but I am not sure that is the case. Any idea where this error might come from, or any suggestion for how I can solve it?


Accepted Answer

Ilya on 26 Oct 2018
You are passing ClassNames to fitcecoc - are your ClassNames a subset of all class names you have in yTrain?
Train one ECOC model using
SVMModel = fitcecoc(XTrain,yTrain,'Learners',t,'ClassNames',classNames);
and look at the size of property X in SVMModel. Does it have as many rows as XTrain does?
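The check suggested above can be sketched as follows. This is a minimal illustration on synthetic data, not the asker's dataset: the values of XTrain, yTrain, and classNames are hypothetical stand-ins, and the default SVM learners are used instead of the template t. The last step, sizing the partition from the rows stored in the model, is one possible workaround and an assumption on my part, not something confirmed in this thread.

```matlab
% Hypothetical stand-ins for XTrain, yTrain, and classNames.
rng(0);                                  % reproducible synthetic data
XTrain = rand(400,4);
yTrain = randi(4,400,1);                 % four classes: 1, 2, 3, 4
classNames = [1 2 3];                    % deliberately omits class 4

% Count the rows whose label appears in classNames.
inList = ismember(yTrain,classNames);
fprintf('%d of %d rows have a label listed in classNames\n', ...
    nnz(inList),numel(yTrain));

% Train once (default SVM learners) and look at the stored row count.
Mdl = fitcecoc(XTrain,yTrain,'ClassNames',classNames);
nStored = size(Mdl.X,1);
fprintf('Rows stored in the model: %d\n',nStored);

% Possible workaround (assumption): size the partition from the stored data.
c = cvpartition(nStored,'KFold',10);
cvMdl = crossval(Mdl,'CVPartition',c);
fprintf('10-fold loss: %.3f\n',kfoldLoss(cvMdl));
```

If the first count is smaller than the number of rows in XTrain, the partition built from size(XTrain,1) is larger than the data the model actually holds, which matches the wording of the error message.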

More Answers (1)

Don Mathis on 26 Oct 2018
Edited: Don Mathis on 26 Oct 2018
Maybe your use of 'RemoveDuplicates' is causing observations to be removed?
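A quick way to see whether a predictor matrix contains exact duplicate rows at all is sketched below. The small matrix is hypothetical data chosen for illustration; on real data, XTrain would be the actual training matrix. The sketch only counts exact duplicates and says nothing about rows that are merely close to each other.

```matlab
% Hypothetical predictor matrix with repeated rows.
XTrain = [1 2; 3 4; 1 2; 5 6; 3 4];

% Index of the first occurrence of each distinct row, in original order.
[~,firstIdx] = unique(XTrain,'rows','stable');

% Every row beyond the first occurrence of a pattern is an exact duplicate.
nDuplicates = size(XTrain,1) - numel(firstIdx);
fprintf('%d of %d rows are exact duplicates of an earlier row\n', ...
    nDuplicates,size(XTrain,1));
```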
I ran your code on some synthetic data that has no duplicates in XTrain and it works fine:
XTrain = rand(1000,10);
yTrain = categorical(round(XTrain(:,1)*3));
classNames = categories(yTrain);
KernelFlag = 1;
c = cvpartition(size(XTrain,1),'KFold',10);
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');
box = optimizableVariable('box',[1e-5,1e5],'Transform','log');
polyOrder = optimizableVariable('polyOrder',[2,4]);
fun = @(z)mysvmfunTest(z,XTrain,yTrain,c,classNames,KernelFlag);
results = bayesopt(fun,[sigma,box,polyOrder],'IsObjectiveDeterministic',true,...
    'PlotFcn',{@plotMinObjective},...
    'AcquisitionFunctionName','expected-improvement-plus');
function [objective] = mysvmfunTest(z,X,Y,c,classNames,KernelFlag)
if KernelFlag == 1
    t = templateSVM('Standardize',1,'KernelFunction','RBF',...
        'BoxConstraint',z.box,'KernelScale',z.sigma,'RemoveDuplicates',true);
elseif KernelFlag == 2
    t = templateSVM('Standardize',1,'KernelFunction','polynomial',...
        'BoxConstraint',z.box,'KernelScale',z.sigma,'PolynomialOrder',z.polyOrder,...
        'RemoveDuplicates',true);
else
    t = templateSVM('Standardize',1,'KernelFunction','linear',...
        'BoxConstraint',z.box,'KernelScale',z.sigma,...
        'RemoveDuplicates',true);
end
SVMModel = fitcecoc(X,Y,'Learners',t,'ClassNames',classNames);
cvModel = crossval(SVMModel,'CVPartition',c);
objective = kfoldLoss(cvModel);
end
By the way, it's probably best to declare polyOrder to be an integer:
polyOrder = optimizableVariable('polyOrder',[2,4],'Type','integer');
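The difference between the two declarations can be inspected directly. The sketch below assumes the Type property of optimizableVariable and uses the illustrative names vReal and vInt, which do not appear in the original code. The templateSVM documentation describes PolynomialOrder as a positive integer, which is why restricting the search to integers is the safer declaration.

```matlab
% Without 'Type', the variable is real-valued, so bayesopt may propose
% non-integer polynomial orders between 2 and 4.
vReal = optimizableVariable('polyOrder',[2,4]);

% With 'Type','integer', the search is restricted to 2, 3, and 4.
vInt = optimizableVariable('polyOrder',[2,4],'Type','integer');

fprintf('Default type: %s, declared type: %s\n',vReal.Type,vInt.Type);
```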

  1 Comment

Nick Zadeh
Nick Zadeh on 26 Oct 2018
Thank you for your response, Don! I tried removing 'RemoveDuplicates',true, but I still get the same error message. As I mentioned before, I have used this code on different datasets and never got an error. Do you have any idea what might cause the problem in this case? Is there any part of this code that automatically removes samples that are too close? Or do you know of any existing bug?
