Calculating loss when cvpartition has been used within HyperparameterOptimizationOptions in fitcnb

Question

Richard on 1 Dec 2023

https://it.mathworks.com/matlabcentral/answers/2055074-calculating-loss-when-cvpartition-has-been-used-within-hyperparameteroptimizationoptions-in-fitcnb

Answered: Avadhoot on 21 Dec 2023

% Basic set up with cross validation
Mdl=fitcnb(featuresTrain,targetTrain)
CVMdl=crossval(Mdl)
l=kfoldLoss(CVMdl) % kfoldLoss needs the cross-validated model, not Mdl
% Now, I set up stratification for cross validation
c = cvpartition(targetTrain,"KFold",10)
% Default Naive Bayes, with stratification
Mdl1_0=fitcnb(featuresTrain,targetTrain,...
    'CVPartition',c)
loss1_0=kfoldLoss(Mdl1_0) % all OK. I get an answer
% output: 0.4ish
% Now, I optimize
Mdl1_2=fitcnb(featuresTrain,targetTrain,...
    'OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',struct(...
    'CVPartition',c,...
    'AcquisitionFunctionName','expected-improvement-plus')) % for reproducibility
loss1_2 = kfoldLoss(Mdl1_2) % Error: Incorrect number or types of inputs or outputs for function kfoldLoss.
loss1_2_ = loss(Mdl1_2,featuresTrain,targetTrain) % Works, but the answer is considerably smaller than I expected
% output: 0.3ish
% Now, I test
loss1 = loss(Mdl1_2,featuresTest,targetTest)
% output: back to 0.4ish

I am attempting to stratify my cross validation using cvpartition.

This is fine at first. I use kfoldLoss and get a reasonable answer.

However, I then try to optimize and use cvpartition within HyperparameterOptimizationOptions. Now I am unable to use kfoldLoss() (error above). Is this because the output Mdl1_2 is just one model that has already been optimised with cross validation, whereas Mdl1_0 is a cross-validated model with essentially 10 outputs?

Assuming this might be the case, I use loss() instead, but I get a value a lot lower than I expected, as demonstrated by the loss() on my test data going back up again.

Maybe I've done it all right and the difference simply arises because the training set has been stratified but the test set doesn't necessarily have the same distribution? My data set is quite small (670 items) and these results came from an 85:15 train:test split.

Thank you.

Answer 1

Avadhoot on 21 Dec 2023

https://it.mathworks.com/matlabcentral/answers/2055074-calculating-loss-when-cvpartition-has-been-used-within-hyperparameteroptimizationoptions-in-fitcnb#answer_1375252

Hi Richard,

I understand that you are encountering an error when trying to calculate the k-fold cross-validated loss for your optimized model. Your approach is correct. The error arises because "Mdl1_2" is a "ClassificationNaiveBayes" object, while the "kfoldLoss" function requires a "ClassificationPartitionedModel" object, such as the one "crossval" returns. You could obtain such an object by cross-validating the fitted model manually after the fitting step, but there is a more straightforward method:

During hyperparameter tuning, the cross-validated loss is calculated automatically for each candidate, and the results are stored in the "HyperparameterOptimizationResults" property of the model. You can retrieve the minimum observed value using the following line of code:

cvLoss = Mdl1_2.HyperparameterOptimizationResults.MinObjective;

Once you have the cross-validated loss, you can proceed to calculate the loss on the test set using the "loss" function.
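
To make the difference between these quantities concrete, here is a minimal sketch that puts them side by side. It is not Richard's actual setup: the Fisher iris data stands in for his features and targets, the seed and the 85:15 hold-out split are assumptions for illustration, and the variable names "cvLossTuning", "cvLossManual", "trainLoss" and "testLoss" are hypothetical. It also shows the manual route mentioned above, in which "crossval" turns the tuned model into a partitioned model that "kfoldLoss" accepts.

```matlab
% Stand-in data (assumption): Fisher's iris measurements replace the
% original featuresTrain/targetTrain/featuresTest/targetTest.
load fisheriris
rng(1)                                     % repeatable split and folds
h = cvpartition(species,"HoldOut",0.15);   % stratified 85:15 split
featuresTrain = meas(training(h),:);
targetTrain   = species(training(h));
featuresTest  = meas(test(h),:);
targetTest    = species(test(h));

c = cvpartition(targetTrain,"KFold",10);   % stratified 10-fold partition

% Tune on the partition c; plots and console output are switched off.
Mdl1_2 = fitcnb(featuresTrain,targetTrain, ...
    'OptimizeHyperparameters','auto', ...
    'HyperparameterOptimizationOptions',struct( ...
        'CVPartition',c, ...
        'AcquisitionFunctionName','expected-improvement-plus', ...
        'ShowPlots',false,'Verbose',0));

% 1) Minimum cross-validated loss observed during tuning.
cvLossTuning = Mdl1_2.HyperparameterOptimizationResults.MinObjective;

% 2) Manual route: cross-validate the tuned model on the same folds.
CVMdl1_2     = crossval(Mdl1_2,'CVPartition',c);
cvLossManual = kfoldLoss(CVMdl1_2);

% 3) Resubstitution loss: evaluated on the data the model was trained on.
trainLoss = loss(Mdl1_2,featuresTrain,targetTrain);

% 4) Loss on the held-out test set.
testLoss = loss(Mdl1_2,featuresTest,targetTest);

fprintf('CV (tuning): %.3f\nCV (manual): %.3f\nTrain: %.3f\nTest: %.3f\n', ...
    cvLossTuning,cvLossManual,trainLoss,testLoss);
```

The two cross-validated values should be close, though I would not assume they are identical. The training-set value from "loss" is a resubstitution estimate, computed on data the model has already seen, so it tends to be optimistic. That would be consistent with it coming out lower than both the cross-validated loss and the test loss in the question.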

For additional details on the "kfoldLoss", "fitcnb", and "crossval" functions, refer to the following documentation links:

"kfoldLoss" function: https://www.mathworks.com/help/stats/classreg.learning.partition.regressionpartitionedmodel.kfoldloss.html
"fitcnb" function: https://www.mathworks.com/help/stats/fitcnb.html
"crossval" function: https://www.mathworks.com/help/stats/crossval.html

I hope this helps.
