Why is loss() different from calculating misclassification error using predict()?
I am trying to fit an ECOC model to my data, but the misclassification error returned by loss() differs from the one I get by comparing the labels predicted by predict() with the true labels. The same thing happens with a different model, e.g. KNN.
The test dataset has 10 observations, so as far as I know the misclassification error should be a multiple of 0.1, yet loss() outputs 0.8293.
Could someone please help me understand why these are different, i.e. what is going on inside the loss() function? And which of the two is more appropriate for evaluating and reporting test set accuracy?
rng(1234)
% define variables
xtrain = rand(100,4); % random numbers, n = 100
xtest = rand(10,4); % random numbers, n = 10
ytrain = ceil(4*rand(100,1)); % 4 classes, n = 100
ytest = ceil(4*rand(10,1)); % 4 classes, n = 10
% train model
mdl1 = fitcecoc(xtrain,ytrain,'Coding','onevsall','Learners','svm');
mdl2 = fitcknn(xtrain,ytrain);
% calculate loss from loss()
loss1mdl1 = loss(mdl1,xtest,ytest);
loss1mdl2 = loss(mdl2,xtest,ytest);
% calculate loss from predict()
loss2mdl1 = 1-mean(predict(mdl1,xtest)==ytest);
loss2mdl2 = 1-mean(predict(mdl2,xtest)==ytest);
Answers (2)
Sulaymon Eshkabilov
on 29 Jun 2023
There is a small difference between how loss() and predict() arrive at the error. loss() computes the loss value using a weight for each observation, whereas comparing the output of predict() with the true labels counts every observation equally. Otherwise, everything is working as expected:
rng(1234)
% define variables
xtrain = rand(100,4); % random numbers, n = 100
xtest = rand(10,4); % random numbers, n = 10
ytrain = ceil(4*rand(100,1)); % 4 classes, n = 100
ytest = ceil(4*rand(10,1)); % 4 classes, n = 10
% train model
mdl1 = fitcecoc(xtrain,ytrain,'Coding','onevsall','Learners','svm');
mdl2 = fitcknn(xtrain,ytrain);
% calculate loss from loss()
loss1mdl1 = loss(mdl1,xtest,ytest)
loss1mdl2 = loss(mdl2,xtest,ytest)
Y1 = predict(mdl1,xtest);
Y2 = predict(mdl2,xtest);
YC1 = [ytest,Y1] % Two correct answers out of 10, i.e., accuracy is 20%
YC2 = [ytest,Y2] % Three correct answers out of 10, i.e., accuracy 30%
% calculate loss from predict()
loss2mdl1 = 1-mean(predict(mdl1,xtest)==ytest)
loss2mdl2 = 1-mean(predict(mdl2,xtest)==ytest)
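To see how observation weights alone can move the error away from a multiple of 0.1, here is a minimal sketch using plain arrays. The labels, predictions and priors below are made up for illustration, and the reweighting (each class's weights rescaled to sum to an assumed class prior) is only an approximation of what loss() does, not the toolbox's internal code.

```matlab
% Hypothetical labels and priors, chosen only to illustrate the arithmetic.
ytrue = [1 1 1 1 1 1 2 2 3 4]';   % 10 test labels, class 1 over-represented
ypred = [1 1 1 1 1 1 1 1 1 1]';   % a classifier that always predicts class 1
prior = [0.25 0.25 0.25 0.25];    % assumed class priors (not taken from a model)

% Plain misclassification rate: every observation has weight 1/n.
plainErr = mean(ypred ~= ytrue);

% Prior-weighted rate: within class k, weights are rescaled to sum to prior(k).
% This assumes every class appears at least once in the test labels.
w = zeros(size(ytrue));
for k = 1:numel(prior)
    inClass = (ytrue == k);
    w(inClass) = prior(k) / nnz(inClass);
end
weightedErr = sum(w .* (ypred ~= ytrue));
```

Here the plain rate is 4/10, while the weighted rate is 0.75, because the three classes that are always misclassified each carry a quarter of the total weight regardless of how few test observations they have.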
Drew
on 7 May 2025
This is because the classreg loss function normalizes the observation weights so that, within each class, they sum to that class's prior probability. This can be avoided by providing a custom loss function, as shown in this answer: https://www.mathworks.com/matlabcentral/answers/492062-loss-the-classification-error
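A custom loss function of that kind might look like the sketch below, shown for the KNN model from the question. It relies on the documented 'LossFun' signature lossfun(C,S,W,Cost); the name unweightedError is just an illustrative choice, and the handle simply ignores the weights W and the cost matrix Cost.

```matlab
rng(1234)
xtrain = rand(100,4);  ytrain = ceil(4*rand(100,1));
xtest  = rand(10,4);   ytest  = ceil(4*rand(10,1));
mdl = fitcknn(xtrain,ytrain);

% C: n-by-K logical matrix marking the true class of each row.
% S: n-by-K matrix of class scores; the highest score marks the predicted class.
% W and Cost are accepted but deliberately unused, so every row counts equally.
% Note: a row with tied top scores counts as correct if the true class is among them.
unweightedError = @(C,S,W,Cost) mean(~any(C & (S == max(S,[],2)), 2));

plainLoss  = loss(mdl,xtest,ytest,'LossFun',unweightedError);
manualLoss = 1 - mean(predict(mdl,xtest) == ytest);
```

With 10 test observations, plainLoss is a multiple of 0.1 by construction, and it should agree with manualLoss unless score ties are broken differently by predict().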