What do the "scores" mean that result from application of a model from the Classification Learner App to new data?

The scores really don't seem to make any sense. The class selected doesn't correspond to either the highest or the lowest score in the resulting scores matrix, so I am wondering what they mean in some substantive sense.

Answers (1)

I'll assume you are following the workflow here: https://www.mathworks.com/help/stats/export-classification-model-for-use-with-new-data.html and running something like
[yfit,scores] = C.predictFcn(T)
The precise meaning of the scores depends on the type of classifier that you have trained. For most classifier types, the predicted class corresponds to the class with the highest score. However, the KNN, Naive Bayes, and Discriminant classifiers can behave differently when non-default misclassification costs are provided. For example, for a KNN classifier, the scores are posterior probabilities that do not yet take misclassification costs into account. The predicted class is then the class with the minimum expected cost, rather than the class with the maximum posterior probability score. See https://www.mathworks.com/help/stats/classificationknn.predict.html , which indicates that the KNN predict function provides a third output containing the expected cost after the misclassification costs are taken into account:
% Example of predict function for KNN, with third output, the expected cost
[label,score,cost] = predict(mdl,X)
So, in the case of a KNN classifier, you will want to add a third output to your predict function in order to look at the expected costs.
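To make the difference between scores and expected costs concrete, here is a minimal sketch. It assumes the fisheriris sample data and a KNN model trained directly with fitcknn rather than exported from the app; the neighbor count and the cost values are made up purely for illustration.

```matlab
% Minimal sketch: KNN with non-default misclassification costs.
% The data set, neighbor count, and cost values are illustrative assumptions.
load fisheriris                      % provides meas (150x4) and species (150x1 cell)

classNames = {'setosa','versicolor','virginica'};
% costMatrix(i,j) = cost of predicting class j when the true class is i.
% Here, misclassifying a true 'virginica' is made ten times more expensive.
costMatrix = [0 1 1; 1 0 1; 10 10 0];

mdl = fitcknn(meas, species, 'NumNeighbors', 15, ...
    'ClassNames', classNames, 'Cost', costMatrix);

% Third output: expected misclassification cost per class
[label, score, cost] = predict(mdl, meas);

% Class with the highest posterior score vs. class with the lowest expected cost
[~, idxMaxScore] = max(score, [], 2);
[~, idxMinCost]  = min(cost,  [], 2);

agreeScore = mean(strcmp(label, mdl.ClassNames(idxMaxScore)));
agreeCost  = mean(strcmp(label, mdl.ClassNames(idxMinCost)));
fprintf('label matches max score: %.3f, label matches min cost: %.3f\n', ...
    agreeScore, agreeCost);
```

If the two fractions differ, the rows where the label disagrees with the highest score are the ones where the cost matrix changed the decision.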
If this doesn't answer your question, perhaps you can provide more details such as the classifier type that you trained and an example result.
If this answer helps you, please remember to accept the answer.

4 Comments

Thanks for the response, but the results I am getting are still not making much sense. First, a simple question, what is the order that the "scores" are given as? They come out as a simple numeric array so I am not sure if the columns correspond to the classes in the order in which they were presented in the input table or put into alphabetical order... neither seems to work (i.e., from your response, the highest numeric value in the scores table should correspond to the predicted response, but while that is sometimes the case, it is not always the case).
I tried several classifiers and got the best results with the Wide Neural Network, Cubic SVM, and Linear Discriminant Analysis. I don't want to put down the full 159 row response from all three here, but I'll attach a reformatted table of the first 20 from the Cubic SVM with column 1 = samples, column 2 = predicted response, 3-10 = scores. I'll leave the first row as what I inserted with the columns labelled in the order as they were presented in the input table (the training table for the classifier).
In the CSV file that you sent, the predicted class of the Cubic SVM (column 2) is also the class with the highest score, provided that the class names in your header row are sorted alphabetically before they are matched to the score columns (columns 3 through the end). In other words, the order of the class names shown in the first row of the attached CSV file has to be ignored.
t = readtable("partial_SVM_results.csv");
% Sort the provided class-name headers alphabetically
classNames = sort(t.Properties.VariableDescriptions(3:end));
for i = 1:size(t,1)
    % Column index of the highest score in row i
    [~,idx] = max(t{i,3:end});
    % Print: row index, predicted class (column 2), class with the highest score
    fprintf("%d,%s,%s\n", i, string(t{i,2}), classNames{idx});
end
This prints the following triples, which represent: (index, predicted class from column 2, predicted class from the highest score with the column order interpreted alphabetically)
1,Hematitic,Hematitic
2,Hematitic,Hematitic
3,Marker Band,Marker Band
4,Foreign Stones,Foreign Stones
5,Foreign Stones,Foreign Stones
6,Above MB,Above MB
7,Foreign Stones,Foreign Stones
8,Hematitic,Hematitic
9,Hematitic,Hematitic
10,Below MB,Below MB
11,Marker Band,Marker Band
12,Below MB,Below MB
13,Below MB,Below MB
14,Below MB,Below MB
15,Below MB,Below MB
16,MB Tailings,MB Tailings
17,Foreign Stones,Foreign Stones
18,Hematitic,Hematitic
19,Hematitic,Hematitic
That works if the class titles for the scores are rearranged into a more-or-less alphabetical order rather than the order in which they were input. I.e., for this example, if they're arranged: Above MB, Below MB, Foreign Stones, Hematitic, MB Tailings, Marker Band, Meteorites, Veins Bright. Is that what is expected? For the results to come out in a rearranged order?
The order of the output scores is indicated in the ClassNames property of the model.
For example, for a CubicSVM trained on fisheriris in Classification Learner, then exported as "trainedModel", here is the ClassNames property:
>> trainedModel.ClassificationSVM.ClassNames
ans =
3×1 cell array
{'setosa' }
{'versicolor'}
{'virginica' }
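As a rough illustration of using ClassNames to label the score columns, the sketch below trains a model in code instead of exporting it from the app. It assumes that a cubic polynomial-kernel SVM inside an ECOC model is a reasonable stand-in for the app's Cubic SVM preset; the variable names are hypothetical.

```matlab
% Minimal sketch: label the score columns using the model's ClassNames property.
load fisheriris                      % provides meas and species

% Assumption: cubic-kernel SVM learners in an ECOC model approximate "Cubic SVM"
svmTemplate = templateSVM('KernelFunction', 'polynomial', ...
    'PolynomialOrder', 3, 'Standardize', true);
mdl = fitcecoc(meas, species, 'Learners', svmTemplate);

[label, scores] = predict(mdl, meas);

% Column k of scores belongs to mdl.ClassNames{k}
scoreTable = array2table(scores, 'VariableNames', mdl.ClassNames');
scoreTable.Predicted = label;
disp(head(scoreTable, 5))

% Check how often the top-scoring column matches the predicted label
[~, idx] = max(scores, [], 2);
labelFromScores = mdl.ClassNames(idx);
fprintf('Fraction of rows where the top score matches the label: %.3f\n', ...
    mean(strcmp(label, labelFromScores)));
```

For a model exported from the app, the same idea should apply with the scores returned by trainedModel.predictFcn(T) and the class names read from the exported model object, for example trainedModel.ClassificationSVM.ClassNames as shown above.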


Release

R2022a

Asked:

1 Oct 2023

Commented:

6 Oct 2023
