ROC curve - how to automatically find the most suitable threshold?
I have a ROC curve for my data and would like to find the most suitable threshold for classification. The threshold should be located where the false positive rate and the true positive rate balance each other. From the interpretation of the ROC curve, I know that I should choose a threshold close to the upper-left corner. Is there a way to find this threshold automatically?
To find the best threshold, you first need to define what you mean by "best". Specifically, you need a function that determines the cost of each type of error. In some applications, a false positive is much more costly than a false negative. In other applications, the opposite is true.
After you figure that "cost function" out, then you minimize the cost along your ROC curve.
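As a rough illustration of minimizing a cost along the curve, here is a minimal sketch. The ROC coordinates, the costs (costFP, costFN) and the class counts (nPos, nNeg) are made-up values for illustration only; in practice X, Y and T would come from perfcurve, and the cost function would be whatever fits your application.

```matlab
% Hypothetical ROC coordinates; in practice take X, Y, T from perfcurve.
X = [0 0   0.1 0.2 0.5 1];   % false positive rate
Y = [0 0.4 0.7 0.9 1   1];   % true positive rate
T = [1 0.9 0.7 0.5 0.3 0];   % score threshold at each point

% Assumed costs and class counts (illustrative values, not from the thread).
costFP = 1;    % cost of one false positive
costFN = 5;    % cost of one false negative
nPos   = 100;  % number of positive observations
nNeg   = 100;  % number of negative observations

% Expected total cost at every point on the curve:
% false positives = nNeg*FPR, false negatives = nPos*(1-TPR).
totalCost = costFP*nNeg*X + costFN*nPos*(1 - Y);

% Pick the point with the lowest cost and read off its threshold.
[minCost, idx] = min(totalCost);
bestThreshold  = T(idx);
```

Changing the ratio of costFP to costFN moves the selected point along the curve, which is why the "best" threshold depends on the application.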
When you say you have the curve, I assume you have the (X,Y) coordinates of the curve, for example as output by the perfcurve function.
X = false positive rate, and 1-Y = false negative rate.
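So, you can do
[minErrDiff,minIdx] = min(abs(X-(1-Y)));
to find the point on the curve where the two error rates are closest to being balanced. If you also requested the threshold output T from perfcurve, T(minIdx) is the corresponding threshold.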
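Since the question also mentions the upper-left corner, here is a small sketch that compares the "balanced" criterion with the "closest to the corner (0,1)" criterion. The X, Y and T vectors are hypothetical stand-ins for perfcurve outputs, and the two criteria need not agree on real data.

```matlab
% Hypothetical ROC coordinates standing in for perfcurve outputs.
X = [0 0   0.1 0.2 0.5 1];   % false positive rate
Y = [0 0.4 0.7 0.9 1   1];   % true positive rate
T = [1 0.9 0.7 0.5 0.3 0];   % score threshold at each point

% Criterion 1: false positive rate and false negative rate as equal as possible.
[~, balIdx] = min(abs(X - (1 - Y)));

% Criterion 2: smallest Euclidean distance to the upper-left corner (0,1).
[~, cornerIdx] = min(hypot(X, 1 - Y));

balancedThreshold = T(balIdx);
cornerThreshold   = T(cornerIdx);
```

Both criteria treat the two error types as equally important; if that is not the case for you, a cost-weighted criterion is the better choice.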
Luke Hubbard on 27 Apr 2021
Edited: Luke Hubbard on 27 Apr 2021
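Follow the example for plotting the ROC curve, but also request the fifth output of perfcurve, OPTROCPT (the optimal operating point), and look up the threshold at that point: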
[X,Y,T,AUC,OPTROCPT] = perfcurve(labels,scores,posclass);
ThresholdForOptROCpt = T((X==OPTROCPT(1))&(Y==OPTROCPT(2)))
Dario Walter on 16 Jun 2020
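There is an output available in the perfcurve function that returns the point you are looking for:
[X,Y,T,~,OPTROCPT,suby,subnames] = perfcurve(...)
OPTROCPT is the optimal operating point of the ROC curve, returned as [FPR, TPR]. The matching threshold is the element of T at the index where X and Y equal those two values.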
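To go from the operating point to an actual threshold, something like the following sketch should work. It assumes the Statistics and Machine Learning Toolbox is available; the labels and scores are synthetic data made up for this example, and 1 is assumed to be the positive class.

```matlab
% Synthetic data for illustration only: 50 positives, 50 negatives.
rng(0);                                     % reproducible random scores
labels = [ones(50,1); zeros(50,1)];         % true class labels (1 = positive)
scores = [randn(50,1) + 1; randn(50,1)];    % classifier scores, positives shifted up

% Fifth output is the optimal operating point [FPR, TPR].
[X,Y,T,AUC,OPTROCPT] = perfcurve(labels, scores, 1);

% OPTROCPT is taken from the curve itself, so an exact match exists;
% take the first match in case the point appears more than once.
idx = find(X == OPTROCPT(1) & Y == OPTROCPT(2), 1);
optThreshold = T(idx);
```

Note that perfcurve picks this point using its default misclassification costs and the class proportions in the data; it also accepts a 'Cost' name-value argument if your error costs are not equal, so check the documentation for the exact matrix layout.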