How to get the mean of ROC curves using Matlab?

I have a problem plotting the mean ROC curve of a 10-fold cross-validation in MATLAB.
I run cvPartition = cvpartition(dataSize,'k', 10); to get the 10 folds of training and test data. However, because the partition randomly chooses the number of training and test observations, the ROC curves I get from the folds have different sizes. I want to plot the mean of the ten ROC curves I get from the cross-validation. Does anyone know how to do this? I read another post that solves the problem neatly in Python using 1-D interpolation, but I am not sure how to do this in MATLAB.
All the FPR and TPR values:
FPR_All =
Columns 1 through 9
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0.2500 0.2000 0 0.1667 0.1667 0.1429 0.3333 0.2000 0
0.5000 0.4000 0.2500 0.3333 0.3333 0.2857 0.6667 0.4000 0.3333
0.7500 0.6000 0.5000 0.5000 0.5000 0.4286 1.0000 0.6000 0.6667
1.0000 0.8000 0.7500 0.6667 0.6667 0.5714 NaN 0.8000 1.0000
NaN 1.0000 1.0000 0.8333 0.8333 0.7143 NaN 1.0000 NaN
NaN NaN NaN 1.0000 1.0000 0.8571 NaN NaN NaN
NaN NaN NaN NaN NaN 1.0000 NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN
Column 10
0
0
0.1429
0.2857
0.4286
0.5714
0.7143
0.8571
1.0000
NaN
TPR_All =
Columns 1 through 9
0 0 0 0 0 0 0 0 0
1.0000 1.0000 0.8333 1.0000 1.0000 1.0000 1.0000 1.0000 0.8571
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 NaN 1.0000 1.0000
NaN 1.0000 1.0000 1.0000 1.0000 1.0000 NaN 1.0000 NaN
NaN NaN NaN 1.0000 1.0000 1.0000 NaN NaN NaN
NaN NaN NaN NaN NaN 1.0000 NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN
Column 10
0
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
NaN

Answers (2)

I guess the matrix holds the ROC curves, with each column containing a number of elements followed by a few NaNs after the 1. Then you could take the mean of every row, after replacing the NaNs, using
matrix(isnan(matrix)) = 1; % replace NaN with 1
meanROC = mean(matrix,2); % mean of rows
The second argument tells MATLAB to take the mean along the 2nd dimension of the matrix, i.e. across each row. The NaNs must be replaced with ones, because otherwise the mean function would return NaN for every row that contains a NaN. The result is a column vector with the mean ROC value for each row.
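As a minimal sketch of the replacement-and-mean step, here is the same idea on a small made-up matrix. The name TPR_All is borrowed from the question, but the values below are hypothetical, and the layout (one fold per column, NaN padding at the bottom) is an assumption based on the printed data.

```matlab
% Hypothetical TPR matrix: one fold per column, padded with NaN
% once a fold's curve has ended (assumed layout, toy values).
TPR_All = [0    0    0;
           1    0.8  1;
           1    1    1;
           NaN  1    1;
           NaN  NaN  1];

padded = TPR_All;
padded(isnan(padded)) = 1;     % treat the padding as TPR = 1
meanTPR = mean(padded, 2);     % average across folds, one value per row

disp(meanTPR.')
```

Note that this averages by row index. In the FPR_All matrix from the question, the folds have different FPR values in the same row, so this is not the same as averaging the TPR at a fixed FPR.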

1 Comment

OK, thanks for the fast response, Erik. I am now using the perfcurve function to plot the 10 ROC curves:
[fpr,tpr,T,AUC] = perfcurve(test_Labelorginalouter, level,1);
plot(fpr,tpr)
I draw the ROC curve for every fold and plot all 10 folds in the same figure, but I can't draw the average of the ROC curves.


Ilya on 1 Sep 2016
Use perfcurve. Take a look at this piece of documentation. Pass true labels and predicted scores as cell arrays, one element per fold. You will get the mean curve and confidence intervals.
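A minimal sketch of that call pattern, assuming the Statistics and Machine Learning Toolbox is installed. The labels and scores below are synthetic, and the variable names labelsByFold and scoresByFold are made up for illustration; in practice each cell would hold one fold's test labels and classifier scores.

```matlab
% Synthetic stand-in for cross-validation output (hypothetical data).
rng(0);                                  % reproducible random scores
K = 5;                                   % number of folds in this toy example
labelsByFold = cell(K, 1);
scoresByFold = cell(K, 1);
for k = 1:K
    n = 40;
    y = [ones(n/2, 1); zeros(n/2, 1)];   % both classes present in every fold
    labelsByFold{k} = y;                 % one cell per fold: true labels
    scoresByFold{k} = y + 0.8*randn(n, 1);   % one cell per fold: scores
end

% One call on the cell arrays; fixing XVals requests averaging of the
% TPR at these FPR values across folds.
[X, Y, T, AUC] = perfcurve(labelsByFold, scoresByFold, 1, 'XVals', 0:0.05:1);

% With cell inputs, Y and AUC carry three columns:
% mean, lower confidence bound, upper confidence bound.
meanTPR = Y(:, 1);
```

The mean curve can then be drawn with plot(X(:,1), Y(:,1)), and the second and third columns of Y give the pointwise bounds.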

5 Comments

OK, thanks for the fast response, Ilya. I am now using the perfcurve function to plot the 10 ROC curves:
[fpr,tpr,T,AUC] = perfcurve(test_Labelorginalouter, level,1);
plot(fpr,tpr)
I draw the ROC curve for every fold and plot all 10 folds in the same figure, but I can't draw the average of the ROC curves. What I really want is to fix the FP rate and average the TP rate. How can I do that?
I already told you how to do it. Did you read the doc in the link?
Ali Algomae on 2 Sep 2016
Edited: Ali Algomae on 2 Sep 2016
Thanks Ilya. I used the function like this: [X,Y,T,AUC] = perfcurve(test_Labelorginalouter, level,1,'XVals',[0:.05:1]); Here we are using vertical averaging, is that right? The resulting ROC plot is shown in the picture, where the red curve is the average. Is this acceptable?
Yes, here you are using vertical averaging. I have trouble telling where red is, but if red is the smoothest line going midway, it looks sensible.
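For readers without perfcurve, vertical averaging can be sketched in base MATLAB with interp1: resample every fold's TPR on a common FPR grid, then average across folds. The two folds below are hypothetical toy curves, and taking the largest TPR at a repeated FPR value is one possible convention, not something stated in this thread.

```matlab
% Hypothetical ROC points: one fold per column, NaN-padded to equal length.
FPR = [0 0.50 1.00 NaN;
       0 0.25 0.50 1.00].';
TPR = [0 0.80 1.00 NaN;
       0 0.60 0.90 1.00].';

xGrid = (0:0.25:1).';                    % common FPR grid
nFolds = size(FPR, 2);
tprOnGrid = zeros(numel(xGrid), nFolds);

for k = 1:nFolds
    valid = ~isnan(FPR(:, k)) & ~isnan(TPR(:, k));   % drop the padding
    f = FPR(valid, k);
    t = TPR(valid, k);
    % interp1 needs distinct sample points; keep the largest TPR
    % wherever an FPR value repeats (assumed convention).
    [fu, ~, idx] = unique(f);
    tu = accumarray(idx, t, [], @max);
    tprOnGrid(:, k) = interp1(fu, tu, xGrid, 'linear');
end

meanTPR = mean(tprOnGrid, 2);            % vertical average at each grid FPR
```

With real ROC data, where FPR = 0 often appears with several TPR values, the max convention makes the resampled curve start at the highest of them; a different rule may suit other uses.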
Thanks Dr. Ilya for your response; the red curve is the average curve. Now I have this problem: sometimes all the TPR values are ones even though the predicted labels do not equal the test labels. Please help me find where the problem is.
model = svmtrain(train_Labelorginalouter, BestFeatureVector, '-s 0 -t 0 -c 100 '); % -c 128 -g .1250 //-c 81920 -g .5
[predicted_label, accuracy, level] = svmpredict(test_Labelorginalouter, testdataouter, model);
AccuracyFoldouter(i) = accuracy(1,1);
% Vals=[0:0.1:1];
test_Labelorginalouter = num2cell(test_Labelorginalouter);
level = num2cell(level);
[X,Y,T,AUC] = perfcurve(test_Labelorginalouter, level, 1, 'XVals', [0:.05:1]); %[0:.05:1]
TPR(i,:) = (Y(:,1))';
if (all(TPR(i,:) == 1))
    disp('Check')
    pause
end
and I get this warning:
Warning: One of the classes is not present in at least one subsample. You may get NaN confidence bounds.
> In perfcurve>xytOneSample (line 1347)
  In perfcurve>xyt/loopXYT (line 1315)
  In internal.stats.parallel.smartForSliceout (line 174)
  In perfcurve>xyt (line 1323)
  In perfcurve (line 554)
  In crossvalindAlff2 (line 83)
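One possible reading of this warning, not confirmed in the thread: num2cell places every observation in its own cell, so a function that treats each cell as one fold would see folds containing a single label each. The base-MATLAB sketch below only illustrates that difference in shape; the data and the names foldLabels, foldScores, labelsByFold and scoresByFold are hypothetical.

```matlab
% Hypothetical labels and scores for a single test fold.
foldLabels = [1; 1; 0; 0; 1];
foldScores = [0.9; 0.7; 0.4; 0.2; 0.6];

% num2cell puts every observation in its own cell, so each cell
% can contain only one class.
perObservation = num2cell(foldLabels);
classesPerCell = cellfun(@(c) numel(unique(c)), perObservation);
fprintf('num2cell: %d cells, at most %d class(es) per cell\n', ...
        numel(perObservation), max(classesPerCell));

% One cell per fold keeps both classes together.
K = 3;                                   % hypothetical number of folds
labelsByFold = cell(K, 1);
scoresByFold = cell(K, 1);
for i = 1:K
    labelsByFold{i} = foldLabels;        % in practice: fold i's test labels
    scoresByFold{i} = foldScores;        % in practice: fold i's decision values
end
classesPerFold = cellfun(@(c) numel(unique(c)), labelsByFold);
fprintf('per fold: %d cells, %d classes in each\n', K, min(classesPerFold));
```

Under this reading, the labels and scores would be collected per fold inside the loop, and perfcurve would be called once on the two cell arrays after the loop, as described in the answer above.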


Asked: 1 Sep 2016

Edited: 28 Dec 2017
