Many binary classification algorithms compute some kind of classification score (sometimes, but not always, a probability of being in the target state) and classify based on whether the score is above a certain threshold. Viewing the ROC curve lets you see the tradeoff between sensitivity and specificity for all possible thresholds, rather than just the one chosen by the modeling technique. Different classification objectives might make one point on the curve more suitable for one task and another point more suitable for a different task, so looking at the ROC curve is a way to assess the model independently of the choice of threshold.
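The threshold sweep described above can be sketched in a few lines of base R; the data here are simulated, purely for illustration:

```r
# Sketch (simulated data): sweep all thresholds of a score-based
# classifier and record the sensitivity/specificity tradeoff.
set.seed(42)
labels <- c(rep(1, 50), rep(0, 50))                     # true classes
scores <- c(rnorm(50, mean = 1), rnorm(50, mean = -1))  # classifier scores

thresholds <- sort(unique(scores))
sens <- sapply(thresholds, function(t) mean(scores[labels == 1] >= t))
spec <- sapply(thresholds, function(t) mean(scores[labels == 0] <  t))

# Each (1 - specificity, sensitivity) pair is one point on the ROC
# curve; raising the threshold trades sensitivity for specificity.
head(data.frame(threshold = thresholds, sensitivity = sens, specificity = spec))
```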
As I see it, the possibility of refusing to classify a case as "too uncertain" is the whole point of choosing a threshold (as opposed to simply assigning the class with the highest predicted probability).
Of course, you should have some justification for setting the threshold at 0.5: you could equally set it at 0.9 or any other value that is reasonable.
You describe a setup with mutually exclusive classes (a closed-world problem). "No class reaches the threshold" can always happen as soon as that threshold is higher than $1/n_{classes}$; i.e. the same problem occurs in a 2-class problem with a threshold of, say, 0.9. For a threshold of exactly $1/n_{classes}$ it could happen in theory, but in practice it is highly unlikely.
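A tiny sketch with made-up posterior probabilities shows the point:

```r
# Sketch: with mutually exclusive classes, "no class reaches the
# threshold" can happen whenever the threshold exceeds 1/n_classes.
posterior2 <- c(A = 0.6, notA = 0.4)          # 2-class example
posterior3 <- c(A = 0.4, B = 0.35, C = 0.25)  # 3-class example

any(posterior2 >= 0.9)   # FALSE: no class assigned at threshold 0.9
any(posterior3 >= 0.5)   # FALSE: same problem in the 3-class case
any(posterior3 >= 1/3)   # TRUE: at threshold 1/n_classes some class
                         #       must reach it (probabilities sum to 1)
```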
So your problem is not specific to the 3-class set-up, just more pronounced there.
To your second question: you can compute ROC curves for any kind of continuous output score; the scores don't even need to claim to be probabilities. Personally, I don't calibrate, because I don't want to spend another test set on that (I work with very restricted sample sizes). In any case, calibration won't change the shape of the ROC.
Answer to your comment:
The ROC conceptually belongs to a set-up that in my field is called single-class classification: does a patient have a particular disease or not? From that point of view, you can assign a 10% probability that the patient does have the disease. But this does not imply that with 90% probability he has something else that is well defined: the complementary 90% actually belongs to a "dummy" class, not having that disease. For some diseases and tests, finding everyone may be so important that you set your working point at a threshold of 0.1. A textbook example of choosing such an extreme working point is HIV testing of blood donations.
So for constructing the ROC for class A (you'd say: the patient is A positive), you look at class A posterior probabilities only. For binary classification with probability(not A) = 1 - probability(A), you don't need to plot the second ROC, as it does not contain any information that is not readily accessible from the first one.
In your 3-class set-up you can plot a ROC for each class. Depending on how you choose your thresholds, no class, exactly one class, or more than one class may be assigned. What is sensible depends on your problem. E.g. if the classes are "Hepatitis", "HIV", and "broken arm", then this policy is appropriate, as a patient may have none or all of these.
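With hypothetical posterior probabilities, this per-class thresholding policy might look like:

```r
# Sketch: one threshold/ROC per class in a 3-class setting.
# Hypothetical posterior probabilities; each row sums to 1.
set.seed(1)
p <- matrix(runif(150 * 3), ncol = 3)
p <- p / rowSums(p)
colnames(p) <- c("Hepatitis", "HIV", "broken arm")

# Thresholding each class independently: a case may end up with
# no class, exactly one class, or several classes assigned.
threshold <- 0.4
assigned <- p >= threshold
table(rowSums(assigned))  # number of classes assigned per case
```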
Best Answer
For an overall explanation of how ROC curves are computed consider this excellent answer: https://stats.stackexchange.com/a/105577/112731
To your question: first, if you want to compare different approaches, comparing their ROC curves and area under the curve (AUC) values directly is a good idea, as these give you overall information about how powerful your approaches are on your problem.
Second: you will need to choose a threshold appropriate for your goal. The tradeoff here is that you will need to decrease one of TPR (true positive rate, or sensitivity) or TNR (true negative rate, or specificity) in order to increase the other; there is no way around this$^1$. So, depending on your problem, you might e.g. need a low false positive rate (FPR = 1 - TNR), which in turn requires a high TNR, so this will definitely depend on the details of your problem.
Having said this, to choose a threshold you will usually look at both the ROC curve and the distribution of TPR and TNR over the threshold. Those should provide the information required to choose a reasonable tradeoff. As you want to do this in R, here's a minimal example of what this could look like:
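A minimal sketch in base R, with simulated data (packages like pROC provide the same via `roc()` and `coords()`):

```r
# Sketch (simulated data): plot TPR and TNR over the threshold to
# pick a working point.
set.seed(123)
y <- rbinom(200, 1, 0.5)                        # true classes
prob <- plogis(2 * y - 1 + rnorm(200))          # predicted probabilities

thresholds <- seq(0, 1, by = 0.01)
tpr <- sapply(thresholds, function(t) mean(prob[y == 1] >= t))
tnr <- sapply(thresholds, function(t) mean(prob[y == 0] <  t))

plot(thresholds, tpr, type = "l", xlab = "threshold", ylab = "rate")
lines(thresholds, tnr, lty = 2)
legend("right", legend = c("TPR", "TNR"), lty = 1:2)

# threshold where TPR and TNR are about equal
thresholds[which.min(abs(tpr - tnr))]
```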
So in this example, for about equal TPR and TNR, you would want to choose a threshold around 0.5. If you instead wanted, e.g., a very low FPR, you would choose a higher threshold. After choosing a threshold, you can use the predicted class probabilities to immediately determine the predicted class.

$^1$ For completeness: this is how predicted class probabilities from your model are turned into either a "positive" prediction (usually above the threshold) or a "negative" prediction (usually below the threshold).
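Determining the predicted class from the probabilities might look like this (hypothetical values):

```r
# Sketch: turn predicted class probabilities into class labels
# using the chosen threshold.
prob <- c(0.17, 0.94, 0.51, 0.03)  # hypothetical predicted probabilities
threshold <- 0.5
pred <- ifelse(prob >= threshold, "positive", "negative")
pred  # "negative" "positive" "positive" "negative"
```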
Update:

As you just asked how this would be done with e.g. `nnet()`, here's a minimal example. Please note that training on all data will lead to overfitting, so you should instead use techniques like cross validation and resampling (e.g. with the `caret` package, as shown above; there you would just need to set `method='nnet'` to use this model, and could provide hyperparameters in the `tuneGrid` parameter).
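A minimal sketch of such a fit, using the iris data reduced to two classes as stand-in data (remember the overfitting caveat above):

```r
# Sketch with nnet() (package "nnet" ships with R): fit a small
# neural network and get predicted class probabilities.
library(nnet)
set.seed(42)
d <- droplevels(iris[iris$Species != "setosa", ])  # two-class stand-in data

fit <- nnet(Species ~ ., data = d, size = 3, trace = FALSE)

# for a 2-level factor, predict(type = "raw") returns the predicted
# probability of the second factor level
prob <- predict(fit, d, type = "raw")

threshold <- 0.5
pred <- ifelse(prob >= threshold, levels(d$Species)[2], levels(d$Species)[1])
head(pred)
```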