ROC Curve – Why Does It Have a Sharp Edge?

predictive-models, r, random forest, roc

I was working on a random forest model in R and got a ROC curve that looks like this. This is very odd since there is no curvature. The data has mostly qualitative features, with only 2-3 quantitative features. The data is also imbalanced (80% of my data is in one class and 20% in the second class). The model seems to favor sensitivity over specificity. The other issue is that the performance object reports only 3 data points. I have a lot of data, so I am not sure why it says only 3 points. Could someone explain what is happening with this ROC curve and why there is no curvature?

This is my code for the ROC curve plot.

library(ROCR)

# Test
pred <- prediction(as.numeric(predict(forest.out, newdata = x_test, type = "response")),
                   labels = as.numeric(y_test$Success))
perf <- performance(pred, measure = "tpr", x.measure = "fpr")

# Plot
plot(perf, xlab = "1-specificity", ylab = "sensitivity", main = "ROC curve")
abline(0, 1, col = "grey")
plot(perf, add = TRUE, col = "red")

[Image: ROC curve made of two straight line segments meeting at a sharp corner]

Best Answer

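If you read the documentation for random forest's predict, you'll see that type = "response" returns predicted class labels, not scores. After as.numeric, those labels take only two distinct values, so there is only one informative threshold and the ROC curve has exactly three points: (0, 0), a single operating point, and (1, 1), joined by two straight lines. That is why performance reports only 3 points no matter how much data you have. If you change the type to return class probabilities (or votes), for example type = "prob" in the randomForest package, you will likely get a more typical ROC shape: many stair steps instead of 2 lines.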
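
As a rough illustration of the difference, here is a minimal sketch. It assumes the model comes from the randomForest package (the question does not say which package was used) and uses a hypothetical two-class dataset built from iris as a stand-in for the real data. The names dat, train, test and fit are made up for the example. With the objects from the question, the equivalent change would probably be predict(forest.out, newdata = x_test, type = "prob")[, 2], assuming the second column is the positive class.

```r
# Sketch only: assumes the randomForest and ROCR packages are installed.
library(randomForest)
library(ROCR)

set.seed(1)

# Hypothetical stand-in data: a two-class problem built from iris.
dat <- iris[iris$Species != "setosa", ]
dat$Success <- factor(as.integer(dat$Species == "virginica"))  # levels "0", "1"
dat$Species <- NULL

idx   <- sample(nrow(dat), 70)
train <- dat[idx, ]
test  <- dat[-idx, ]

fit <- randomForest(Success ~ ., data = train)

# Hard class labels: only two distinct values after as.numeric().
lab  <- as.numeric(predict(fit, newdata = test, type = "response"))

# Class probabilities for the positive class "1": many distinct values.
prob <- predict(fit, newdata = test, type = "prob")[, "1"]

perf_lab  <- performance(prediction(lab,  test$Success),
                         measure = "tpr", x.measure = "fpr")
perf_prob <- performance(prediction(prob, test$Success),
                         measure = "tpr", x.measure = "fpr")

# Number of points on each curve: the label-based curve has one point per
# distinct score plus the (0, 0) starting point.
length(perf_lab@x.values[[1]])
length(perf_prob@x.values[[1]])

plot(perf_prob, col = "red", main = "ROC curve")
plot(perf_lab, add = TRUE, col = "blue")   # two straight segments
abline(0, 1, col = "grey")
```

The label-based curve collapses to a single operating point because every observation is scored as one of two values, while the probability-based curve gets a step for each distinct predicted probability in the test set.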