Solved – Equivalent of AUC (area under the ROC curve) for two variables

auc, classification

I was wondering if there is a way to compute AUC using two variables instead of one as predictors.

I have two populations after a follow-up period, divided into cases and controls according to whether or not they developed a pathology during the follow-up. The data also contain two independent variables, and I would like to find a way to combine them to see whether any combination improves predictive performance.

I've already tried combining many variables (39 in total, including the two this question is about) with principal component analysis (PCA), but it did not improve predictive performance, so I want to try something else. I know that some people use the C-index, but I know nothing about it.

Clarification. Suppose the two variables are a measure of the length of a heartbeat, called RR, and a measure of the quantity of blood ejected at every heartbeat, called LVEF. I know that these two variables yield AUCs of 0.61 and 0.65, respectively, when used separately. Is there a way I could combine these two AUCs for RR and LVEF?

Best Answer

If you want to compute the AUC for the combination of two variables, you can include both as predictors in a logistic regression model and compute the AUC from the model's predictions. Here is some sample code in R, using the pROC package:

library(pROC)
data(infert)
# AUC for predicting case from the single variable age
auc(case ~ age, data = infert)
# Logistic regression model combining two predictors
mod1 <- glm(case ~ age + parity, data = infert, family = "binomial")
# AUC of the model's predictions; predict() returns the linear predictor here,
# which is fine because the AUC depends only on the ranking of the scores
auc(case ~ predict(mod1), data = infert)
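
One caveat worth sketching: an AUC computed on the same data used to fit the model tends to be optimistic, so a cross-validated estimate may be a fairer basis for judging whether combining the two variables helps. The block below is a minimal sketch of that idea in base R, not a definitive implementation. The helper `auc_rank`, the choice of 5 folds, and the seed are assumptions made for illustration and are not part of the original answer. `auc_rank` uses the rank-based (Mann-Whitney) form of the AUC, which for a binary outcome is the same quantity as the C-index mentioned in the question.

```r
# Hypothetical helper (not from pROC): rank-based (Mann-Whitney) AUC.
# For a binary outcome this is the same quantity as the C-index.
auc_rank <- function(score, label) {
  r  <- rank(score)               # average ranks, so ties count as 1/2
  n1 <- sum(label == 1)           # number of cases
  n0 <- sum(label == 0)           # number of controls
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

data(infert)

# In-sample AUC of the two-variable model (fitted and scored on the same rows).
fit    <- glm(case ~ age + parity, data = infert, family = binomial)
auc_in <- auc_rank(predict(fit), infert$case)

# 5-fold cross-validation (fold count and seed are arbitrary choices):
# each row is scored by a model that was fitted without it.
set.seed(1)
k        <- 5
fold     <- sample(rep(seq_len(k), length.out = nrow(infert)))
cv_score <- numeric(nrow(infert))
for (i in seq_len(k)) {
  test  <- fold == i
  fit_i <- glm(case ~ age + parity, data = infert[!test, ], family = binomial)
  cv_score[test] <- predict(fit_i, newdata = infert[test, ])
}

# AUC of the pooled out-of-fold scores.
auc_cv <- auc_rank(cv_score, infert$case)

print(c(in_sample = auc_in, cross_validated = auc_cv))
```

With your own data, you would replace `infert`, `case`, `age` and `parity` with your data frame, outcome, RR and LVEF. Pooling the out-of-fold scores before computing one AUC is only one of several reasonable conventions; averaging per-fold AUCs is another.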