Does AUC for multiple logistic regression make sense if prediction is not the goal

auc, logistic, multiple regression

Does it make sense to calculate the AUC if I do not want to use my multiple logistic regression model for prediction? I only want to calculate some odds ratios, test whether the variables in my model have a significant influence, and adjust for some covariates.

Best Answer

The AUC computed on the same data you used to fit the model doesn't actually tell you how well your model will predict out of sample. If you want that, you need to cross-validate and take the mean out-of-sample AUC across the folds.
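As a rough sketch of what "cross-validate and take the mean out-of-sample AUC" could look like, here is some Python that uses only the standard library. Everything in it is an assumption for illustration: the simulated data, the hand-rolled gradient-ascent fit (`fit_logistic`), the 5 unstratified folds, and all helper names. In practice you would more likely rely on your statistics package's own logistic regression and cross-validation routines.

```python
import math
import random

def predict_one(w, xi):
    """Predicted probability for one row; w[0] is the intercept."""
    z = w[0] + sum(wj * xij for wj, xij in zip(w[1:], xi))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Illustrative fit by plain gradient ascent on the mean log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict_one(w, xi)
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def auc(y_true, p_hat):
    """Share of (y=1, y=0) pairs ordered correctly; ties count as half.
    Assumes both outcome classes are present."""
    pos = [p for yi, p in zip(y_true, p_hat) if yi == 1]
    neg = [p for yi, p in zip(y_true, p_hat) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def cv_auc(X, y, k=5, seed=0):
    """Mean AUC over k held-out folds (folds are not stratified here)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        w = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        p = [predict_one(w, X[i]) for i in test_idx]
        aucs.append(auc([y[i] for i in test_idx], p))
    return sum(aucs) / k

# Simulated data with two covariates; the "true" coefficients are made up.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if rng.random() < predict_one([0.0, 1.5, -1.0], xi) else 0 for xi in X]

w_full = fit_logistic(X, y)
in_sample_auc = auc(y, [predict_one(w_full, xi) for xi in X])
mean_cv_auc = cv_auc(X, y, k=5)
print("in-sample AUC:", round(in_sample_auc, 3))
print("mean out-of-sample AUC:", round(mean_cv_auc, 3))
```

The in-sample number describes how well the fitted model orders the data it has already seen; the cross-validated mean is the one that speaks to new data.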

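More basically, the AUC tells you how well ordered your predicted probabilities are. Take two units, $i$ and $i'$, with different observed outcomes. If $p(y_i=1) > p(y_{i'}=1)$, then you would prefer that $y_i = 1$ and $y_{i'} = 0$, so that the unit with the higher predicted probability is the one that actually had the event. The AUC is the proportion of all such pairs (one unit with $y = 1$, one with $y = 0$) for which that is the case, with ties in predicted probability usually counted as one half.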
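To make the pairwise definition concrete, here is a minimal sketch in Python that counts correctly ordered pairs directly. The function name `pairwise_auc` and the five data points are made up for illustration, and counting ties as one half is the usual convention rather than something stated above.

```python
from itertools import product

def pairwise_auc(y_true, p_hat):
    """Fraction of (y=1, y=0) pairs in which the y=1 unit has the higher
    predicted probability; tied probabilities count as half a pair."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one y=1 and one y=0 observation")
    score = 0.0
    for p1, p0 in product(pos, neg):
        if p1 > p0:
            score += 1.0
        elif p1 == p0:
            score += 0.5
    return score / (len(pos) * len(neg))

# Made-up outcomes and predicted probabilities, for illustration only.
y = [1, 0, 1, 0, 1]
p = [0.9, 0.3, 0.4, 0.6, 0.8]
result = pairwise_auc(y, p)
print(result)  # 5 of the 6 (y=1, y=0) pairs are correctly ordered, about 0.833
```

The only misordered pair is the unit with $y = 1$ and predicted probability 0.4 against the unit with $y = 0$ and predicted probability 0.6.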

So computing the AUC for your model in-sample gives you one kind of information about the model's performance or goodness of fit: how well it discriminates between the two outcomes in the data it was fit to. You don't need it in order to compute or interpret odds ratios, but it doesn't hurt to have as one more indication of whether your model is decent.

To get a fuller sense of how the AUC works, it may help you to read this excellent Cross Validated thread: How to calculate Area Under the Curve (AUC), or the c-statistic, by hand.
