Solved – With multinomial regression, how to predict an event and get the ROC curve

predictive-models, r, regression

I'm using the multinom function (from the nnet package) in R to fit a multinomial logistic regression model. My dependent variable has 3 levels, and as output I get the predicted probability for each level.

Currently, I have the VIF, AIC, p-values, and confusion matrix for the model.

I have the following questions:

  1. I want a single output based on the probabilities. How do I decide a "cut-off" for deciding the "best event"?

  2. Does it make sense to get an ROC curve here? If yes, then how do I get one?

  3. What are the things I should look at for the validation of the model?

Best Answer

  1. The cutoff (if one makes sense in your problem, for example when you are making actual decisions rather than only evaluating the model) should be chosen by weighing the consequences of the different kinds of error. Depending on your goals, you might want to maximize specificity, maximize sensitivity, or find a compromise between the two. A paper I read recently advocates the threshold that minimizes the difference between sensitivity and specificity when both types of error are weighted equally [1]. On the ROC plot, that is the point where the curve crosses the line joining the northwestern-most and southeastern-most corners, i.e. where sensitivity = specificity.
  2. Yes. You can obtain one-vs-rest ROC curves (each level against all the others combined) and compute their AUCs. You can also estimate a single AUC value for multiclass classification using the strategy described by Hand & Till [2].
  3. This is too broad to answer fully, but make sure you choose reasonable resampling strategies and evaluation metrics. Nested cross-validation, or alternatively double bootstrap validation (or another kind of nested resampling), can be used if you want to estimate both the performance of your model on unseen data and the applicability of your model selection method.
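
Points 1 and 2 can be illustrated with a minimal sketch. It rests on several assumptions that are not part of the question: the built-in iris data stands in for the real data set, the pROC package (not mentioned above) is used for the ROC computations, and the cutoff rule is the "sensitivity closest to specificity" one from [1]. The curves are computed on the training data, so the numbers are optimistic. Treat it as a starting point rather than a definitive recipe.

```r
# Sketch only: iris (3-level outcome "Species") is a placeholder for the real data.
library(nnet)   # multinom()
library(pROC)   # roc(), coords(), auc(), multiclass.roc()

fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)

# n x 3 matrix of predicted probabilities; columns are named after the levels
probs <- predict(fit, type = "probs")

# A single predicted class without any cutoff: the most probable level
pred_class <- predict(fit, type = "class")
conf_mat <- table(observed = iris$Species, predicted = pred_class)

# One-vs-rest ROC curve for each level, plus the cutoff at which
# sensitivity and specificity are closest to each other
ovr <- lapply(levels(iris$Species), function(lev) {
  is_lev <- as.integer(iris$Species == lev)          # 1 = this level, 0 = the rest
  r <- roc(is_lev, probs[, lev], levels = c(0, 1),   # higher probability -> case
           direction = "<", quiet = TRUE)
  xy <- coords(r, x = "all",
               ret = c("threshold", "sensitivity", "specificity"),
               transpose = FALSE)
  best <- xy[which.min(abs(xy$sensitivity - xy$specificity)), ]
  data.frame(level = lev, auc = as.numeric(auc(r)), best, row.names = NULL)
})
ovr <- do.call(rbind, ovr)
print(ovr)

# Single multiclass AUC in the spirit of Hand & Till [2];
# the predictor is the whole probability matrix
mc <- multiclass.roc(iris$Species, probs)
print(mc$auc)
```

Each row of `ovr` gives the one-vs-rest AUC for one level and the probability cutoff at which that level's sensitivity and specificity are most nearly equal. If different errors carry different costs, the `which.min(...)` rule would be replaced by whatever criterion reflects those costs.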

[1] Lobo, J. M., Jiménez‐Valverde, A., & Real, R. (2008). AUC: a misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography, 17(2), 145-151.

[2] Hand, D. J., & Till, R. J. (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning, 45(2), 171-186.