Solved – How to calculate probability percentage for logistic regression with threshold

logistic probability roc threshold

I am trying to understand how probability works with a threshold for logistic regression.

  1. I understand the basics of how to calculate probability.

    log_odds = intercept + value1 * coef1
    odds = exp(log_odds)
    prob = odds / (1 + odds)
    
  2. I understand that a threshold is chosen to get the best trade-off among prediction metrics (precision, recall, F1, etc.).

However, how do we interpret a probability in light of a threshold? For example, if the threshold is 0.195 and a user has a predicted probability of 0.0975, are they:

  • 50% likely to respond (1) since they are 50% towards the threshold?
  • Or are they still 9.75% likely (probability 0.0975) to respond (1), regardless of the fact that anyone with a probability above 0.195 is predicted to respond (1)?

Best Answer

They are predicted to have a $0.0975$ probability of responding $1$. The threshold you have chosen has no effect on the probability; it only affects what you do with the predicted probability afterwards. I should note that these are just the model's estimates. They need not be the true probabilities of responding $1$, since models can certainly be wrong!
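To make the separation concrete, here is a minimal Python sketch. The intercept, coefficient, and predictor value are made up (they are not from the question) and are chosen only so that the predicted probability comes out near $0.0975$; `predicted_probability` and `classify` are hypothetical helper names. The point is that the threshold appears only in the second function and never feeds back into the probability.

```python
import math

def predicted_probability(intercept, coef, value):
    """Model's estimate of P(response = 1) from the log odds."""
    log_odds = intercept + coef * value
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)

def classify(prob, threshold):
    """Decision rule applied to a probability that is already computed."""
    return 1 if prob >= threshold else 0

# Hypothetical fitted values, picked so the probability is roughly 0.0975.
intercept, coef, value = -3.0, 0.5, 1.55

prob = predicted_probability(intercept, coef, value)

# The probability is the same for every threshold; only the label changes.
for threshold in (0.05, 0.195, 0.5):
    label = classify(prob, threshold)
    print(f"threshold={threshold}: prob={prob:.4f}, label={label}")
```

With the threshold at $0.195$ this user is labeled $0$, and with a threshold of $0.05$ the same user would be labeled $1$, but in both cases the predicted probability stays at about $0.0975$.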

As a final note, I need to point out that positing a threshold and labeling every observation above it as $1$ is generally not a good thing to do. There is more information in the predicted probability than in the resulting classification.
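As a rough illustration of what gets lost, consider four users with made-up predicted probabilities (only the $0.0975$ and the $0.195$ threshold come from the question). This is just a sketch of the idea, not a recommendation for any particular metric.

```python
threshold = 0.195

# Hypothetical predicted probabilities for four users.
probs = [0.0975, 0.19, 0.20, 0.95]

# Hard labels: everything at or above the threshold becomes 1.
labels = [1 if p >= threshold else 0 for p in probs]
print(labels)

# Expected number of responders implied by the probabilities,
# versus the count implied by the hard labels.
expected_responders = sum(probs)
labeled_responders = sum(labels)
print(expected_responders, labeled_responders)
```

The users at $0.19$ and $0.20$ are nearly indistinguishable to the model yet receive different labels, while the users at $0.20$ and $0.95$ receive the same label despite very different estimated chances of responding.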
