Probability – Understanding the Intuition Behind the Odds Scale

Tags: intuition, logistic, odds, odds-ratio, probability

What is an intuitive explanation of the odds scale?

In a logistic regression such as $$\operatorname{logit}(p) = \beta_0 + \beta_1 x$$
we often interpret $\beta_1$ by looking at the odds ratio, $e^{\beta_1}$, which has the
interpretation that a unit increase in $x$ is associated with a change in the odds of "success" by a factor of $e^{\beta_1}$.

Say I'm making basketball shots, and my successful shots are well modeled by a logistic regression $\operatorname{logit}(p) = \beta_0 + \beta_1 x$, where $x$ is the distance from the basket in meters. If $e^{\beta_1} = 0.5$, then each meter that I step farther from the basket halves my odds of making the shot. This "sounds" fine, but I don't have an intuition about what halving or doubling my odds means.

I thought of one interpretation of odds, which is the following: my racehorse is in a race with 9 other horses, and all 10 are of equal ability. So each has odds of 1:9 of winning. Then one way of thinking about the odds ratio is that halving my odds, or doubling my odds-against, is like doubling the number of opposing horses to 18.

In site searches I haven't found any intuitive interpretation: one answer says it's not intuitive, and another suggests that when people say "twice as likely" they aren't clear which scale is being used.

Best Answer

In the frequency interpretation, probability is the number of successful shots divided by the total number of shots (at each distance $x$). The odds are the number of successful shots per failure. That seems to be an intuitive description!

So, in your example, standing 1 meter farther from the basket halves the odds: for the same number of failures, you make half as many successful shots.
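To see what halving the odds does on the probability scale, here is a minimal R sketch. The helper names `prob_to_odds` and `odds_to_prob` and the starting probabilities are made up for illustration; they do not come from the question.

```r
# Hypothetical helpers (names are assumptions, not from the original post)
prob_to_odds <- function(p) p / (1 - p)
odds_to_prob <- function(o) o / (1 + o)

# Illustrative starting probabilities of making the shot
p_before <- c(0.9, 0.5, 0.1)

# Stepping one meter back multiplies the odds by exp(beta_1) = 0.5
odds_before <- prob_to_odds(p_before)
odds_after  <- 0.5 * odds_before
p_after     <- odds_to_prob(odds_after)

print(data.frame(p_before, odds_before, odds_after, p_after))
```

Halving the odds roughly halves the probability only when the probability is small (0.1 becomes about 0.053). Starting from 0.9, the same halving of the odds moves the probability only to about 0.82.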

In your other example, with the horses, your interpretation seems fine.
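As a quick check of the horse example, odds can be converted to a probability with $p = \text{odds}/(1 + \text{odds})$:

$$\text{odds} = \tfrac{1}{9} \;\Rightarrow\; p = \frac{1/9}{1 + 1/9} = \frac{1}{10}, \qquad \text{odds} = \tfrac{1}{18} \;\Rightarrow\; p = \frac{1/18}{1 + 1/18} = \frac{1}{19}$$

So halving the odds from 1:9 to 1:18 takes the probability of winning from $1/10$ to $1/19$, which is slightly more than half of the original probability.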

Related Solutions

Solved – Interpretation of Odds Ratio of Zero

It's easiest to illustrate what is going on with a simple example with a single predictor that is dichotomous (e.g., to distinguish two groups). Suppose these are the data (using R for illustration):

y   <- c(0,0,0,1,1,0,0,0,0,0)
grp <- c(0,0,0,0,0,1,1,1,1,1)
cbind(grp, y)

So:

      grp y
 [1,]   0 0
 [2,]   0 0
 [3,]   0 0
 [4,]   0 1
 [5,]   0 1
 [6,]   1 0
 [7,]   1 0
 [8,]   1 0
 [9,]   1 0
[10,]   1 0

There are 5 observations in each group. In group 0 (the reference group), there are 2 events and 3 non-events, so the odds of the event are $2/3$ and the log odds are $\ln(2/3) = -0.4055$. In group 1, there are 0 events, so the odds of the event are $0/5 = 0$ and the log odds are $\ln(0/5) = -\infty$. So, the odds ratio of the event happening in group 1 versus group 0 is $(0/5)/(2/3) = 0$, and the log odds ratio is $\ln((0/5)/(2/3)) = -\infty$ or, equivalently, $\ln(0/5) - \ln(2/3) = -\infty$.
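The hand calculation above can be reproduced directly from the data. This is only a sketch; the variable names `events`, `non_events`, `odds`, `log_odds`, and `odds_ratio` are introduced here for illustration.

```r
y   <- c(0,0,0,1,1,0,0,0,0,0)
grp <- c(0,0,0,0,0,1,1,1,1,1)

# Count events and non-events within each group
events     <- tapply(y, grp, sum)
non_events <- tapply(1 - y, grp, sum)

odds     <- events / non_events  # group 0: 2/3, group 1: 0/5
log_odds <- log(odds)            # log(0) is -Inf in R

# Odds ratio of group 1 versus group 0, and its log
odds_ratio <- as.numeric(odds["1"] / odds["0"])
print(odds)
print(log_odds)
print(c(odds_ratio = odds_ratio, log_odds_ratio = log(odds_ratio)))
```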

Now let's actually fit the model:

res <- glm(y ~ grp, family=binomial)
summary(res)

This yields:

Call:
glm(formula = y ~ grp, family = binomial)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.01077  -0.75810  -0.00008  -0.00008   1.35373  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   -0.4055     0.9129  -0.444    0.657
grp          -19.1606  4809.3409  -0.004    0.997

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 10.0080  on 9  degrees of freedom
Residual deviance:  6.7301  on 8  degrees of freedom
AIC: 10.73

Number of Fisher Scoring iterations: 18

So, the estimated intercept is $-0.4055$, which is the log odds in group 0. The coefficient for grp is the log odds ratio, which is estimated to be $-19.1606$. Hmmm, that's not quite $-\infty$. But after exponentiation, we get the odds ratio, which we can round to, let's say, 8 digits:

round(exp(coef(res)[2]), 8)

And that is in essence zero. The coefficient for grp is not $-\infty$ because the iterative fitting algorithm stops after a finite number of steps, which is what happens when there is (quasi-)complete separation in the data (and to answer that part of your question: that is indeed exactly what is going on here). But for all practical purposes, the model implies an odds ratio that is in essence zero.
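Another way to see this is to look at the fitted probabilities rather than the coefficients. The following sketch refits the same model and uses `predict()` with `type = "response"`; the name `p_hat` is introduced here for illustration.

```r
y   <- c(0,0,0,1,1,0,0,0,0,0)
grp <- c(0,0,0,0,0,1,1,1,1,1)

res <- glm(y ~ grp, family = binomial)

# Fitted probability of the event in group 0 and in group 1
p_hat <- predict(res, newdata = data.frame(grp = c(0, 1)), type = "response")
print(p_hat)
```

The fitted probability for group 0 should match the observed proportion of $2/5$, while the one for group 1 should be a tiny positive number rather than exactly zero, which mirrors the large negative (but finite) coefficient.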

Solved – logit – interpreting coefficients as probabilities

These odds ratios are the exponential of the corresponding regression coefficient:

$$\text{odds ratio} = e^{\hat\beta}$$

For example, if the logistic regression coefficient is $\hat\beta=0.25$, the odds ratio is $e^{0.25} \approx 1.28$.

The odds ratio is the multiplier that shows how the odds change for a one-unit increase in the value of X. Here, the odds increase by a factor of 1.28. So if the initial odds were, say, 0.25, the odds after a one-unit increase in the covariate become $0.25 \times 1.28 = 0.32$.

Another way to try to interpret the odds ratio is to look at the fractional part and interpret it as a percentage change. For example, the odds ratio of 1.28 corresponds to a 28% increase in the odds for a 1-unit increase in the corresponding X.

If we are dealing with a decreasing effect (OR < 1), for example an odds ratio of 0.94, then there is a 6% decrease in the odds for a 1-unit increase in the corresponding X.

The formula is:

$$ \text{Percent Change in the Odds} = \left( \text{Odds Ratio} - 1 \right) \times 100 $$
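A small R sketch of this formula, applied to the two examples above. The function name `pct_change_in_odds` and the starting odds of 0.25 are assumptions made for illustration.

```r
# Hypothetical helper implementing (odds ratio - 1) * 100
pct_change_in_odds <- function(beta) (exp(beta) - 1) * 100

print(pct_change_in_odds(0.25))       # about 28.4
print(pct_change_in_odds(log(0.94)))  # about -6

# The change in probability depends on the starting odds (assumed 0.25 here)
odds_before <- 0.25
odds_after  <- odds_before * exp(0.25)
p_before <- odds_before / (1 + odds_before)
p_after  <- odds_after / (1 + odds_after)
print(c(p_before = p_before, p_after = p_after))
```

Note that the 28% applies to the odds, not to the probability: with these starting odds the probability moves from 0.20 to about 0.24, and a different starting point would give a different change in probability.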
