Solved – Probabilities vs. Odds Ratios

odds, odds-ratio, probability

Suppose we know that the probability of a female getting into a program is $p=.7$ and that the probability for males is $q=1-.7=.3$. Then we know:

$$\mbox{odds}(\mbox{female}) = .7/.3 = 2.33333$$
$$\mbox{odds}(\mbox{male}) = .3/.7 = .42857$$

We could use this information to compute an odds ratio:

$$OR = 2.3333/.42857 = 5.44$$

Thus a female is 5.44 times more likely to get in.

But since the probability of an event is just $\frac{p}{p+q}$, the probability of a male getting in is 30%, while the probability of a female getting in is 70%. We might then reason that the probability of a female getting in is roughly double (a factor of 2.333). What are we trying to accomplish with the odds ratio, and why does it give such a different answer from comparing probabilities?
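For reference, the arithmetic in the question can be reproduced with a short Python sketch. The helper `odds` and the variable names are made up for illustration; the two probabilities are the ones stated above.

```python
def odds(p):
    """Convert a probability into odds in favour of the event."""
    return p / (1 - p)

p_female = 0.7  # probability stated in the question
p_male = 0.3    # probability stated in the question

odds_female = odds(p_female)            # 0.7 / 0.3
odds_male = odds(p_male)                # 0.3 / 0.7
odds_ratio = odds_female / odds_male    # ratio of the two odds
prob_ratio = p_female / p_male          # ratio of the two probabilities

print("odds (female):     ", round(odds_female, 4))
print("odds (male):       ", round(odds_male, 4))
print("odds ratio:        ", round(odds_ratio, 4))
print("probability ratio: ", round(prob_ratio, 4))
```

Because the two probabilities here happen to sum to 1, the odds ratio is exactly the square of the probability ratio, which is why 2.333 shows up both as the female odds and as the ratio of probabilities.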

Best Answer

In an observational study, the odds ratio can be calculated either by conditioning on exposure ($E$ and its complement $E'$) or outcome ($C$ and its complement $C'$):

$\psi = \frac{P(C|E)/P(C'|E)}{P(C|E')/P(C'|E')} = \frac{P(E|C)/P(E'|C)}{P(E|C')/P(E'|C')}$
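As a rough check of this identity, here is a small Python sketch on a $2\times2$ table. The counts are invented purely for illustration (they are not from the question), and exact fractions are used so the two sides can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only:
#                admitted (C)   rejected (C')
# female (E)        a = 70         b = 30
# male   (E')       c = 40         d = 60
a, b, c, d = 70, 30, 40, 60

def odds(p):
    """Odds in favour of an event with probability p."""
    return p / (1 - p)

# Conditioning on exposure: P(C|E) and P(C|E')
p_C_given_E = Fraction(a, a + b)
p_C_given_notE = Fraction(c, c + d)
or_by_exposure = odds(p_C_given_E) / odds(p_C_given_notE)

# Conditioning on outcome: P(E|C) and P(E|C')
p_E_given_C = Fraction(a, a + c)
p_E_given_notC = Fraction(b, b + d)
or_by_outcome = odds(p_E_given_C) / odds(p_E_given_notC)

# Cross-product shortcut for a 2x2 table of counts
cross_product = Fraction(a * d, b * c)

print(or_by_exposure, or_by_outcome, cross_product)
```

All three quantities agree on this table, and the last line is the usual cross-product shortcut $ad/(bc)$ for a table of counts.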

In your example, the binary outcome is whether a subject gets into the program ($C$) or not ($C'$), and the exposure is being female ($E$) versus being male ($E'$).

The problem is that your data is ambiguous. You say:

"Suppose we know that the probability of a female getting into a program is $p=0.7$"

If this means $P(C|E)=0.7$ then clearly $P(C'|E)=1-P(C|E)=0.3$. But we have no information about $P(C|E')$ or $P(C'|E')=1-P(C|E')$. In other words, what's the "probability of a male getting into the program"? It need not be 0.3: that number is $P(C'|E)$, the probability that a female does not get in, which says nothing about males.

Likewise, if you interpret that as $P(E|C)=0.7$ then $P(E'|C)=1-P(E|C)=0.3$. But now you're missing $P(E|C')$ and $P(E'|C')=1-P(E|C')$.

So you need to ask yourself whether that $0.7$ number is the probability of getting into the program for females, or the probability of being a female for those who got into the program.
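To make the distinction concrete, here is a minimal Python sketch on a made-up admissions table. The counts and variable names are assumptions chosen only so that $P(C|E)=0.7$; nothing here comes from real data.

```python
from fractions import Fraction

# Hypothetical admission counts, invented for illustration only.
female_in, female_out = 70, 30   # E:  admitted, rejected
male_in, male_out = 20, 80       # E': admitted, rejected

# Reading 1: condition on sex (the exposure).
p_in_given_female = Fraction(female_in, female_in + female_out)  # P(C|E)
p_out_given_female = 1 - p_in_given_female                       # P(C'|E)
p_in_given_male = Fraction(male_in, male_in + male_out)          # P(C|E')

# Reading 2: condition on admission (the outcome).
p_female_given_in = Fraction(female_in, female_in + male_in)     # P(E|C)

print("P(C|E)  =", p_in_given_female)
print("P(C'|E) =", p_out_given_female)
print("P(C|E') =", p_in_given_male)
print("P(E|C)  =", p_female_given_in)
```

In this table $P(C|E)=7/10$ and $P(C'|E)=3/10$, yet $P(C|E')=1/5$ and $P(E|C)=7/9$, so knowing the 0.7 alone pins down neither the male admission probability nor the share of females among those admitted.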

As for what the odds ratio is used for,

  • as you imply, it measures relative odds in a single number that may be less than, equal to, or greater than 1, and so summarises the data well
  • it can be calculated simply; for example, there's a simple formula for a $2\times2$ table of exposure counts versus outcome counts
  • it is used in calculations when analysing matched pairs in observational studies
  • when the outcome (such as the incidence of lung cancer) is rare, it approximates the relative risk $\frac{P(C|E)}{P(C|E')}$ and may be interpreted as such

There are probably other uses I haven't encountered yet.
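The rare-outcome point in the list above can be illustrated with a short Python sketch. The rare-outcome probabilities (0.002 and 0.001) are invented for illustration; the common-outcome pair reuses the 0.7 and 0.3 from the question, and the function names are hypothetical.

```python
def odds(p):
    """Odds in favour of an event with probability p."""
    return p / (1 - p)

def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio of the outcome, exposed versus unexposed."""
    return odds(p_exposed) / odds(p_unexposed)

def relative_risk(p_exposed, p_unexposed):
    """Ratio of outcome probabilities, exposed versus unexposed."""
    return p_exposed / p_unexposed

# Rare outcome (hypothetical probabilities): OR and RR nearly coincide.
rare_or = odds_ratio(0.002, 0.001)
rare_rr = relative_risk(0.002, 0.001)

# Common outcome (the question's numbers): OR and RR diverge.
common_or = odds_ratio(0.7, 0.3)
common_rr = relative_risk(0.7, 0.3)

print("rare:   OR =", round(rare_or, 3), " RR =", round(rare_rr, 3))
print("common: OR =", round(common_or, 3), " RR =", round(common_rr, 3))
```

When both probabilities are small, $1-P(C|E)$ and $1-P(C|E')$ are both close to 1, so the odds are close to the probabilities and the two ratios nearly agree. With probabilities like 0.7 and 0.3 that no longer holds, which is the gap between 5.44 and 2.33 in the question.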