If you "know" that the coin is fair, then we still expect the long-run proportion of heads to tend to $0.5$. This is not to say that we should expect more than $50\%$ of the next flips to be tails, but rather that the initial $500$ flips become irrelevant as $n\rightarrow\infty$. A streak of $500$ heads may seem like a lot (and practically speaking it is), but
- if $250$ of the next $500$ flips are heads then the sample proportion becomes
$$\hat p = \frac{500 + 250}{1000} = 0.75.$$
- if $250$ of the $500$ flips after that are also heads then...
$$\hat p = \frac{500+250+250}{1500} \approx 0.67$$
- if $100000$ of the next $200000$ flips are heads then...
$$\hat p = \cdots \approx 0.501.$$
This is the Law of Large Numbers.
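This convergence is easy to see in a quick simulation. Here is a minimal R sketch (the seed and the $10^6$ follow-up flips are arbitrary choices of mine): prepend a streak of $500$ heads to a long run of fair flips and track the running proportion of heads.

```r
# Start with a streak of 500 heads, then append a long run of fair flips
set.seed(1)
flips <- c(rep(1, 500), rbinom(10^6, size = 1, prob = 0.5))

# Running proportion of heads after each flip
prop <- cumsum(flips) / seq_along(flips)

prop[500]             # exactly 1 -- right after the streak
prop[length(flips)]   # close to 0.5 -- the streak has become irrelevant
```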
On the other hand...
if I were to flip a coin in real life and see $500$ heads in a row, I would start to seriously doubt that the coin is actually fair. (As an interesting side note, it is hard, perhaps impossible, to actually bias a coin flip in real life; the only realistic values of $p$ are $0$, $0.5$ and $1$. We will ignore this for the sake of an answer.)
To account for this possibility, we could use a Bayesian procedure from the outset. Rather than assume $p=1/2$, suppose we specify the prior distribution
$$p \sim \text{Beta}(\alpha, \alpha).$$
This is a symmetric distribution, which encodes my a priori belief that the coin is fair, i.e. $E(p) = \frac{1}{2}$. How strongly I believe in this notion is specified through the choice of $\alpha$, since $Var(p) = \frac{1}{8(\alpha+0.5)}$.
- $\alpha = 1$ corresponds to a uniform prior over $(0,1)$.
- $\alpha = 0.5$ is the Jeffreys prior - another popular non-informative choice.
- Choosing a large value of $\alpha$ gives more credence to the belief that $p=1/2$. In fact, setting $\alpha = \infty$ implies that $Pr(p=1/2) = 1$.
Applying Bayes' rule directly, the posterior distribution for $p$ is
$$p|y \sim \text{Beta}(\alpha+y, \alpha+n-y)$$
where $y = \text{number of heads}$ and $n = \text{number of flips}$. For instance, if you choose $\alpha = 1$ and observe $n=y=500$, the posterior distribution becomes $\text{Beta}(501, 1)$ and
$$E(p|y) = \frac{\alpha + y}{2\alpha + n} = \frac{501}{502} \approx 0.998$$
indicating that I should bet on heads for the next flip (since it is highly improbable that the coin is fair).
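As a quick numerical check, the posterior-mean formula above can be evaluated for a few choices of $\alpha$ after $500$ straight heads (a sketch; the helper function name here is mine, not from any package):

```r
# Posterior mean under a Beta(a, a) prior after y heads in n flips:
# p | y ~ Beta(a + y, a + n - y), so E(p | y) = (a + y) / (2a + n)
posterior_mean <- function(a, y, n) (a + y) / (2 * a + n)

y <- 500; n <- 500                 # 500 heads in a row
posterior_mean(0.5, y, n)          # Jeffreys prior: ~0.999
posterior_mean(1,   y, n)          # uniform prior: 501/502 ~ 0.998
posterior_mean(100, y, n)          # strong prior belief in fairness: ~0.857
```

Even a fairly strong prior belief in fairness ($\alpha = 100$) is overwhelmed by $500$ consecutive heads.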
This updating process can be applied after each flip, using the posterior distribution after $n$ flips as the prior for flip $n+1$. If it turns out that the $500$ heads was just a (astronomically) improbable event and the coin really is fair, the posterior distribution will eventually capture this (using a similar argument to the previous section).
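The flip-by-flip scheme can be sketched in a few lines of R (the flip sequence below is hypothetical): updating the Beta parameters one observation at a time lands on exactly the same posterior as a single batch update.

```r
# Flip-by-flip updating: the posterior after flip i is the prior for flip i+1
a <- 1; b <- 1                 # Beta(1, 1) prior
flips <- c(1, 1, 0, 1, 0, 0)   # hypothetical data (1 = heads, 0 = tails)
for (x in flips) {
  a <- a + x                   # a head increments the first parameter
  b <- b + (1 - x)             # a tail increments the second
}

# Identical to the batch update Beta(1 + y, 1 + n - y)
c(a, b)                        # 4 4
identical(c(a, b), c(1 + sum(flips), 1 + length(flips) - sum(flips)))
```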
Intuition for choosing $\alpha$: To help understand the role of $\alpha$ in the Bayesian procedure, we can use the following argument. The mean of the posterior distribution is equivalent to the maximum likelihood estimate of $p$ if we augment the data with $2\alpha$ "hypothetical flips", of which $\alpha$ are heads and $\alpha$ are tails. Choosing $\alpha=1$ (as we did above) means the augmented data set is $501$ heads and $1$ tail. Choosing a larger value of $\alpha$ means that more evidence is required to change our beliefs. Still, for any finite choice of $\alpha$, these "hypothetical flips" eventually become irrelevant as $n\rightarrow\infty$.
By default, `str_count` does not count overlapping occurrences of the specified pattern. The substring `1111` can overlap with itself substantially, whereas the substring `1110` cannot overlap with itself. Consequently, your calculation for the first substring is biased: you are substantially undercounting the number of times this pattern actually occurs in your simulation. Try this alternative method instead:
```r
# Flip the coin many times
set.seed(1)
n <- 10^8
FLIPS <- sample(c(0, 1), size = n, replace = TRUE)

# Count the proportion of occurrences of 1-1-1-1
PATTERN.1111 <- FLIPS[1:(n-3)] * FLIPS[2:(n-2)] * FLIPS[3:(n-1)] * FLIPS[4:n]
sum(PATTERN.1111) / n
[1] 0.06246614

# Count the proportion of occurrences of 1-1-1-0
PATTERN.1110 <- FLIPS[1:(n-3)] * FLIPS[2:(n-2)] * FLIPS[3:(n-1)] * (1 - FLIPS[4:n])
sum(PATTERN.1110) / n
[1] 0.0624983
```
With this alternative simulation (which counts overlapping occurrences of the patterns) you get proportions for the two outcomes that are roughly the same. If the coin flips are in fact independent and "fair", then each player has the same probability of winning the wager. Mathematically, the true probability of any particular run of four outcomes is $1/2^4 = 0.0625$, so that is what the above simulations are effectively estimating; the remaining small disparity is due to random simulation error.
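To see the overlap issue in isolation, here is a small base-R check (no `stringr` required, so treat it as an illustrative sketch): an ordinary regex match consumes the characters it matches, while a zero-width lookahead lets matches overlap.

```r
s <- "111110"
# Non-overlapping matches of "1111": the first match consumes positions 1-4,
# so the engine resumes at position 5 and finds nothing further
length(gregexpr("1111", s)[[1]])
# Overlapping matches via a zero-width lookahead: positions 1 and 2 both match
length(gregexpr("(?=1111)", s, perl = TRUE)[[1]])
```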
It's called the Gambler's fallacy.