Estimating hot/cold lottery ball totals

lotteries probability

A large percentage of lottery players rely on past ball ‘performance’ when choosing their numbers. If a ball has appeared more often than average, half of the players believe it is somehow on a ‘run’ and more likely to be drawn again, while the other half believe that balls that haven’t appeared for a long time are ‘overdue’ and thus also more likely to be drawn. While we can of course dismiss such lottery prediction methods, they do raise an interesting question about the math behind the process.

Assume we have a weekly game where 6 main numbers and 2 bonus numbers are drawn from a field of 45. In ten years of play, the total number of drawn balls would be 52 x 8 x 10 = 4,160.

The chance that a given number will appear in any draw is clearly 8/45, while the chance it won’t appear is 37/45. My question is: how can we estimate the expected ‘weeks since drawn’ intervals over a given timespan?

If I wanted to estimate how many of our 4,160 total balls would be drawn and then skip 21 weeks before reappearing, the logical approach seems to be:

4,160 x (37/45)^{21} x (8/45) ≈ 12

Running through all expected ‘skip’ intervals using this method, I find that one ball of our 4,160 can be expected to go missing for about 37 weeks without being redrawn.
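The estimate above can be tabulated with a short script. This is only a sketch of the calculation described in the question: the function names `expected_count` and `longest_skip_rounding_to_one` are illustrative, and reading "one ball" as "the longest skip whose expected count still rounds to one" is an assumption about how the 37-week figure was obtained.

```python
# Sketch of the skip-interval estimate; all names here are illustrative.
TOTAL_BALLS = 52 * 8 * 10   # 4,160 balls drawn over ten years
P_HIT = 8 / 45              # chance a given number appears in a draw
P_MISS = 37 / 45            # chance it does not appear


def expected_count(skip_weeks, total_balls=TOTAL_BALLS):
    """Expected number of balls that miss `skip_weeks` draws, then reappear."""
    return total_balls * P_MISS ** skip_weeks * P_HIT


def longest_skip_rounding_to_one(total_balls=TOTAL_BALLS):
    """Largest skip interval whose expected count still rounds to at least 1."""
    skip = 0
    # The expected count shrinks with every extra missed week, so stop at
    # the first interval whose count would round down to zero.
    while round(expected_count(skip + 1, total_balls)) >= 1:
        skip += 1
    return skip


for weeks in (0, 5, 10, 21, 30):
    print(f"skip {weeks:2d} weeks: {expected_count(weeks):7.2f} balls")

print("longest skip with a rounded count of 1:", longest_skip_rounding_to_one())
```

For a 21-week skip this gives roughly 12 balls, and 37 weeks comes out as the longest interval whose expected count still rounds to one, in line with the figures quoted above.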

The results also show similarities to actual observed data, though some variation in figures can be expected through pure chance. However, this seems to be a rather simplistic method which was easier than I expected. Is my assumption correct or have I overlooked some other important factor here?

Thanks!

Best Answer

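Except for a small detail, you are correct. Balls drawn in the last $22$ weeks cannot skip $21$ draws and then reappear within the ten-year window, so you should do the calculation for $4160-8\times 22=3984$ balls, which gives an expected count of about $11.6$ rather than $12$.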

To confirm your reasoning, let $X_i$ be a random variable whose value is $1$ if the number on ball $i$ reappears after a lapse of exactly $21$ weeks, and $0$ otherwise. You are looking for the expected value of $$X=\sum_{i=1}^{3984}X_i$$ By linearity of expectation, $$E(X)=E\left(\sum_{i=1}^{3984}X_i\right)=\sum_{i=1}^{3984}E(X_i)$$ and $E(X_i)$ is just the probability you calculated, $(37/45)^{21}\cdot(8/45)$.
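As a rough sanity check, the expected value can be compared with a small Monte Carlo simulation. This is a sketch under the question's assumptions (520 weekly draws, 8 distinct numbers from 45, a "skip" of 21 meaning 21 missed draws followed by a reappearance in the next draw); the names `count_skips` and `simulated_mean`, the trial count, and the seed are arbitrary choices made for illustration.

```python
import random

WEEKS = 52 * 10        # ten years of weekly draws
FIELD = 45             # numbers 1..45
BALLS_PER_DRAW = 8     # 6 main + 2 bonus
SKIP = 21              # missed draws before the number reappears


def count_skips(rng, skip=SKIP):
    """Count reappearances after exactly `skip` missed draws in one simulated decade."""
    last_seen = {}     # number -> index of the week it was last drawn
    hits = 0
    for week in range(WEEKS):
        for number in rng.sample(range(1, FIELD + 1), BALLS_PER_DRAW):
            if number in last_seen and week - last_seen[number] - 1 == skip:
                hits += 1
            last_seen[number] = week
    return hits


def simulated_mean(trials=300, seed=0):
    """Average the skip count over several independent simulated decades."""
    rng = random.Random(seed)
    return sum(count_skips(rng) for _ in range(trials)) / trials


# Balls drawn in the final SKIP + 1 weeks cannot complete the pattern.
eligible = WEEKS * BALLS_PER_DRAW - BALLS_PER_DRAW * (SKIP + 1)   # 3,984
exact = eligible * (37 / 45) ** SKIP * (8 / 45)
print(f"linearity of expectation: {exact:.2f}")
print(f"simulated mean:           {simulated_mean():.2f}")
```

The simulated mean should land close to the value from linearity of expectation, with some run-to-run scatter that shrinks as `trials` grows.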