[Math] law of large numbers, convergence of random variables, probability distribution, and a lottery

probability

The Gambler's Fallacy tells us we cannot predict the next coin-flip result from history: several heads in a row do not make the next flip any more likely to be tails. However, the law of large numbers tells us that, over many trials, the numbers of heads and tails will become equal. So a current preponderance of heads over tails will later be offset by more tails, and the results will even out over time. Consider a fair, random 54-ball lottery: after a large number of draws, say 54,000, each ball would have been picked 1,000 times. Say that at some point in the trials the number 17 has been picked only half as many times as the other numbers. Then, in the remaining trials up to whatever count qualifies as a "large number", it must occur more often than the other numbers so that all numbers converge on being picked equally.

In this example, what is a "large number" of trials where the law of large numbers would tell us all numbers would have been picked equally? How would you generate a curve showing the convergence?
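For reference, here is a rough sketch of the kind of convergence curve I have in mind (Python; the variable names are just placeholders). It tracks the running relative frequency of one ball, which should settle toward $\frac{1}{54}$:

```python
import random

# Minimal simulation sketch: track the running relative frequency of
# ball 17 over many draws with replacement. Plotting `freqs` against
# draw number with any plotting library gives the convergence curve.
random.seed(0)  # reproducible run

n = 54_000
hits = 0
freqs = []
for i in range(1, n + 1):
    if random.randint(1, 54) == 17:  # one fair draw from 54 balls
        hits += 1
    freqs.append(hits / i)  # running frequency after i draws

print(freqs[-1], 1 / 54)  # final frequency vs. the theoretical 1/54
```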

Thanks!

Best Answer

Pick a particular ball out of your 54-ball lottery, say ball 17. Fix some small number $m$; you want to perform enough trials that the fraction of draws showing ball 17 differs from $\frac{1}{54}$ by no more than $m$. Since there can be freak runs where this is violated, no matter how many trillions of trials we perform, we can only ensure that the condition is probably not violated. That is to say, the probability that it is violated is less than some tiny number $\epsilon$.

The weak law of large numbers tells us this: there exists a large number $N$ such that if the number of independent trials $n$ (in this case, the number of times a ball is drawn with replacement) is greater than $N$, the probability that the fraction of trials in which ball 17 is drawn differs from the expected fraction (i.e., $\frac{1}{54}$) by more than $m$ is less than $\epsilon$.

If we decide to go ahead and perform 54,000 trials, check the counts halfway, and notice that ball 17 isn't popping up nearly as many times as one would expect in 27,000 trials (i.e., fell far short of 500), then that's just that. In the next 27,000 trials, we again expect ball 17 to come up 500 times. What happens in the future is independent of what we saw in the past, so there is no "making up for lost ground" effect. Notice that the WLLN tells us that for a given $m$ and $\epsilon$, there exists an $N$ such that our prediction that the observed fraction of ball-17 draws will be within $m$ of $\frac{1}{54}$ is right at least $100(1-\epsilon)$% of the time. So to ask whether 54,000 trials is large enough, first you have to have an $m$ and an $\epsilon$ in mind. Then you can use the fact that the number of draws of ball 17 follows a binomial distribution to figure out what your $N$ should be.
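To make that last step concrete, here is a minimal sketch of turning a chosen $m$ and $\epsilon$ into an $N$. It uses the normal approximation to the binomial (rather than exact binomial tail sums), and the function name `trials_needed` is just an illustrative placeholder:

```python
import math
from statistics import NormalDist

def trials_needed(p: float, m: float, eps: float) -> int:
    # Smallest n (approximately) such that P(|X/n - p| > m) <= eps,
    # where X ~ Binomial(n, p), via the normal approximation:
    # X/n is roughly Normal(p, p*(1-p)/n) for large n.
    z = NormalDist().inv_cdf(1 - eps / 2)  # two-sided tail of eps
    # Require m >= z * sqrt(p*(1-p)/n), i.e. n >= p*(1-p)*(z/m)^2.
    return math.ceil(p * (1 - p) * (z / m) ** 2)

p = 1 / 54
# e.g. observed fraction within 0.005 of 1/54, with probability 0.95
n = trials_needed(p, m=0.005, eps=0.05)
print(n)
```

Tightening $m$ or shrinking $\epsilon$ drives $N$ up, which is exactly the trade-off the WLLN describes.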