The correct physical interpretation of the binomial distribution and Bernoulli trials in this example

We know that every random variable has a probability distribution. Examples include the number of heads in many coin tosses, or the number of ones in many rolls of a die.

Suppose we use the binomial distribution to model such a random variable. Let us take an example. We toss a single coin $100$ times and count the heads. Plugging this into the binomial distribution, I get a graph with the number of heads on one axis and the probability on the other. This graph peaks at $50$ heads, with a probability of roughly $0.08$.

However, there is also a physical interpretation. I toss a coin a hundred times and note the number of heads. Then I repeat this experiment thousands and thousands of times, and note how frequently each number of heads occurs. This relative frequency represents the probability, that is, the height of the graph of the binomial distribution. As one would expect, $50$ heads appears roughly $8$ percent of the time. This can be shown easily with a computer simulation, as 3blue1brown does with random number generators.
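The frequency interpretation described above can be sketched in a few lines of Python. This is only an illustration: the function name `simulate_head_counts`, the number of repetitions, and the seed are arbitrary choices, not anything from 3blue1brown's simulation.

```python
import random
from math import comb

def simulate_head_counts(num_experiments, tosses=100, seed=0):
    """Repeat the '100 tosses' experiment many times and return the
    relative frequency of each possible number of heads (0..tosses)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = [0] * (tosses + 1)
    for _ in range(num_experiments):
        # One experiment: toss a fair coin `tosses` times and count heads.
        heads = sum(rng.random() < 0.5 for _ in range(tosses))
        counts[heads] += 1
    return [c / num_experiments for c in counts]

freqs = simulate_head_counts(20000)

# Exact binomial probability of 50 heads in 100 fair tosses.
exact_50 = comb(100, 50) / 2**100

print(f"simulated frequency of 50 heads: {freqs[50]:.4f}")
print(f"exact binomial probability:      {exact_50:.4f}")
```

With enough repetitions, the simulated frequency should settle close to the exact value of roughly $0.08$.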

Now we have a mathematical as well as a physical meaning of what the binomial distribution represents.

Now imagine the following scenario.

There is a bag with $100$ balls inside it. Some of them are blue, some red, and some other colours. We don't know how many there are of each colour. The colours are fixed, of course; we just don't know the counts.

We pick a single ball at random and note its colour. This is repeated many, many times, and blue balls turn up $20$ percent of the time. Since it is impossible to repeat the experiment infinitely many times, we can never know the exact percentage of draws that give a blue ball. Since we know the total number of balls, but not the exact probability of getting a blue ball, we can never know the exact number of blue balls inside our bag.

Hence the number of blue balls inside our bag is a random variable, and thus it must have a distribution. In our sampling, we found that the probability of getting a blue ball was $0.2$. This is not the true probability of getting a blue ball from that bag; it is our best estimate of the true probability.

Hence we can use the binomial distribution to find the probability of different numbers of blue balls being inside the bag. The mean of this distribution would be $20$ balls, and as the total number of trials tends to infinity, the actual number of blue balls would tend towards this mean. However, in our trials, the probability of getting $20$ blue balls would be about $9.93$ percent.
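For reference, the $9.93$ percent figure matches a binomial distribution with $100$ trials and success probability $0.2$. The sketch below just evaluates that formula; the helper name `binomial_pmf` is made up for illustration, and reading the figure this way is an assumption about how it was obtained.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.2  # assumed parameters: 100 balls, estimated P(blue) = 0.2
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean of the distribution, computed directly from the probabilities.
mean = sum(k * w for k, w in enumerate(pmf))

print(f"P(X = 20) = {pmf[20]:.4f}")
print(f"mean      = {mean:.2f}")
```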

Now mathematically this is all well and good. However, physically it doesn't seem to make sense.

Let us see how we would interpret this binomial distribution in the physical sense, just as we did for the coin tosses. In the case of the coin tosses, we did the experiment many, many times, noted the frequency of each number of heads, and used this to create a distribution.

Suppose we do the same thing here. We empty the bag, count the number of blue balls, and repeat this experiment many, many times. According to the binomial distribution, in about $9.93$ percent of the cases I should get $20$ blue balls out of the bag. In other cases, I'd get other results with different probabilities. However, if I'm doing the experiment with the same bag, this creates a problem: even though I don't know the number of blue balls in the bag, I do know that it is a constant. The same bag cannot give two different numbers of blue balls in two consecutive experiments.

So the physical interpretation of the binomial distribution seems to fail here.

One solution I can think of is this: instead of checking the same bag again and again to get a frequency, what if I check thousands of different bags? Each of them would contain some number of blue balls from $0$ to $100$. The same bag never has to give different numbers of blue balls in consecutive trials, because we are not checking the same bag; we are checking different bags. Since we don't know the exact number of blue balls in our bag, we essentially don't know which of all these bags it is.

So the binomial distribution is no longer directly about the number of blue balls in the same bag. It is more about the different bags with different numbers of blue balls in them. In a sense, the number of blue balls is not exactly the random variable in our problem, as we initially guessed. It's actually the bag that is the random variable. Different bags have different numbers of blue balls, and we don't know which bag is the real one. To say that a total of $20$ blue balls appears $9.93$ percent of the time would be equivalent to saying that bags with $20$ blue balls turn up $9.93$ percent of the time. This seems correct, because bags with $20$ blue balls in them would be more likely to give us a $20$ percent chance of picking a random blue ball. Bags with $100$ or $99$ blue balls would be less likely to give us a $20$ percent chance of picking a blue ball.
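One way to make the "many bags" picture concrete is a small simulation. The sketch below rests on assumptions that are not in the question: each bag's number of blue balls is drawn uniformly from $0$ to $100$, each bag is sampled $100$ times with replacement, and only bags that showed exactly $20$ blue draws are kept. The function name `sample_bags` and all default values are illustrative.

```python
import random
from collections import Counter

def sample_bags(num_bags=30000, total=100, draws=100, observed_blue=20, seed=1):
    """Generate many bags, sample each one with replacement, and tally the
    true blue-ball counts of the bags whose samples showed `observed_blue`."""
    rng = random.Random(seed)
    kept = Counter()
    for _ in range(num_bags):
        b = rng.randint(0, total)  # assumed: every count 0..100 equally likely
        p = b / total              # chance of drawing blue from this bag
        blue_seen = sum(rng.random() < p for _ in range(draws))
        if blue_seen == observed_blue:
            kept[b] += 1           # this bag is consistent with what we saw
    return kept

kept = sample_bags()
n_kept = sum(kept.values())
mean_b = sum(b * c for b, c in kept.items()) / n_kept

print(f"bags consistent with the observation: {n_kept}")
print(f"average number of blue balls in them: {mean_b:.2f}")
for b in sorted(kept):
    print(b, kept[b])
```

The tally over the kept bags is a frequency distribution over bags, which is the kind of object the paragraph above describes. Its shape depends on how the bags were generated in the first place, which is why the uniform choice is flagged as an assumption.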

Would this be the correct physical interpretation of the binomial distribution? Instead of the Bernoulli trial being a check of a single bag for its number of blue balls, each Bernoulli trial is a check of one of these different bags. I'm doing all this because a single bag cannot give two different numbers of blue balls in successive Bernoulli trials, even if we don't know the exact number of balls. So the question should be more like: there are several bags with different numbers of blue balls from $0$ to $100$; given that the probability of picking a random blue ball is almost $0.2$, which of these bags is most probable, and so on. Hence, bags with $20$ blue balls would be the mean of this distribution of different bags. We are essentially checking how likely a certain bag is to give us exactly a $20$ percent chance of picking a blue ball at random, since that is the only information we have.

Is this interpretation correct? Mathematically it doesn't make a difference, since the binomial distribution formula describes both physical cases equally well. If the colour of each ball were not constant, and we were checking the same bag, I'd have got exactly the same results. However, the philosophical and physical interpretations are somewhat different. It is like tossing a single coin $100$ times versus tossing $100$ coins once: mathematically the same, physically not so.

Thanks for your time.

Best Answer

Let $n$ be any large number, say $1000$.

Let $b$ denote the number of blue balls in the bag.

Let $f(b)$ denote the probability that exactly $20\%$ of the $n$ trials show a blue ball, when each trial selects a ball with replacement from a bag containing $b$ blue balls.

Let $W$ denote $\displaystyle \sum_{i = 0}^{100} f(i)$.

Then, the expected number of blue balls in the bag is

$$\frac{\sum_{i=0}^{100} \left[i \times f(i)\right]}{W}.\tag1 $$

$W$ in the denominator serves to normalize the sum of the weights (i.e. the probabilities) associated with each possible number of blue balls.

$\displaystyle f(b) = \binom{1000}{200} \times \left[\frac{b}{100}\right]^{200} \times \left[\frac{100 - b}{100}\right]^{800}.$
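As a rough numerical check, expression $(1)$ can be evaluated directly. The sketch below uses $n = 1000$ and $200$ observed blue draws as above; the names `likelihood`, `weights` and `expected_blue` are made up for illustration. Weighting every $b$ by $f(b)$ alone amounts to treating all values of $b$ from $0$ to $100$ as equally plausible beforehand, which is my reading of $(1)$ rather than something the answer states.

```python
from math import comb

def likelihood(b, n=1000, k=200, total=100):
    """f(b): probability of exactly k blue draws in n draws with replacement
    from a bag of `total` balls, b of which are blue."""
    p = b / total
    # For extreme b the powers underflow to 0.0, which is harmless here.
    return comb(n, k) * p**k * (1 - p)**(n - k)

weights = [likelihood(b) for b in range(101)]  # f(0), ..., f(100)
W = sum(weights)                               # normalizing constant

# Expression (1): weighted average of b with weights f(b) / W.
expected_blue = sum(b * w for b, w in enumerate(weights)) / W

print(f"W = {W:.4f}")
print(f"expected number of blue balls = {expected_blue:.3f}")
for b in range(16, 25):
    print(b, f"{weights[b] / W:.4f}")
```

The normalized weights `weights[b] / W` give a distribution over the possible contents of the bag, concentrated near $b = 20$.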