[Math] Sample space for identical objects

discrete mathematics, probability

The sample space for an urn containing $p$ blue balls (identical to one another) and $q$ red balls is written as a multiset rather than a set; or, when it is written as a set, the identical objects are listed as distinct objects, as follows:

$$\begin{aligned}
\text{Set representation: } &\{b_1,b_2,\ldots,b_p,r_1,r_2,\ldots,r_q\}\\
\text{Multiset representation: } &\{\underbrace{b,b,\ldots,b}_{p\text{ times}},\underbrace{r,r,\ldots,r}_{q\text{ times}}\}
\end{aligned}$$

Based on the above sample space, if we are asked to find the probability of selecting a blue ball, we get $\dfrac{p}{p+q}$ using the classical definition of probability. According to the classical definition, the probability of an event $E$ is given by
$$P(E)=\dfrac{n(E)}{n(S)}$$

Now, why is the number of ways of selecting $r$ (here, $r=1$) things out of $n$ identical things taken to be $\binom{n}{r}$ and not $1$, as is the case in combinatorics?


Edit 1:-

To further clarify my point consider these two contrasting cases below:-

Case 1:-

Consider an urn containing $30$ distinct balls, of which $10$ are distinct blue balls and $20$ are distinct white balls. Let's say the balls of the same color are told apart by numbers marked on them, which can't be felt by the person drawing the ball. Now, in a random draw, if we are asked for the probability of picking a blue ball, then according to the classical definition of probability it will be $\dfrac{10}{30}=\dfrac{1}{3}$.

We arrived at the above probability because we know that
$$n(E:\text{drawing a blue ball})=\binom{10}{1}=10\\
n(S:\text{drawing a ball from the urn})=\binom{30}{1}=30$$

For comparison with the next case, here are the sets $E$ (event space) and $S$ (sample space):

$E=\{B_1,B_2,B_3,\ldots,B_{10}\}\\
S=\{B_1,B_2,B_3,\ldots,B_{10},W_1,W_2,W_3,\ldots, W_{20}\}$
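The counting in Case 1 can be checked with a short sketch. The labels `B1`..`B10` and `W1`..`W20` are hypothetical stand-ins for the numbered balls, and the sketch assumes every individual ball is equally likely to be drawn.

```python
from fractions import Fraction
from math import comb

# Hypothetical labels mirroring B_1..B_10 and W_1..W_20 from Case 1.
blue = [f"B{i}" for i in range(1, 11)]
white = [f"W{i}" for i in range(1, 21)]

# S: every ball is its own outcome, assumed equally likely.
sample_space = blue + white
# E: the outcomes in which the drawn ball is blue.
event = [ball for ball in sample_space if ball.startswith("B")]

# Classical definition: P(E) = n(E) / n(S).
p_blue = Fraction(len(event), len(sample_space))
print(p_blue)

# The set sizes agree with the binomial counts used in the question.
assert len(event) == comb(10, 1)
assert len(sample_space) == comb(30, 1)
```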

Case 2:-

The setting for this case is the same as in the previous case; the only difference is that balls of the same color are indistinguishable. So, if we are asked for the probability of drawing a blue ball, it would come out as $\dfrac{1}{2}$.

My thinking behind this probability is that the sample space for this random draw would be $$S=\{W,B\}$$ as all the white balls are identical to each other and so are all the blue balls. Similarly, the event space would be $$E=\{B\}$$ as all the blue balls are identical.

But by this logic the probability of drawing a blue ball would still be $\dfrac{1}{2}$ even when the blue balls vastly outnumber the white balls, which is not the result one would expect.

So, my question again boils down to the same one as in the pre-edit part of the post, i.e.

Why do we use the multiset representation for identical objects, or, when writing the sample space in set notation, why do we treat the identical objects as distinct objects?

Best Answer

$$P(E)=\dfrac{n(E)}{n(S)}$$ if and only if all the simple outcomes in the sample space have the same probability.

The balls may be indistinguishable, but there are more white than blue balls so there's more chance you'll select white than blue. In other words, the two simple outcomes in the sample space $S=\{W,B\}$ have different probabilities, violating the above condition.
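As a rough illustration of this point, the sketch below collapses the urn to the two-outcome sample space and attaches to each outcome the combined weight of the balls it stands for. The counts (10 blue, 20 white) are taken from the question; the variable names are illustrative only.

```python
from fractions import Fraction

# Ball counts from the question: 10 blue, 20 white.
counts = {"B": 10, "W": 20}
total = sum(counts.values())

# Collapsing identical balls gives S = {W, B}, but each outcome inherits
# the combined probability of the physical balls it represents.
prob = {colour: Fraction(n, total) for colour, n in counts.items()}

# What n(E)/n(S) would give if the two outcomes were (wrongly) assumed
# to be equally likely.
naive = Fraction(1, len(counts))

print(prob["B"], prob["W"], naive)
```

The two outcomes carry different weights, so counting outcomes on $\{W,B\}$ is not a valid way to compute the probability, whereas weighting them by ball count recovers the value from Case 1.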

Another classic example is flipping two coins. One sample space records the number of heads: $$S_2=\{0,1,2\}$$ However, you can't use it with the formula above to find the probability of $0$ heads: you'd get $\frac 13$ instead of $\frac 14$. The sample space $$S_3=\{HH,HT,TH,TT\}$$ has equiprobable sample points, so it can be used.
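The coin example can be verified by enumeration. This is a minimal sketch assuming two fair, independent coins; it lists the four equiprobable outcomes and then groups them by number of heads.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Equiprobable sample space for two fair coins: HH, HT, TH, TT.
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# Group the equiprobable outcomes by how many heads each contains.
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each head count, derived from the equiprobable space.
p_heads = {k: Fraction(v, len(outcomes)) for k, v in heads_count.items()}

for k in sorted(p_heads):
    print(k, p_heads[k])
```

The head counts $0$, $1$, $2$ do not come out equally likely, which is why the classical formula cannot be applied directly to $\{0,1,2\}$.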
