Two green balls, two red balls, and two yellow balls are placed in two green boxes, two red boxes, and two yellow boxes; one ball in each box.

inclusion-exclusionprobability

Two green balls, $2$ red balls, and $2$ yellow balls are placed in $2$ green boxes, $2$ red boxes, and $2$ yellow boxes; one ball in each box. Find the probability that no ball is in a box of the same color.

I tried to solve it with inclusion and exclusion method but don”t work.

Assume $a_1$ = probability of getting at least 1 blue ball, $a_2$ = getting at least 1 red ball, and $a_3$ = getting at least yellow ball

So $1 -P(a_1 \cup a_2 \cup a_3)$ would be the probably that no ball in a box of the same color.

$$P (a_1 \cup a_2 \cup a3)= p(a_1)+p(a_2)+p(a_3) – p (a_1 \cap a_2)-p(a_1 \cap a_3) -p(a_2 \cap a_3) + p(a1 \cap a2 \cap a3)$$

I get it should be $9 \frac{3!}{6!}$ – (the part I don’t know how to count) + $\frac{6}{6!}$

The conclusion is that it’s $1/9$ and from the classmate but I don’t understand his processes

Best Answer

Addendum-2 added to the end of this answer, to discuss one of the Original Poster's comments, following the posting, where they discuss simulation.


The numbers are small enough that it is much easier to use the direct approach rather than Inclusion-Exclusion. See also the Addendum, at the end of this answer, where I discuss Inclusion-Exclusion.

Assume, for convenience, that both the balls (of the same color) and boxes (of the same color) are distinguishable from each other.

So you are (in effect) associating each of the elements r-1, r-2, g-1, g-2, y-1, y-2, with the elements R-1, R-2, G-1, G-2, Y-1, Y-2, where the balls (r-1, ..., y-2) are first jumbled.

Then, the probability can be expressed as

$$\frac{N}{D} ~: D = 6!.$$

That is, there are $~6! = 720~$ different ways of jumbling the balls before assigning them to the boxes.

So, the problem reduces to computing $~N.~$

Note
Since this is a probability problem, you are allowed to assume that the balls and boxes are both distinguishable. That is, assigning labels to each of the balls and each of the boxes does not affect the probability that no ball went into a box of the same color.


Start with the two red balls. So either:

  • Both red balls go to the green boxes, G-1, G-2.
    This can be done in $~2~$ ways, re either r-1 goes to G-1 or it goes to G-2.

    If this happens, then you are then forced to send the two green balls to the two yellow boxes.

    This can also be done in $~2~$ ways.

    Then, the two yellow balls go into the two red boxes, which can also be done in $~2~$ ways.

    So, the enumeration here is $~2^3 = 8.$

  • Both red balls go to the yellow boxes.
    The problem is perfectly symmetrical between red, yellow and green.

    Therefore, the enumeration here is also $~2^3 = 8.$

  • One red ball goes to one of the green boxes, and one red ball goes to the yellow boxes.

    There are $~4~$ choices for which box receives r-1, and then $~2~$ choices for which box subsequently receives r-2.
    So, the enumeration here is $~4 \times 2 = 8.$

    Continuing to explore this specific case, reserve the factor of $~(8)~$ and then assume, without loss of generality that r-1 $~\longrightarrow~$ G-1, and r-2 $~\longrightarrow~$ Y-1.

    So here, the balls left are g-1,g-2,y-1,y-2 and the boxes left are G-2, Y-2, R-1, R-2.

    To complete the this case, one of the greens must go to Y-2, and one of the yellows must go to G-2. The enumeration here is $~2 \times 2 = 4.$

    So, for this case, reserve the second factor of $~(4),~$ which will be combined with the first factor of $~(8),~$ and then assume, without loss of generality that
    g-1 $~\longrightarrow~$ Y-2 and y-1 $~\longrightarrow~$ G-2.

    So, balls remaining are g-2,y-2 and boxes remaining are R-1,R_2. The distribution can then be completed in $~2~$ ways.

    So, the overall computation for this case is $~8 \times 4 \times 2 = 64.$


Therefore, $~N = 8 + 8 + 64 = 80.~$

Therefore, the probability is

$$\frac{N}{D} = \frac{80}{6!} = \frac{1}{9}.$$


Addendum

Normally, even for problems more complicated than this, Inclusion-Exclusion is the preferred approach. See this article for an introduction to Inclusion-Exclusion. Then, see this answer for an explanation of and justification for the Inclusion-Exclusion formula.

Following the model in the second link, in this case however, I reject the Inclusion-Exclusion approach because, either:

  • You must (for example) let $~S_1~$ denote the subset of $~S~$ where r-1 goes into a red box.
    Then, you have the $~6~$ subsets $~S_1, \cdots, S_6~$ and so you have Inclusion-Exclusion analysis that runs $~6~$ layers deep.

  • Alternatively, you must (for example) let $~S_1~$ denote the subset of $~S~$ where at least one of the red balls goes into any one of the Red boxes. Although you then only have the three subsets $~S_1, S_2, S_3,~$ the analysis of each subset (and the analysis of their respective intersections) would be very convoluted.


Addendum-2

In one of the comments following the posted question, the OP (i.e. original poster) mentions that he arrived at the answer of $~\frac{1}{9}~$ via simulation.

This is actually ambiguous, and can refer to one of two things:

One interpretation is that when you regard the balls and the boxes as each distinguishable, there are $~6!~$ ways of jumbling the balls before sending them to the boxes.

So, you could write a computer program to loop through each of these $~720~$ distinct distributions of the balls, and count how many of the distributions have no ball being sent to a box of the same color as the ball.

You should get $~80~$ of the $~720~$ equally likely distributions.

The alternative interpretation of simulation is that you ran (for example) $~10000~$ experiments. In each experiment, you used a random number generator to determine, in turn, which box would receive each of r-1, r-2, g-1, g-2, y-1, y-2.

Assuming that you did that, you would expect that of the $~10000~$ experiments, approximately $~1111~$ of them would result in no ball being sent to a box of the same color. This result, coupled with the previously made suggestion that the correct answer is $~\dfrac{1}{9}~$ would also reinforce this suggested answer.

In effect, you have two different ways of sanity-checking analysis through computer software.