[Math] Why does “order matter” when calculating the probability that none of seven people was born in winter?

combinatorics probability

SUMMARY

Here's a problem from Harvard's Stats 110 course.

For a group of 7 people, find the probability that all 4 seasons
(winter, spring, summer, fall) occur at least once each among their
birthdays, assuming that all seasons are equally likely.

The problem and its solution can be found here.

The hard part for me comes when calculating the probability that no people out of seven were born in winter. I don't understand why "order matters" here. (I understand the inclusion-exclusion part of the main problem, just not the "order matters" part.)

WHAT I TRIED

As one part of this problem, I need to calculate the probability that none of the seven people has a birthday in the winter. I tried to calculate this by considering the people as indistinguishable and lumping them into 4 different categories: birthdays in winter, spring, fall, or summer. Thinking like this, I used the "stars and bars" formula to calculate the probability that none of the birthdays fell in the winter category. This can be calculated as

$$P(A) = \dbinom{7 + 3 - 1}{3} / \dbinom{7 + 4 - 1}{4}$$

WHAT HARVARD DID

Harvard says this is wrong and that since order matters, the calculation is (the much easier)

$$P(A) = (3/4)^7$$

It seems like it could go either way, and I can actually model either in R. So what am I doing wrong?

Best Answer

The "stars and bars" formula counts the number of ways to arrange a list of stars and bars, that is, the number of ways to split indistinguishable people among the seasons. It gives the wrong answer for your problem because these arrangements are not equally probable, so you can't simply divide the number of "favorable" arrangements by the total number of possible arrangements.
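As a rough check of this point, the sketch below enumerates every ordered assignment of seasons to the seven people and compares the resulting probability of "no winter" with a ratio of stars-and-bars counts. It assumes the textbook count $\binom{n+k-1}{k-1}$ for $n$ indistinguishable people and $k$ seasons; the names `stars_and_bars`, `p_ordered`, and `ratio_unordered` are illustrative only.

```python
from fractions import Fraction
from itertools import product
from math import comb

SEASONS = ("winter", "spring", "summer", "fall")
N_PEOPLE = 7

# Ordered outcomes: one season per person. With equally likely, independent
# seasons, each of the 4^7 sequences has the same probability.
outcomes = list(product(SEASONS, repeat=N_PEOPLE))
no_winter = sum(1 for outcome in outcomes if "winter" not in outcome)
p_ordered = Fraction(no_winter, len(outcomes))


def stars_and_bars(n, k):
    """Number of ways to split n indistinguishable items among k bins (assumed textbook form)."""
    return comb(n + k - 1, k - 1)


# Ratio of arrangement counts. This is NOT a probability here, because the
# arrangements are not equally likely.
ratio_unordered = Fraction(stars_and_bars(N_PEOPLE, 3), stars_and_bars(N_PEOPLE, 4))

print("ordered outcomes:", p_ordered, float(p_ordered))
print("stars-and-bars ratio:", ratio_unordered, float(ratio_unordered))
```

The enumeration reproduces $(3/4)^7 = 2187/16384 \approx 0.133$, while the ratio of counts comes out to $36/120 = 0.3$, so treating the arrangements as equally likely overstates the probability.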

For example, when assigning birth seasons (call them A, B, C, D) to four people, the arrangement ****||| is less likely than the arrangement *|*|*|* . Why? Think of lining up these four people in a row and having each person pick a season independently and uniformly at random, so that each of the $4^4 = 256$ ordered sequences has probability $1/256$. The first arrangement of stars and bars corresponds to the single sequence AAAA, while the second arrangement corresponds to ABCD, ACBD, ADBC, ADCB, and many more, for a total of $4! = 24$ sequences. This is the sense in which order matters: the $4!$ sequences are distinct, and together they have probability $24/256$, not the $1/256$ of the single sequence AAAA.
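The four-person example can be checked by brute force. The sketch below groups all ordered sequences by their season counts (the stars-and-bars arrangement each one maps to) and reports how many sequences land in each group. The helper `occupancy` and the variable names are illustrative, not part of the original problem.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

SEASONS = "ABCD"

# All ordered sequences of seasons for four people: 4^4 equally likely outcomes.
sequences = list(product(SEASONS, repeat=4))


def occupancy(seq):
    """Season counts (A, B, C, D), i.e. the stars-and-bars arrangement of a sequence."""
    return tuple(seq.count(season) for season in SEASONS)


# Number of ordered sequences that collapse onto each arrangement.
weights = Counter(occupancy(seq) for seq in sequences)
total = len(sequences)

p_all_a = Fraction(weights[(4, 0, 0, 0)], total)     # arrangement ****|||
p_one_each = Fraction(weights[(1, 1, 1, 1)], total)  # arrangement *|*|*|*

print("distinct arrangements:", len(weights))
print("P(AAAA) =", p_all_a)
print("P(one of each) =", p_one_each)
```

Only one sequence maps to ****||| but 24 map to *|*|*|*, so the arrangements carry very different probabilities even though stars and bars counts each of them once.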