[Math] Proof concerning the multinomial distribution

multinomial-distribution, probability-distributions

Despite a long search I was not able to find a rigorous proof of the fact that a random vector having a multinomial distribution with parameters p (the vector of probabilities) and n (the number of trials) can be written as the sum of n independent random vectors all having a multinomial distribution with parameters p and 1. Can anyone suggest where to look?

Best Answer

Suppose $X_1,\ldots,X_n$ are independent identically distributed random vectors with $$ \Pr(X_1 = (0,0,0,\ldots,0,0,\underset{\uparrow}{1},0,0,\ldots,0,0,0)) = p_i, $$ where there are $k$ components and the single "$1$" is the $i$th component, for $i=1,\ldots,k$. In other words, each $X_j$ has a multinomial distribution with parameters $p=(p_1,\ldots,p_k)$ and $1$.

Suppose $c_1,\ldots,c_k$ are nonnegative integers with $c_1+\cdots+c_k = n$, and ask what is $$ \Pr(X_1+\cdots+X_n=(c_1,\ldots,c_k)). $$ The vector $(c_1,\ldots,c_k)$ is a sum of $c_1$ terms equal to $(1,0,0,0,\ldots,0)$, then $c_2$ terms equal to $(0,1,0,0,\ldots,0)$, and so on. The sum $X_1+\cdots+X_n$ equals $(c_1,\ldots,c_k)$ exactly when the sequence $X_1,\ldots,X_n$ consists of those terms in some order. Because the $X_j$ are independent, the probability of getting any one particular such sequence, for example $c_1$ terms equal to $(1,0,0,0,\ldots,0)$, then $c_2$ terms equal to $(0,1,0,0,\ldots,0)$, and so on, is the product of the individual probabilities, $p_1^{c_1}p_2^{c_2}\cdots p_k^{c_k}$, and this value does not depend on the order. Different orders are mutually exclusive events, so the probability we seek is $$ (p_1^{c_1}p_2^{c_2}\cdots p_k^{c_k}) + (p_1^{c_1}p_2^{c_2}\cdots p_k^{c_k}) + \cdots + (p_1^{c_1}p_2^{c_2}\cdots p_k^{c_k}), $$ where the number of terms is the number of distinguishable orders in which we can list $c_1$ copies of $(1,0,0,0,\ldots,0)$, $c_2$ copies of $(0,1,0,0,\ldots,0)$, and so on. That is a combinatorial problem, whose solution is the multinomial coefficient $\dbinom{n}{c_1,c_2,\ldots,c_k} = \dfrac{n!}{c_1!\,c_2!\cdots c_k!}$. Hence the probability we seek is $$ \binom{n}{c_1,c_2,\ldots,c_k} p_1^{c_1}p_2^{c_2}\cdots p_k^{c_k}, $$ which is the probability mass function of the multinomial distribution with parameters $p$ and $n$.
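As a sanity check on the counting argument (not a substitute for the proof), here is a minimal Python sketch that enumerates all $k^n$ possible sequences $X_1,\ldots,X_n$, adds up their probabilities by resulting sum vector, and compares the result with the multinomial formula using exact rational arithmetic. The function names `sum_distribution` and `multinomial_pmf` and the particular values of `p` and `n` are my own illustrative choices, not part of the answer above.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import factorial, prod


def sum_distribution(p, n):
    """Exact distribution of X_1 + ... + X_n, found by brute-force enumeration.

    Each X_j is a unit vector e_i with probability p[i]; a sequence is encoded
    by the tuple of indices i chosen at each of the n trials.
    """
    k = len(p)
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for seq in product(range(k), repeat=n):
        counts = [0] * k
        weight = Fraction(1)
        for i in seq:
            counts[i] += 1   # adding e_i to the running sum
            weight *= p[i]   # independence: probabilities multiply
        dist[tuple(counts)] += weight  # distinct sequences are disjoint events
    return dict(dist)


def multinomial_pmf(p, counts):
    """n! / (c_1! ... c_k!) * p_1^c_1 ... p_k^c_k for the given counts."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)
    return coeff * prod(pi ** c for pi, c in zip(p, counts))


# Illustrative parameters (assumed for this example): k = 3 categories, n = 4 trials.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 4
dist = sum_distribution(p, n)

# Every enumerated probability matches the closed-form multinomial pmf exactly.
assert all(prob == multinomial_pmf(p, counts) for counts, prob in dist.items())
assert sum(dist.values()) == 1
```

Enumeration costs $k^n$ steps, so this is only practical for small $n$ and $k$; it is meant to make the "count the orderings" step concrete, nothing more.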
