[Math] Binomial within a multinomial distribution

binomial distribution, probability

Let $\textbf{X} = (X_1, \dots, X_k)$ denote the random vector of counts, and let $\textbf{x} = (x_1, \dots, x_k)$ denote a possible value of that random vector. Finally, let $f(\textbf{x} | n, \textbf{p})$ denote the joint p.f. of $\textbf{X}$, with $\textbf{p} = (p_1, \dots, p_k)$. Here $f$ is the p.f. of the multinomial distribution.

Let $Y = X_1 + X_2 + \dots + X_l$ where $l < k$. We have to prove that $Y$ has a binomial distribution with parameters $n$ and $\sum_{i=1}^{l}p_i$.

Here is my approach; I am stuck at one point.

Let $X_i = X_{i_1} + X_{i_2} + \dots + X_{i_n}$,

where $X_{i_1}$ is $1$ with probability $p_i$ and $0$ otherwise.
$X_{i_1}$ indicates whether we get object $i$ in the first place, and similarly for the other places.

Now
$$\begin{aligned}
Y ={}& X_{1_1} + X_{1_2} + \dots + X_{1_n} \\
&+ X_{2_1} + X_{2_2} + \dots + X_{2_n} \\
&\ \vdots \\
&+ X_{l_1} + X_{l_2} + \dots + X_{l_n}
\end{aligned}$$

Now regroup the terms so that each row collects one place:

$$\begin{aligned}
Y ={}& X_{1_1} + X_{2_1} + \dots + X_{l_1} \\
&+ X_{1_2} + X_{2_2} + \dots + X_{l_2} \\
&\ \vdots \\
&+ X_{1_n} + X_{2_n} + \dots + X_{l_n}
\end{aligned}$$

Now consider the first row, $X_{1_1} + X_{2_1} + \dots + X_{l_1}$. This random variable takes the value $1$ or $0$. It is $1$ exactly when the first slot holds an object from $1$ to $l$, which happens with probability $\sum_{i=1}^{l}p_i$. The same holds for the other rows, so each row is a Bernoulli variable with probability $\sum_{i=1}^{l}p_i$, and there are $n$ rows. But are these Bernoulli variables independent? If they were, I could conclude that $Y$ is binomial with the required parameters.
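As a sanity check on the row argument, here is a minimal simulation sketch in Python. It is not part of the proof. It assumes that the $n$ places are filled by independent draws (which is how the multinomial model is usually set up), and the function names `simulate_y`, `binomial_pmf` and `empirical_pmf`, as well as the parameter values, are made up for illustration.

```python
import random
from math import comb


def simulate_y(n, p, l, rng):
    """Fill n places independently with categories drawn from p
    and count how many places hold one of the first l categories."""
    # ASSUMPTION: the draws are independent, as in the multinomial model.
    categories = rng.choices(range(len(p)), weights=p, k=n)
    # Each term is one "row" indicator: 1 if that place holds a category < l.
    return sum(1 for c in categories if c < l)


def binomial_pmf(n, q, y):
    """P(Bin(n, q) = y)."""
    return comb(n, y) * q**y * (1 - q) ** (n - y)


def empirical_pmf(n, p, l, trials=100_000, seed=0):
    """Relative frequencies of Y = X_1 + ... + X_l over many simulated samples."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        counts[simulate_y(n, p, l, rng)] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    n, p, l = 5, [0.2, 0.3, 0.1, 0.4], 2  # illustrative values only
    q = sum(p[:l])
    for y, freq in enumerate(empirical_pmf(n, p, l)):
        print(y, round(freq, 4), round(binomial_pmf(n, q, y), 4))
```

If the rows behave like independent Bernoulli variables, the two printed columns should agree up to sampling noise.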

Best Answer

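Note that $X=(X_1,\dots,X_k)=Y_1+\cdots+Y_n$, where the $Y_i$ are i.i.d. random vectors (not to be confused with the sum $Y$ in the question) with $$Y_1=(X_{1,1},\dots,X_{k,1}) \sim \text{Multinomial}(1;\, p_1,\dots,p_k).$$

Denote $S_{j,i}=X_{1,i}+\cdots+X_{j,i}$ for $j \le k$ and $i\in\{1,\dots,n\}$. Exactly one component of $Y_1$ equals $1$, so $S_{k,1}=1$ always, and for $j<k$ the event $\{S_{j,1}=1\}$ is a disjoint union over which of the first $j$ components is the one that equals $1$: \begin{align*} P(S_{j,1}=1) &= P(S_{j,1}=1,S_{k,1}=1)=P\Big(\bigcup_{m=1}^j\{X_{m,1}=1,\,X_{r,1}=0 \ \forall r\not=m\}\Big) \\ &=\sum_{m=1}^j P(X_{m,1}=1,\,X_{r,1}=0 \ \forall r\not = m) \\ &= \sum_{m=1}^j p_m \end{align*} and $$P(S_{j,1}=0)=1-\sum_{m=1}^j p_m.$$

Hence $$S_{j,i} \sim \text{Ber} \left(\sum_{m=1}^j p_m\right) \quad \quad \text{ for } i\in\{1,\dots,n\}.$$ Each $S_{j,i}$ is a function of $Y_i$ alone, so by the independence of the $Y_i$ the variables $S_{j,1},\dots,S_{j,n}$ are mutually independent.

Now note for $j<k$ that $$X_1+\cdots+X_j=S_{j,1}+\cdots +S_{j,n}\sim \text{Bin}\left(n,\sum_{m=1}^j p_m\right).$$ Taking $j=l$ gives the claim for the sum $Y$ in the question.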
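For small cases the result can also be checked exactly. The following Python sketch sums the multinomial p.f. over all count vectors with a given value of $x_1+\dots+x_l$ and compares the total with the binomial p.f. The helper names (`multinomial_pmf`, `pmf_of_partial_sum`, `binomial_pmf`) and the parameter values are hypothetical, and brute-force enumeration is only practical for small $n$ and $k$.

```python
from itertools import product
from math import comb, factorial, prod


def multinomial_pmf(x, p):
    """f(x | n, p) with n = sum(x)."""
    coef = factorial(sum(x))
    for xi in x:
        coef //= factorial(xi)  # stays an exact integer at every step
    return coef * prod(pi**xi for pi, xi in zip(p, x))


def pmf_of_partial_sum(n, p, l):
    """Exact p.f. of X_1 + ... + X_l, by enumerating all count vectors."""
    pmf = [0.0] * (n + 1)
    for x in product(range(n + 1), repeat=len(p)):
        if sum(x) == n:  # keep only valid multinomial outcomes
            pmf[sum(x[:l])] += multinomial_pmf(x, p)
    return pmf


def binomial_pmf(n, q, y):
    """P(Bin(n, q) = y)."""
    return comb(n, y) * q**y * (1 - q) ** (n - y)


if __name__ == "__main__":
    n, p, l = 4, [0.2, 0.3, 0.1, 0.4], 2  # illustrative values only
    q = sum(p[:l])
    for y, value in enumerate(pmf_of_partial_sum(n, p, l)):
        print(y, value, binomial_pmf(n, q, y))
```

The two printed columns should match up to floating-point rounding, in line with the $\text{Bin}\left(n,\sum_{m=1}^{l} p_m\right)$ conclusion.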
