[Math] Covariance of multinomial distribution

covariance, probability distributions, random variables, statistics

Let $X = (X_1,\ldots, X_k)$ be multinomially distributed based upon $n$ trials with parameters $p_1,\ldots,p_k$ such that the sum of the parameters is equal to $1$. I am trying to find, for $i \neq j$, $\operatorname{Var}(X_i + X_j)$. Knowing this will be sufficient to find $\operatorname{Cov}(X_i,X_j)$.

Now $X_i \sim \text{Bin}(n, p_i)$. The natural thing to say would be that $X_i + X_j\sim \text{Bin}(n, p_i+p_j)$ (and this would, indeed, yield the right result), but I'm not sure whether this is actually true.

Suggestions for how to go about this are greatly appreciated!

UPDATE: @grand_chat very nicely answered the question about the distribution of $X_i + X_j$. How would we go about computing the variance of $X_i - X_j$? As @grand_chat correctly points out, this cannot be binomial because it is not guaranteed to be nonnegative. How, then, should one go about computing the variance of this random variable?

UPDATE 2: The answer in this link answers the question in my UPDATE.

Best Answer

$X_i+X_j$ is indeed a binomial variable because it counts the number of trials that land in either bin $i$ or bin $j$. The $n$ trials are independent, and since each trial lands in exactly one bin, the two events are disjoint, so the probability of "success" is $$P(\text{trial lands in $i$}) + P(\text{trial lands in $j$}) = p_i+p_j.$$
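
To connect this back to the covariance the question asks about, here is a sketch of the remaining algebra. It assumes only the standard binomial variance $np(1-p)$ and the identity $\operatorname{Var}(X_i \pm X_j) = \operatorname{Var}(X_i) + \operatorname{Var}(X_j) \pm 2\operatorname{Cov}(X_i,X_j)$, with the same notation as the question and $i \neq j$.

$$\begin{aligned}
\operatorname{Var}(X_i+X_j) &= n(p_i+p_j)(1-p_i-p_j) = n\left(p_i + p_j - p_i^2 - 2p_ip_j - p_j^2\right),\\
\operatorname{Var}(X_i)+\operatorname{Var}(X_j) &= np_i(1-p_i) + np_j(1-p_j) = n\left(p_i + p_j - p_i^2 - p_j^2\right),\\
2\operatorname{Cov}(X_i,X_j) &= \operatorname{Var}(X_i+X_j) - \operatorname{Var}(X_i) - \operatorname{Var}(X_j) = -2np_ip_j,\\
\operatorname{Cov}(X_i,X_j) &= -np_ip_j.
\end{aligned}$$

The difference $X_i - X_j$ is not binomial, but its variance follows from the same identity once the covariance is known:

$$\operatorname{Var}(X_i-X_j) = np_i(1-p_i) + np_j(1-p_j) + 2np_ip_j = n\left[(p_i+p_j) - (p_i-p_j)^2\right].$$

As a quick sanity check, with $k=2$ we have $X_2 = n - X_1$ and $p_2 = 1-p_1$, so the covariance formula gives $-np_1(1-p_1) = -\operatorname{Var}(X_1)$, as expected.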