Probability – What Happens When Merging Random Variables in Dirichlet Distribution?

dirichlet-distribution, distributions, probability

Imagine that

$$ X_1,\dots,X_k \sim \mathrm{Dirichlet}(\alpha_1,\dots,\alpha_k) $$

Since $x_i \in (0,1)$ for all $i$ and $\sum_{i=1}^k x_i = 1$, the $x_i$'s satisfy the first two axioms of probability, and the Dirichlet can be (and is) used as a "distribution over distributions". Intuitively it should follow that

$$ X_1,\dots,X_{k-2},X_{k-1}+X_k \sim \mathrm{Dirichlet}(\alpha_1,\dots,\alpha_{k-2}, \alpha_{k-1}+\alpha_k) $$

since the properties of $x_i$'s would not change and the total "mass" of $\alpha_i$'s would not change.

But its probability density function is

$$ f(x_1,\dots,x_k) \propto \prod_{i=1}^k x_i^{\alpha_i - 1}$$

and

$$
x_{k-1}^{\alpha_{k-1} - 1} \times x_k^{\alpha_k - 1} \ne
(x_{k-1} + x_k)^{\alpha_{k-1} + \alpha_k - 1}
$$

So merging random variables in a Dirichlet distribution does not seem to lead to a Dirichlet distribution over $k-1$ variables. What does it lead to?

Best Answer

It is a Dirichlet distribution having the expected parameters.

To see this, note that the vector-valued random variable $\mathbf{X}=(X_1, X_2, \ldots, X_k)$ has the same distribution as the variable

$$\frac{1}{\sum_{i=1}^k Y_i}\left(Y_1, Y_2, \ldots, Y_k\right)$$

where the $Y_i \sim \Gamma(\alpha_i)$ are independent Gamma variables with a common scale. Write $Y_i^\prime=Y_i$ for $i=1, 2, \ldots, k-2$ and $Y_{k-1}^\prime = Y_{k-1}+Y_k$. The sum of all the $Y_i$ equals the sum of all the $Y_i^\prime$, and the distribution of $Y_{k-1}^\prime=Y_{k-1}+Y_k$ is $\Gamma(\alpha_{k-1}+ \alpha_k)$, because independent Gamma variables with a common scale add their shape parameters. Thus

$$X_{k-1} + X_k = \frac{1}{\sum_{i=1}^k Y_i} Y_{k-1} + \frac{1}{\sum_{i=1}^k Y_i} Y_{k} = \frac{1}{\sum_{i=1}^{k-1} Y_i^\prime} Y_{k-1}^\prime$$

and, for $i < k-1$,

$$X_i = \frac{1}{\sum_{j=1}^k Y_j} Y_i = \frac{1}{\sum_{j=1}^{k-1} Y_j^\prime} Y_i^\prime.$$

Therefore $\mathbf{X}^\prime=(X_1, X_2, \ldots, X_{k-2}, X_{k-1}+X_k)$ has the same distribution as

$$\frac{1}{\sum_{i=1}^{k-1} Y_i^\prime}\left(Y_1^\prime, Y_2^\prime, \ldots, Y_{k-1}^\prime\right).$$

This demonstrates that $\mathbf{X}^\prime$ has a Dirichlet$(\alpha_1, \alpha_2, \ldots, \alpha_{k-2}, \alpha_{k-1}+\alpha_k)$ distribution, QED.
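The result can also be checked numerically. The following is a minimal simulation sketch, not part of the original answer: it assumes NumPy is available, and the helper names `merge_last_two` and `dirichlet_moments`, the parameter vector `alpha`, and the sample size `n` are all hypothetical choices made for illustration. It draws Dirichlet samples, merges the last two coordinates, and compares the empirical mean and variance of the merged vector with the theoretical moments of Dirichlet$(\alpha_1, \ldots, \alpha_{k-2}, \alpha_{k-1}+\alpha_k)$.

```python
import numpy as np


def merge_last_two(samples):
    """Replace the last two coordinates of each row by their sum."""
    return np.column_stack([samples[:, :-2], samples[:, -2] + samples[:, -1]])


def dirichlet_moments(alpha):
    """Theoretical mean and variance of each coordinate of Dirichlet(alpha)."""
    alpha = np.asarray(alpha, dtype=float)
    a0 = alpha.sum()
    mean = alpha / a0
    var = alpha * (a0 - alpha) / (a0**2 * (a0 + 1.0))
    return mean, var


# Illustrative (assumed) parameters; any positive alpha would do.
rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0, 1.5, 4.0])
n = 200_000

# Sample from the k-dimensional Dirichlet, then merge X_{k-1} and X_k.
merged = merge_last_two(rng.dirichlet(alpha, size=n))

# Parameters the merged vector should have according to the answer.
alpha_merged = np.append(alpha[:-2], alpha[-2] + alpha[-1])
mean_theory, var_theory = dirichlet_moments(alpha_merged)

print("empirical mean:  ", merged.mean(axis=0))
print("theoretical mean:", mean_theory)
print("empirical var:   ", merged.var(axis=0))
print("theoretical var: ", var_theory)
```

Matching the first two moments does not prove equality of distributions, so this is only a sanity check of the argument above, which is what actually establishes the result.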


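The fault in the argument in the question lies in confusing the arithmetic sum of values $x_{k-1}+x_k$ with the sum of random variables $X_{k-1}+X_k$. The density of the latter is not found by substituting the sum into the formula; it is obtained by integrating the joint density over all pairs $(x_{k-1}, x_k)$ with a given sum, which is a convolution-type integral.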
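To make this point concrete, here is a short sketch of the key integral; the symbols $s$, $u$, and $t$ are introduced here for illustration and do not appear in the question. Fix $x_1,\dots,x_{k-2}$ and write $s = 1 - \sum_{i=1}^{k-2} x_i$ for the value of $x_{k-1}+x_k$. Integrating the Dirichlet density over all ways to split $s$ into $x_{k-1} = u$ and $x_k = s-u$, and substituting $u = st$, gives

$$\int_0^s u^{\alpha_{k-1}-1}(s-u)^{\alpha_k-1}\,du = s^{\alpha_{k-1}+\alpha_k-1}\int_0^1 t^{\alpha_{k-1}-1}(1-t)^{\alpha_k-1}\,dt = B(\alpha_{k-1},\alpha_k)\, s^{\alpha_{k-1}+\alpha_k-1},$$

where $B$ is the Beta function. The density of the merged vector is therefore proportional to $s^{\alpha_{k-1}+\alpha_k-1}\prod_{i=1}^{k-2} x_i^{\alpha_i-1}$, which is exactly the Dirichlet form the question expected.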
