Projection of Dirichlet distribution

probability, probability distributions

Suppose we have a random vector $(X_1,\dots,X_n)\sim Dir(\alpha_1,\dots,\alpha_n)$.

What is the distribution of $(X_2,\dots,X_n)\mid X_1=x$?


Intuitively, I would think that fixing the value of $X_1$ does not affect the proportions between the other random variables in the vector, i.e., the remaining random variables follow a scaled-down version of a Dirichlet distribution: $(X_2,\dots,X_n)\mid X_1=x \sim (1-x)\, Dir(\alpha_2,\dots,\alpha_n)$. However, I am not sure how to prove this rigorously.

Under the assumption that the pdf $f'$ of the Dirichlet distribution with $X_1$ projected away satisfies $f'(x_2,\dots,x_n;\alpha_2,\dots,\alpha_n)\propto \prod_{i=2}^n x_i^{\alpha_i-1}$, it is easy to show that the claim above holds, but again I am not sure how to argue that this assumption actually holds.
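
A quick simulation can at least make the intuition plausible, although it is not a proof. The sketch below assumes NumPy is available; the function name `check_conditional_structure` and the parameter values are illustrative choices, not part of the problem. It draws from $Dir(\alpha_1,\dots,\alpha_n)$, rescales the remaining coordinates by $1-X_1$, and compares their empirical means with the means $\alpha_i/\sum_{j\ge 2}\alpha_j$ of $Dir(\alpha_2,\dots,\alpha_n)$. It also reports the sample correlation of each rescaled coordinate with $X_1$, which should be near zero if the rescaled vector does not depend on $X_1$.

```python
import numpy as np


def check_conditional_structure(alpha, n_samples=200_000, seed=0):
    """Simulate Dir(alpha) and inspect U = (X_2, ..., X_n) / (1 - X_1)."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)

    x = rng.dirichlet(alpha, size=n_samples)   # shape (n_samples, n)
    x1 = x[:, 0]
    u = x[:, 1:] / (1.0 - x1)[:, None]         # each row of u sums to 1

    # Mean of Dir(alpha_2, ..., alpha_n): alpha_i / sum_{j >= 2} alpha_j
    expected_mean = alpha[1:] / alpha[1:].sum()
    empirical_mean = u.mean(axis=0)

    # Sample correlation between X_1 and each rescaled coordinate.
    corr_with_x1 = np.array(
        [np.corrcoef(x1, u[:, j])[0, 1] for j in range(u.shape[1])]
    )
    return expected_mean, empirical_mean, corr_with_x1


# Illustrative parameters (an assumption, chosen only for the demo).
expected, empirical, corr = check_conditional_structure([2.0, 3.0, 1.5, 4.0])
print("expected mean :", expected)
print("empirical mean:", empirical)
print("corr with X_1 :", corr)
```

Matching means and vanishing correlations are only necessary conditions, so this supports the conjecture without establishing it.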


Building on @Renato Fernandes's comment:

By conditional probability we have

$f_{X_2,\dots,X_n\mid X_1}(x_2,\dots,x_n\mid x)=\frac{f_{X_1,\dots,X_n}(x,x_2,\dots,x_n)}{f_{X_1}(x)}$

where the pdf in the numerator is the standard Dirichlet pdf and the denominator is a Beta distribution pdf, since $X_1\sim \mathrm{Beta}(\alpha_1,\sum_{i=2}^n \alpha_i)$. Plugging this in:

$
\begin{align}f_{X_2,\dots,X_n\mid X_1}(x_2,\dots,x_n\mid x)
=&\frac{f_{X_1,\dots,X_n}(x,x_2,\dots,x_n)}{f_{X_1}(x)} \\
=&\frac{\frac{1}{B(\alpha_1,\dots,\alpha_n)}\prod_{i=1}^n x_i^{\alpha_i-1}}{\frac{1}{B(\alpha_1,\sum_{i=2}^n \alpha_i)}x^{\alpha_1-1}(1-x)^{\sum_{i=2}^n \alpha_i}} \\
=&\left(\frac{1}{B(\alpha_2,\dots,\alpha_n)} \prod_{i=2}^n x_i^{\alpha_i-1}\right) \frac{1}{(1-x)^{\sum_{i=2}^n \alpha_i}}
\end{align}
$

Now the first part is almost the pdf of a Dirichlet distribution again. The subtle difference is that the support of this pdf is defined by $\sum_{i=2}^n x_i = 1-x$ instead of the sum being $1$, as it is for a standard Dirichlet distribution. We can fix this by rescaling each variable, defining $x'_i=x_i/(1-x)$. We then have

$f_{X_2,\dots,X_n\mid X_1}(x_2,\dots,x_n\mid x)=\left(\frac{1}{B(\alpha_2,\dots,\alpha_n)} \prod_{i=2}^n x_i'^{\alpha_i-1}\right)\frac{1}{(1-x)^{n-2}}$

Now here's where I'm confused: As this is still a pdf, this should integrate to $1$. The first (bracketed) part is a standard Dirichlet distribution as we made sure the support is exactly all vectors $(x_2,\dots,x_n)$ summing to $1$. So this should also integrate to $1$. However, the second term is constant for a given $x$, so the integral would be $\frac{1}{(1-x)^{n-2}}$ which cannot be. So where is my mistake in the derivation?

The mismatch seems to stem from the fact that the exponents in the pdf are reduced by $1$, but when determining the parameters of the marginal distribution of $X_1$, we just sum all parameters $\alpha$.

Best Answer

You have some small mistakes:

  • First, the Beta distribution density is: $$f_{X_1}\left(x\right) = \frac1{B\left(\alpha_1, \sum_{i=2}^n \alpha_{i}\right)} x^{\alpha_1 - 1} (1-x)^{\sum_{i=2}^n \alpha_i - 1}$$

  • Second, when you make a substitution, you need to multiply by the Jacobian of the transformation. Let $U_j = \displaystyle\frac{X_j}{1-x}$; then $$f_{U_2,\ldots,U_n}\left(u_2,\ldots,u_n\right) = f_{X_2,\ldots,X_n \mid X_1}\left(x_2,\ldots,x_n\mid x\right) \times \left|\frac{\partial U}{\partial X}\right|^{-1} = \cdots$$

Can you continue from here?
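
For completeness, here is one way the remaining steps can be carried out. This is a sketch that assumes the conditional density is taken with respect to the $n-2$ free coordinates $(x_2,\dots,x_{n-1})$, with $x_n = 1-x-\sum_{i=2}^{n-1}x_i$ fixed by the constraint; the $u_j$ are the values of $U_j = X_j/(1-x)$.

With the Beta density as stated in the first point, the conditional density is

$$f_{X_2,\ldots,X_n\mid X_1}\left(x_2,\ldots,x_n\mid x\right) = \frac{1}{B(\alpha_2,\dots,\alpha_n)}\prod_{i=2}^n x_i^{\alpha_i-1}\cdot\frac{1}{(1-x)^{\sum_{i=2}^n \alpha_i-1}}.$$

The map $x_j \mapsto u_j = x_j/(1-x)$ on the $n-2$ free coordinates is linear, so $\left|\frac{\partial U}{\partial X}\right| = (1-x)^{-(n-2)}$. Substituting $x_i = (1-x)\,u_i$ gives

$$\begin{aligned}
f_{U_2,\ldots,U_n}\left(u_2,\ldots,u_n\right)
&= \frac{1}{B(\alpha_2,\dots,\alpha_n)}\prod_{i=2}^n \big((1-x)\,u_i\big)^{\alpha_i-1}\cdot\frac{(1-x)^{n-2}}{(1-x)^{\sum_{i=2}^n \alpha_i-1}} \\
&= \frac{1}{B(\alpha_2,\dots,\alpha_n)}\prod_{i=2}^n u_i^{\alpha_i-1}\cdot(1-x)^{\left(\sum_{i=2}^n \alpha_i-(n-1)\right)+(n-2)-\left(\sum_{i=2}^n \alpha_i-1\right)} \\
&= \frac{1}{B(\alpha_2,\dots,\alpha_n)}\prod_{i=2}^n u_i^{\alpha_i-1}.
\end{aligned}$$

The exponent of $(1-x)$ cancels to $0$, so what remains is the $Dir(\alpha_2,\dots,\alpha_n)$ density, and it does not depend on $x$. In other words, given $X_1=x$, the rescaled vector $(X_2,\dots,X_n)/(1-x)$ is $Dir(\alpha_2,\dots,\alpha_n)$, which is the intuition stated in the question.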