[Math] Distribution of jointly Gaussian variables conditional on their sum

Tags: normal distribution, probability, probability distributions, statistics

Let $X = (X_1, X_2, \dots, X_n)$ be jointly Gaussian with mean vector $\mu$ and covariance matrix $\Sigma$, and let $S = X_1 + \cdots + X_n$ be their sum.

I know that the distribution of each $X_i \mid S = s$ is also Gaussian.

When $n=2$ and $X_1, X_2$ are independent with zero means, I know that
$$
E\left( X_1\mid S = s \right) = s \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}
$$

and
$$
V\left(X_1\mid S = s \right) = \frac{\sigma_1^2\sigma_2^2}{\sigma_1^2 + \sigma_2^2}
$$

I could probably work out analogous expressions for arbitrary $n$ if I sat down with pencil and paper and worked at it for a bit.
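As a quick Monte Carlo sanity check of these formulas (all parameter values below are arbitrary, and the conditioning on $S = s$ is approximated by keeping samples with $|S - s| < \varepsilon$):

```python
# Monte Carlo check of the n = 2 formulas, assuming X1, X2 independent
# with zero means. sigma1, sigma2, s, eps are made-up illustrative values.
import numpy as np

rng = np.random.default_rng(0)
sigma1, sigma2, s, eps = 1.0, 2.0, 1.5, 0.01

x1 = rng.normal(0.0, sigma1, size=2_000_000)
x2 = rng.normal(0.0, sigma2, size=2_000_000)
keep = np.abs(x1 + x2 - s) < eps   # approximate conditioning on S = s

print("mean:", x1[keep].mean(), "vs", s * sigma1**2 / (sigma1**2 + sigma2**2))
print("var: ", x1[keep].var(), "vs",
      sigma1**2 * sigma2**2 / (sigma1**2 + sigma2**2))
```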

What I want to know is, what is the distribution of $X$ given $S = s$?

I know that this can't be a nondegenerate Gaussian, since the components are constrained to sum to exactly $s$. It's clearly not Dirichlet or anything Dirichlet-esque, since the marginal distributions are Gaussian. But beyond that I don't have a clue.

Best Answer

Let $A$ be a deterministic matrix of size $n\times n$ and let $v$ be a vector of size $n$. The random vector $(AX, S)$ is jointly normal. The idea is to construct both

  1. a matrix $A$ such that $AX$ is independent from $S$, and
  2. a vector $v$ such that $X = AX + Sv$.

Why? Then by independence we have a crystal-clear description of the distribution of $X$ given $S=s$: since $AX$ is independent of $S$, conditioning on $S=s$ leaves the distribution of $AX$ unchanged, while the term $Sv$ becomes the constant $sv$. Hence the distribution of $X$ given $S=s$ is normal: $$N(sv + A\mu,\; A\Sigma A^T).$$

Now let's find such $A$ and $v$.

  • Since $(AX, S)$ is jointly normal, $AX$ is independent of $S$ if and only if their covariance vanishes, that is, $E[A(X-\mu)(S - E[S])]=0$. If $u=(1,\dots,1)\in R^n$, then $S - E[S] = u^T(X-\mu)$, so this is equivalent to $E[A(X-\mu)(X-\mu)^T u] = A\Sigma u = 0$.
  • For $v$: since $S = u^T X$, the relationship $X = AX + vS$ is satisfied provided that $I_n = A + vu^T$. Multiplying this identity on the right by $\Sigma u$ and using $A\Sigma u = 0$ gives $\Sigma u = v\,(u^T\Sigma u)$, hence $$v = \frac{1}{u^T\Sigma u} \Sigma u.$$ Now set $$A = I_n - v u^T.$$ One readily verifies that this choice of $A$ indeed satisfies $A\Sigma u = 0$, so we have constructed $A$ and $v$ satisfying both requirements (a numerical check follows below).
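Here is a minimal numerical sketch of this construction, with made-up values for $\mu$, $\Sigma$, and $s$; it builds $v$ and $A$, then verifies the two requirements and the resulting conditional moments:

```python
# Sketch of the construction: v = Σu/(uᵀΣu), A = I_n − v uᵀ.
# mu, Sigma, and s are made-up illustrative values.
import numpy as np

n = 3
mu = np.array([0.5, -1.0, 2.0])
B = np.array([[2.0, 0.3, 0.0],
              [0.1, 1.5, 0.2],
              [0.0, 0.4, 1.0]])
Sigma = B @ B.T                          # a positive definite covariance
u = np.ones(n)

v = Sigma @ u / (u @ Sigma @ u)          # v = Σu / (uᵀΣu)
A = np.eye(n) - np.outer(v, u)           # A = I_n − v uᵀ

print(np.allclose(A @ Sigma @ u, 0))                # req. 1: Cov(AX, S) = 0
print(np.allclose(A + np.outer(v, u), np.eye(n)))   # req. 2: X = AX + Sv

s = 4.0
cond_mean = s * v + A @ mu               # mean of N(sv + Aμ, AΣAᵀ)
cond_cov = A @ Sigma @ A.T               # covariance of N(sv + Aμ, AΣAᵀ)
print(np.isclose(u @ cond_mean, s))      # components of the mean sum to s
print(np.allclose(cond_cov @ u, 0))      # no variance along u: the sum is fixed
```

Note that $A\Sigma A^T$ is singular (indeed $A^T u = 0$), so the conditional law is a degenerate Gaussian supported on the hyperplane $\{x : u^T x = s\}$; this is the sense in which, as the question suspected, the conditional distribution cannot be a full-rank Gaussian.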

More Generally: the distribution of $X$ given $UX=b$ for some matrix $U$

If $U$ is a $k\times n$ matrix of rank $k$ and we would like to find the distribution of $X$ conditional on $UX$, the same technique extends.

(In the example above, $U$ is the $1\times n$ matrix $u^T$.)

We proceed similarly: we look for a deterministic $n\times n$ matrix $A$ and an $n\times k$ matrix $C$ such that

  1. $AX$ and $UX$ are independent, and
  2. $I_n = A + CU$ so that $X = AX + CUX$ always holds.

Why? If we can find such matrices $A$ and $C$, then the distribution of $X$ given $UX=b$ is normal: $$N(A\mu + Cb,\; A\Sigma A^T).$$

Since $AX$ and $UX$ are jointly normal, the first condition holds if and only if $E[A(X-\mu)(U(X-\mu))^T] = A\Sigma U^T = 0$. Multiplying the second condition on the right by $\Sigma U^T$ and using $A\Sigma U^T = 0$, it must be that $\Sigma U^T = C U \Sigma U^T$, hence $$C = \Sigma U^T (U\Sigma U^T)^{-1}.$$

Finally, define $$A = I_n - CU,$$ and check that this choice of $A$ and $C$ indeed satisfies $A\Sigma U^T = 0$ and the requirements above.
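A minimal numerical sketch of the general construction, with made-up values for $\Sigma$, $\mu$, $U$, and $b$:

```python
# Sketch of the general construction: C = ΣUᵀ(UΣUᵀ)⁻¹, A = I_n − CU.
# Sigma, mu, U, and b are made-up illustrative values.
import numpy as np

n, k = 4, 2
rng = np.random.default_rng(1)
B = rng.normal(size=(n, n))
Sigma = B @ B.T + n * np.eye(n)          # an invertible covariance matrix
mu = rng.normal(size=n)
U = rng.normal(size=(k, n))              # full-rank k×n constraint matrix
b = rng.normal(size=k)

C = Sigma @ U.T @ np.linalg.inv(U @ Sigma @ U.T)
A = np.eye(n) - C @ U

print(np.allclose(A @ Sigma @ U.T, 0))    # req. 1: Cov(AX, UX) = 0
print(np.allclose(A + C @ U, np.eye(n)))  # req. 2: X = AX + C(UX)

cond_mean = A @ mu + C @ b                # mean of N(Aμ + Cb, AΣAᵀ)
cond_cov = A @ Sigma @ A.T                # covariance of N(Aμ + Cb, AΣAᵀ)
print(np.allclose(U @ cond_mean, b))      # the mean satisfies the constraint
print(np.allclose(U @ cond_cov, 0))       # no variance in constrained directions
```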

(By the way, the matrix $U\Sigma U^T$ is indeed invertible whenever $U$ has full rank $k<n$ and $\Sigma$ is invertible. The matrix $\Sigma$ is invertible if and only if $X$ has a continuous distribution on $R^n$, in the sense that it has a density with respect to the Lebesgue measure on $R^n$.)
