[Math] Covariance matrix of a random variable and its affine transformation

covariance, linear-transformations, random variables

Given:

  • the random variable $\mathbf{X} \sim \mathcal{N}(\mathbf{c, \Gamma})$ where $\mathbf{X} \in \mathbb{R}^L$
  • its affine transformation $\mathbf{Y} = \mathbf{A} \mathbf{X} + \mathbf{b} + \mathbf{E}$, where $\mathbf{Y} \in \mathbb{R}^D$, $\mathbf{E} \sim \mathcal{N} (\mathbf{0}, \mathbf{\Sigma})$, $\mathbf{A} \in \mathbb{R}^{D \times L}$ and $\mathbf{b} \in \mathbb{R}^D$

I want to compute the joint distribution of $\mathbf{X}$ and $\mathbf{Y}$.
To do that, I have written down the following:

$$\begin{bmatrix}
\mathbf{X}
\\
\mathbf{Y}
\end{bmatrix} \sim \mathcal{N} \left(
\begin{bmatrix}
\mathbf{c}
\\
\mathbf{A c + b}
\end{bmatrix},
\begin{bmatrix}
\mathbf{\Gamma} & \mathbf{R_{X,Y}}
\\
\mathbf{R_{Y,X}} & \mathbf{\Sigma + A \Gamma} \mathbf{A}^T
\end{bmatrix}
\right)$$

Here I have computed the covariance of $\mathbf{Y}$ using the properties of affine transformations.
However, I am having trouble computing $\mathbf{R_{X,Y}}$ ($= \mathbf{R_{Y,X}}^T$).

Can someone help me, or at least give a hint on how to solve this using the following formula?

$$\mathbf{R_{X,Y}} = \mathbb{E}[(\mathbf{X} - \mathbb{E}[\mathbf{X}])(\mathbf{Y} - \mathbb{E}[\mathbf{Y}])^T]$$

I will need it later to compute the conditional distribution of $\mathbf{X}$ given $\mathbf{Y}$ using well-known formulas (e.g. the conditional distribution of a Gaussian process).

Best Answer

We have
\begin{align*}
\mathbf{R}_{\mathbf{X}, \mathbf{Y}} &= \text{Cov}(\mathbf{X}, \mathbf{Y}) = \text{Cov}(\mathbf{X}, A\mathbf{X} + \mathbf{b} + \mathbf{E}) \\
&= \text{Cov}(\mathbf{X}, A\mathbf{X}) + \text{Cov}(\mathbf{X}, \mathbf{b}) + \text{Cov}(\mathbf{X}, \mathbf{E}) \\
&= \text{Cov}(\mathbf{X}, \mathbf{X})A^\intercal + \mathbf{O} + \text{Cov}(\mathbf{X}, \mathbf{E}) \\
&= \mathbf{\Gamma}A^\intercal + \text{Cov}(\mathbf{X}, \mathbf{E})
\end{align*}
If $\mathbf{X}$ and $\mathbf{E}$ are known to be independent, then $\text{Cov}(\mathbf{X}, \mathbf{E}) = \mathbf{O}$ and we have $\mathbf{R}_{\mathbf{X}, \mathbf{Y}} = \mathbf{\Gamma}A^\intercal$.
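As a sanity check, here is a minimal sketch in plain Python that verifies the result exactly (no sampling), assuming $\mathbf{X}$ and $\mathbf{E}$ are independent. The idea is to stack $\mathbf{Z} = [\mathbf{X}; \mathbf{E}]$, whose covariance is then block diagonal, and write $[\mathbf{X}; \mathbf{Y}] = M\mathbf{Z} + [\mathbf{0}; \mathbf{b}]$ with $M = \begin{bmatrix} I & O \\ A & I \end{bmatrix}$, so the joint covariance is $M\,\text{Cov}(\mathbf{Z})\,M^\intercal$. The numeric values of `Gamma`, `Sigma` and `A` and the helpers `matmul` and `transpose` are made up for illustration and are not part of the question.

```python
# Exact check that Cov(X, Y) = Gamma A^T when X and E are independent.
# All numeric values below are hypothetical, chosen only for illustration.

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]

L, D = 2, 3
Gamma = [[2.0, 0.5], [0.5, 1.0]]                              # Cov(X), L x L
Sigma = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]   # Cov(E), D x D
A = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]                     # D x L

# Stack Z = [X; E]. Independence of X and E makes Cov(Z) block diagonal.
n = L + D
cov_Z = [[0.0] * n for _ in range(n)]
for i in range(L):
    for j in range(L):
        cov_Z[i][j] = Gamma[i][j]
for i in range(D):
    for j in range(D):
        cov_Z[L + i][L + j] = Sigma[i][j]

# [X; Y] = M Z + [0; b] with M = [[I, 0], [A, I]].
# The constant shift b does not affect any covariance.
M = [[0.0] * n for _ in range(n)]
for i in range(n):
    M[i][i] = 1.0
for i in range(D):
    for j in range(L):
        M[L + i][j] = A[i][j]

joint_cov = matmul(matmul(M, cov_Z), transpose(M))

R_XY = [row[L:] for row in joint_cov[:L]]   # upper-right block, L x D
R_YY = [row[L:] for row in joint_cov[L:]]   # lower-right block, D x D

# Closed-form expressions to compare against.
Gamma_At = matmul(Gamma, transpose(A))
A_Gamma_At = matmul(A, Gamma_At)
expected_YY = [[A_Gamma_At[i][j] + Sigma[i][j] for j in range(D)] for i in range(D)]

print("R_XY      =", R_XY)
print("Gamma A^T =", Gamma_At)
```

The upper-right block of `joint_cov` matches `Gamma_At`, and the lower-right block matches `expected_YY`, i.e. $\mathbf{\Sigma} + A\mathbf{\Gamma}A^\intercal$ from the question. If $\mathbf{X}$ and $\mathbf{E}$ were correlated, `cov_Z` would have nonzero off-diagonal blocks and both results would pick up extra terms.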
