Solved – Why does inversion of a covariance matrix yield partial correlations between random variables

covariance, covariance-matrix, linear algebra, matrix inverse, partial-correlation

I heard that partial correlations between random variables can be found by inverting the covariance matrix and taking the appropriate entries of the resulting precision matrix (this fact is mentioned in http://en.wikipedia.org/wiki/Partial_correlation, but without a proof).

Why is this the case?

Best Answer

When a multivariate random variable $(X_1,X_2,\ldots,X_n)$ has a nondegenerate covariance matrix $\mathbb{C} = (\gamma_{ij}) = (\text{Cov}(X_i,X_j))$, the set of all real linear combinations of the $X_i$ forms an $n$-dimensional real vector space with basis $E=(X_1,X_2,\ldots, X_n)$ and a non-degenerate inner product given by

$$\langle X_i,X_j \rangle = \gamma_{ij}\ .$$
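
For concreteness, here is a minimal numerical sketch (the $3\times 3$ covariance matrix below is made up for illustration): by bilinearity, the inner product of two linear combinations $u^\top X$ and $v^\top X$ is $u^\top \mathbb{C}\, v$, which is just their covariance.

```python
import numpy as np

# Hypothetical covariance matrix (any symmetric positive-definite matrix will do).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])

def inner(u, v):
    """Inner product <u'X, v'X> = u' C v induced by the covariance matrix."""
    return u @ C @ v

# On the basis vectors this reproduces the covariances: <X_i, X_j> = gamma_ij.
e = np.eye(3)
assert np.isclose(inner(e[0], e[1]), C[0, 1])

# It also agrees (up to sampling error) with the covariance of linear combinations.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=np.zeros(3), cov=C, size=200_000)
u, v = np.array([1.0, -2.0, 0.5]), np.array([0.0, 1.0, 1.0])
print(inner(u, v), np.cov(X @ u, X @ v)[0, 1])   # approximately equal
```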

Its dual basis with respect to this inner product, $E^{*} = (X_1^{*},X_2^{*}, \ldots, X_n^{*})$, is uniquely defined by the relationships

$$\langle X_i^{*}, X_j \rangle = \delta_{ij}\ ,$$

the Kronecker delta (equal to $1$ when $i=j$ and $0$ otherwise).
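
The defining relations above are simply a linear system for the coefficients of each $X_i^{*}$ in the basis $E$, so the dual basis can be computed directly. A short sketch (reusing the made-up covariance matrix from the previous snippet) that solves this system and checks the duality conditions:

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])

# Writing X_i* = sum_j B[i, j] X_j, the conditions <X_i*, X_k> = delta_ik
# read, in matrix form, B @ C = I. Solve that system for B.
B = np.linalg.solve(C.T, np.eye(len(C))).T

# Check the duality relations: sum_j B[i, j] * C[j, k] = delta_ik.
assert np.allclose(B @ C, np.eye(len(C)))

print(B[0])   # coordinates of the first dual vector in terms of X_1, X_2, X_3
```

As the rest of the answer derives, solving this system amounts to nothing more than inverting $\mathbb{C}$.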

The dual basis is of interest here because the partial correlation of $X_i$ and $X_j$ is obtained (up to a sign convention discussed after the final formula) as the correlation between the part of $X_i$ that is left after projecting it onto the space spanned by all the other vectors (let's simply call it its "residual", $X_{i\circ}$) and the comparable part of $X_j$, its residual $X_{j\circ}$. Now $X_i^{*}$ is orthogonal to every $X_k$ with $k \ne i$ and has positive inner product with $X_i$. The residual $X_{i\circ}$ has the same two properties: it is orthogonal to the other $X_k$ by construction, and its inner product with $X_i$ equals its own squared norm, which is positive because $X_i$ does not lie in the span of the others. Since the orthogonal complement of the span of those $n-1$ vectors is one-dimensional, $X_{i\circ}$ must be a positive multiple of $X_i^{*}$, and likewise for $X_j$. Let us therefore write

$$X_{i\circ} = \lambda_i X_i^{*},\ X_{j\circ} = \lambda_j X_j^{*}$$

for positive real numbers $\lambda_i$ and $\lambda_j$.
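
To make the residuals concrete: in coordinates, $X_{i\circ}$ is $X_i$ minus its orthogonal projection (with respect to this inner product) onto the span of the other basis vectors, and the projection coefficients solve normal equations built from the corresponding block of $\mathbb{C}$. A small sketch, with the same made-up covariance matrix, checking that the residual of $X_1$ is indeed a positive multiple of $X_1^{*}$:

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
inner = lambda u, v: u @ C @ v           # inner product in the basis E

# Residual of X_1 (index 0) after projecting onto span{X_2, X_3}:
# the projection coefficients solve the normal equations, whose matrix is the
# Gram matrix of the conditioning variables, i.e. the corresponding block of C.
others = [1, 2]
a = np.linalg.solve(C[np.ix_(others, others)], C[others, 0])
resid = np.eye(3)[0].copy()
resid[others] -= a                       # coordinates of the residual in the basis E

# Dual vector X_1^*: the first row of B = C^{-1}.
dual = np.linalg.inv(C)[0]

# The residual is a positive multiple of the dual vector.
lam = resid[0] / dual[0]
assert lam > 0 and np.allclose(resid, lam * dual)
print(lam)   # the positive factor lambda_1
```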

The partial correlation is the normalized inner product of the residuals, which is unchanged when either residual is rescaled by a positive factor:

$$\rho_{ij\circ} = \frac{\langle X_{i\circ}, X_{j\circ} \rangle}{\sqrt{\langle X_{i\circ}, X_{i\circ} \rangle\langle X_{j\circ}, X_{j\circ} \rangle}} = \frac{\lambda_i\lambda_j\langle X_{i}^{*}, X_{j}^{*} \rangle}{\sqrt{\lambda_i^2\langle X_{i}^{*}, X_{i}^{*} \rangle\lambda_j^2\langle X_{j}^{*}, X_{j}^{*} \rangle}} = \frac{\langle X_{i}^{*}, X_{j}^{*} \rangle}{\sqrt{\langle X_{i}^{*}, X_{i}^{*} \rangle\langle X_{j}^{*}, X_{j}^{*} \rangle}}\ .$$

(In either form, the partial correlation is zero exactly when the residuals, or equivalently the dual vectors $X_i^{*}$ and $X_j^{*}$, are orthogonal.)
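
The cancellation of the $\lambda$'s is easy to verify numerically as well; a tiny sketch (arbitrary made-up vectors and scale factors) showing that the normalized inner product is unchanged by positive rescaling:

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
inner = lambda u, v: u @ C @ v

def normalized(u, v):
    """Normalized inner product (a 'correlation') with respect to C."""
    return inner(u, v) / np.sqrt(inner(u, u) * inner(v, v))

u = np.array([1.0, -0.5, 0.2])
v = np.array([0.3, 1.0, -1.0])
lam_u, lam_v = 2.7, 0.4                  # arbitrary positive scale factors
assert np.isclose(normalized(u, v), normalized(lam_u * u, lam_v * v))
```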

We need to find the inner products of dual basis elements. To this end, expand the dual basis elements in terms of the original basis $E$:

$$X_i^{*} = \sum_{j=1}^n \beta_{ij} X_j\ .$$

Then by definition

$$\delta_{ik} = \langle X_i^{*}, X_k \rangle = \sum_{j=1}^n \beta_{ij}\langle X_j, X_k \rangle = \sum_{j=1}^n \beta_{ij}\gamma_{jk}\ .$$

In matrix notation with $\mathbb{I} = (\delta_{ij})$ the identity matrix and $\mathbb{B} = (\beta_{ij})$ the change-of-basis matrix, this states

$$\mathbb{I} = \mathbb{BC}\ .$$

That is, $\mathbb{B} = \mathbb{C}^{-1}$: the coefficients expressing the dual basis in terms of $E$ are precisely the entries of the precision matrix, which is the fact the Wikipedia article exploits. The previous formula for the partial correlation then gives

$$\rho_{ij\circ} = \frac{\beta_{ij}}{\sqrt{\beta_{ii} \beta_{jj}}} = \frac{\mathbb{C}^{-1}_{ij}}{\sqrt{\mathbb{C}^{-1}_{ii} \mathbb{C}^{-1}_{jj}}}\ .$$

A remark on signs: the conventional partial correlation $\rho_{ij\cdot}$ residualizes $X_i$ and $X_j$ on the remaining $n-2$ variables only, whereas $\rho_{ij\circ}$ residualizes each of them on all of the other variables. For $i \ne j$ the two quantities differ only in sign, which is why the Wikipedia article writes the formula with a leading minus sign, $-\mathbb{C}^{-1}_{ij}\big/\sqrt{\mathbb{C}^{-1}_{ii}\,\mathbb{C}^{-1}_{jj}}$. Either way, the partial correlations are obtained from the entries of the inverse covariance matrix, normalized by its diagonal.
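
Finally, a numerical sanity check of the whole argument (again a sketch with the same made-up covariance matrix, not code from the original answer): the correlation of the residuals used above reproduces $\mathbb{C}^{-1}_{ij}\big/\sqrt{\mathbb{C}^{-1}_{ii}\mathbb{C}^{-1}_{jj}}$, while the conventional partial correlation, obtained by residualizing on the other $n-2$ variables only, reproduces the same expression with the leading minus sign.

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
P = np.linalg.inv(C)                      # precision matrix, B = C^{-1}
n = len(C)

# Off-diagonal entries of P normalized by its diagonal.
D = np.sqrt(np.outer(np.diag(P), np.diag(P)))
rho_from_precision = P / D                # C^{-1}_ij / sqrt(C^{-1}_ii C^{-1}_jj)

def residual(i, conditioning, cov):
    """Coordinates (in the basis E) of X_i minus its projection onto span{X_k : k in conditioning}."""
    r = np.zeros(len(cov))
    r[i] = 1.0
    S = cov[np.ix_(conditioning, conditioning)]
    r[conditioning] -= np.linalg.solve(S, cov[conditioning, i])
    return r

def corr(u, v, cov):
    return (u @ cov @ v) / np.sqrt((u @ cov @ u) * (v @ cov @ v))

i, j = 0, 1
rest = [k for k in range(n) if k not in (i, j)]

# Residualizing each variable on *all* the others gives the formula derived above ...
full_i = residual(i, [k for k in range(n) if k != i], C)
full_j = residual(j, [k for k in range(n) if k != j], C)
print(corr(full_i, full_j, C), rho_from_precision[i, j])      # numerically equal

# ... while the conventional partial correlation (conditioning only on the other
# n-2 variables) is its negative, matching Wikipedia's -P_ij / sqrt(P_ii P_jj).
part_i = residual(i, rest, C)
part_j = residual(j, rest, C)
print(corr(part_i, part_j, C), -rho_from_precision[i, j])     # numerically equal
```

In practice one does not form the residuals at all: inverting $\mathbb{C}$, normalizing by the diagonal, and flipping the sign of the off-diagonal entries yields every pairwise partial correlation at once.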
