Solved – Why does inversion of a covariance matrix yield partial correlations between random variables

covariance, covariance-matrix, linear algebra, matrix inverse, partial-correlation

I heard that partial correlations between random variables can be found by inverting the covariance matrix and taking the appropriate entries of the resulting precision matrix (this fact is mentioned in http://en.wikipedia.org/wiki/Partial_correlation, but without a proof).

Why is this the case?

Best Answer

When a multivariate random variable $(X_1,X_2,\ldots,X_n)$ has a nondegenerate covariance matrix $\mathbb{C} = (\gamma_{ij}) = (\text{Cov}(X_i,X_j))$, the set of all real linear combinations of the $X_i$ forms an $n$-dimensional real vector space with basis $E=(X_1,X_2,\ldots, X_n)$ and a non-degenerate inner product given by

$$\langle X_i,X_j \rangle = \gamma_{ij}\ .$$
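
For concreteness, here is a minimal numerical sketch (the $3\times 3$ covariance matrix below is made up for illustration): by bilinearity, the inner product of two linear combinations $u^\top X$ and $v^\top X$ is $u^\top \mathbb{C}\, v$, which is just their covariance.

```python
import numpy as np

# Hypothetical covariance matrix (any symmetric positive-definite matrix will do).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])

def inner(u, v):
    """Inner product <u'X, v'X> = u' C v induced by the covariance matrix."""
    return u @ C @ v

# On the basis vectors this reproduces the covariances: <X_i, X_j> = gamma_ij.
e = np.eye(3)
assert np.isclose(inner(e[0], e[1]), C[0, 1])

# It also agrees (up to sampling error) with the covariance of linear combinations.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=np.zeros(3), cov=C, size=200_000)
u, v = np.array([1.0, -2.0, 0.5]), np.array([0.0, 1.0, 1.0])
print(inner(u, v), np.cov(X @ u, X @ v)[0, 1])   # approximately equal
```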

Its dual basis with respect to this inner product, $E^{*} = (X_1^{*},X_2^{*}, \ldots, X_n^{*})$, is uniquely defined by the relationships

$$\langle X_i^{*}, X_j \rangle = \delta_{ij}\ ,$$

the Kronecker delta (equal to $1$ when $i=j$ and $0$ otherwise).
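
The defining relations above are simply a linear system for the coefficients of each $X_i^{*}$ in the basis $E$, so the dual basis can be computed directly. A short sketch (reusing the made-up covariance matrix from the previous snippet) that solves this system and checks the duality conditions:

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])

# Writing X_i* = sum_j B[i, j] X_j, the conditions <X_i*, X_k> = delta_ik
# read, in matrix form, B @ C = I. Solve that system for B.
B = np.linalg.solve(C.T, np.eye(len(C))).T

# Check the duality relations: sum_j B[i, j] * C[j, k] = delta_ik.
assert np.allclose(B @ C, np.eye(len(C)))

print(B[0])   # coordinates of the first dual vector in terms of X_1, X_2, X_3
```

As the rest of the answer derives, solving this system amounts to nothing more than inverting $\mathbb{C}$.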

The dual basis is of interest here because the partial correlation of $X_i$ and $X_j$ is obtained (up to a sign convention discussed after the final formula) as the correlation between the part of $X_i$ that is left after projecting it onto the space spanned by all the other vectors (let's simply call it its "residual", $X_{i\circ}$) and the comparable part of $X_j$, its residual $X_{j\circ}$. Now $X_i^{*}$ is orthogonal to every $X_k$ with $k \ne i$ and has positive inner product with $X_i$. The residual $X_{i\circ}$ has the same two properties: it is orthogonal to the other $X_k$ by construction, and its inner product with $X_i$ equals its own squared norm, which is positive because $X_i$ does not lie in the span of the others. Since the orthogonal complement of the span of those $n-1$ vectors is one-dimensional, $X_{i\circ}$ must be a positive multiple of $X_i^{*}$, and likewise for $X_j$. Let us therefore write

$$X_{i\circ} = \lambda_i X_i^{*},\ X_{j\circ} = \lambda_j X_j^{*}$$

for positive real numbers $\lambda_i$ and $\lambda_j$.
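
To make the residuals concrete: in coordinates, $X_{i\circ}$ is $X_i$ minus its orthogonal projection (with respect to this inner product) onto the span of the other basis vectors, and the projection coefficients solve normal equations built from the corresponding block of $\mathbb{C}$. A small sketch, with the same made-up covariance matrix, checking that the residual of $X_1$ is indeed a positive multiple of $X_1^{*}$:

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
inner = lambda u, v: u @ C @ v           # inner product in the basis E

# Residual of X_1 (index 0) after projecting onto span{X_2, X_3}:
# the projection coefficients solve the normal equations, whose matrix is the
# Gram matrix of the conditioning variables, i.e. the corresponding block of C.
others = [1, 2]
a = np.linalg.solve(C[np.ix_(others, others)], C[others, 0])
resid = np.eye(3)[0].copy()
resid[others] -= a                       # coordinates of the residual in the basis E

# Dual vector X_1^*: the first row of B = C^{-1}.
dual = np.linalg.inv(C)[0]

# The residual is a positive multiple of the dual vector.
lam = resid[0] / dual[0]
assert lam > 0 and np.allclose(resid, lam * dual)
print(lam)   # the positive factor lambda_1
```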

The partial correlation is the normalized inner product of the residuals, which is unchanged when either residual is rescaled by a positive factor:

$$\rho_{ij\circ} = \frac{\langle X_{i\circ}, X_{j\circ} \rangle}{\sqrt{\langle X_{i\circ}, X_{i\circ} \rangle\langle X_{j\circ}, X_{j\circ} \rangle}} = \frac{\lambda_i\lambda_j\langle X_{i}^{*}, X_{j}^{*} \rangle}{\sqrt{\lambda_i^2\langle X_{i}^{*}, X_{i}^{*} \rangle\lambda_j^2\langle X_{j}^{*}, X_{j}^{*} \rangle}} = \frac{\langle X_{i}^{*}, X_{j}^{*} \rangle}{\sqrt{\langle X_{i}^{*}, X_{i}^{*} \rangle\langle X_{j}^{*}, X_{j}^{*} \rangle}}\ .$$

(In either form, the partial correlation is zero exactly when the residuals, or equivalently the dual vectors $X_i^{*}$ and $X_j^{*}$, are orthogonal.)
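
The cancellation of the $\lambda$'s is easy to verify numerically as well; a tiny sketch (arbitrary made-up vectors and scale factors) showing that the normalized inner product is unchanged by positive rescaling:

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
inner = lambda u, v: u @ C @ v

def normalized(u, v):
    """Normalized inner product (a 'correlation') with respect to C."""
    return inner(u, v) / np.sqrt(inner(u, u) * inner(v, v))

u = np.array([1.0, -0.5, 0.2])
v = np.array([0.3, 1.0, -1.0])
lam_u, lam_v = 2.7, 0.4                  # arbitrary positive scale factors
assert np.isclose(normalized(u, v), normalized(lam_u * u, lam_v * v))
```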

We need to find the inner products of dual basis elements. To this end, expand the dual basis elements in terms of the original basis $E$:

$$X_i^{*} = \sum_{j=1}^n \beta_{ij} X_j\ .$$

Then by definition

$$\delta_{ik} = \langle X_i^{*}, X_k \rangle = \sum_{j=1}^n \beta_{ij}\langle X_j, X_k \rangle = \sum_{j=1}^n \beta_{ij}\gamma_{jk}\ .$$

In matrix notation with $\mathbb{I} = (\delta_{ij})$ the identity matrix and $\mathbb{B} = (\beta_{ij})$ the change-of-basis matrix, this states

$$\mathbb{I} = \mathbb{BC}\ .$$

That is, $\mathbb{B} = \mathbb{C}^{-1}$: the coefficients expressing the dual basis in terms of $E$ are precisely the entries of the precision matrix, which is the fact the Wikipedia article exploits. The previous formula for the partial correlation then gives

$$\rho_{ij\circ} = \frac{\beta_{ij}}{\sqrt{\beta_{ii} \beta_{jj}}} = \frac{\mathbb{C}^{-1}_{ij}}{\sqrt{\mathbb{C}^{-1}_{ii} \mathbb{C}^{-1}_{jj}}}\ .$$

A remark on signs: the conventional partial correlation $\rho_{ij\cdot}$ residualizes $X_i$ and $X_j$ on the remaining $n-2$ variables only, whereas $\rho_{ij\circ}$ residualizes each of them on all of the other variables. For $i \ne j$ the two quantities differ only in sign, which is why the Wikipedia article writes the formula with a leading minus sign, $-\mathbb{C}^{-1}_{ij}\big/\sqrt{\mathbb{C}^{-1}_{ii}\,\mathbb{C}^{-1}_{jj}}$. Either way, the partial correlations are obtained from the entries of the inverse covariance matrix, normalized by its diagonal.
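
Finally, a numerical sanity check of the whole argument (again a sketch with the same made-up covariance matrix, not code from the original answer): the correlation of the residuals used above reproduces $\mathbb{C}^{-1}_{ij}\big/\sqrt{\mathbb{C}^{-1}_{ii}\mathbb{C}^{-1}_{jj}}$, while the conventional partial correlation, obtained by residualizing on the other $n-2$ variables only, reproduces the same expression with the leading minus sign.

```python
import numpy as np

C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.5],
              [1.0, 1.5, 2.0]])
P = np.linalg.inv(C)                      # precision matrix, B = C^{-1}
n = len(C)

# Off-diagonal entries of P normalized by its diagonal.
D = np.sqrt(np.outer(np.diag(P), np.diag(P)))
rho_from_precision = P / D                # C^{-1}_ij / sqrt(C^{-1}_ii C^{-1}_jj)

def residual(i, conditioning, cov):
    """Coordinates (in the basis E) of X_i minus its projection onto span{X_k : k in conditioning}."""
    r = np.zeros(len(cov))
    r[i] = 1.0
    S = cov[np.ix_(conditioning, conditioning)]
    r[conditioning] -= np.linalg.solve(S, cov[conditioning, i])
    return r

def corr(u, v, cov):
    return (u @ cov @ v) / np.sqrt((u @ cov @ u) * (v @ cov @ v))

i, j = 0, 1
rest = [k for k in range(n) if k not in (i, j)]

# Residualizing each variable on *all* the others gives the formula derived above ...
full_i = residual(i, [k for k in range(n) if k != i], C)
full_j = residual(j, [k for k in range(n) if k != j], C)
print(corr(full_i, full_j, C), rho_from_precision[i, j])      # numerically equal

# ... while the conventional partial correlation (conditioning only on the other
# n-2 variables) is its negative, matching Wikipedia's -P_ij / sqrt(P_ii P_jj).
part_i = residual(i, rest, C)
part_j = residual(j, rest, C)
print(corr(part_i, part_j, C), -rho_from_precision[i, j])     # numerically equal
```

In practice one does not form the residuals at all: inverting $\mathbb{C}$, normalizing by the diagonal, and flipping the sign of the off-diagonal entries yields every pairwise partial correlation at once.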
