Solved – Decomposition of inverse covariance matrix

covariance-matrix, linear-algebra, matrix-decomposition

Let $\Sigma$ be a covariance matrix and let

$$x^T \Sigma^{-1} x = \|Ax\|_2^2.$$ What is the interpretation of matrix $A$?

I tried solving for $A$ with an eigenvalue decomposition of $\Sigma$ as follows:

$$x^T \Sigma^{-1} x = x^T (Q \Lambda Q^T)^{-1}x =
x^T Q\Lambda^{-1}Q^T x = x^TA^TAx$$

(using that $Q^{-1} = Q^T$ for orthogonal $Q$), and this is where I get stuck: I am unable to find a closed-form solution for $A$.

Best Answer

$x^T \Sigma^{-1} x$ is a quadratic form, closely related to the notion of Mahalanobis distance. It is a generalization of the standard Euclidean distance to the case where not all directions are treated equally. This is useful when the variation along one direction is much bigger than along another, so that a small distance along the latter is quite significant.

One can decompose $\Sigma^{-1}$ as $Q^T \Lambda Q = Q^T \Lambda^{1/2} \Lambda^{1/2} Q = (\Lambda^{1/2} Q)^T (\Lambda^{1/2} Q)$ (where $\Lambda$ is the diagonal matrix of the inverted eigenvalues of $\Sigma$, and $Q$ is orthogonal), so when applied to a vector,

$$ x^T \Sigma^{-1} x = x^T (\Lambda^{1/2} Q)^T (\Lambda^{1/2} Q) x = (\Lambda^{1/2} Q x)^T (\Lambda^{1/2} Q x) = \|\Lambda^{1/2} Q x\|^2 $$
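This identity is easy to check numerically. A minimal sketch in NumPy (the covariance matrix `Sigma` below is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric positive-definite covariance matrix Sigma.
B = rng.standard_normal((3, 3))
Sigma = B @ B.T + 3 * np.eye(3)

# Eigendecomposition of Sigma: columns of V are eigenvectors, w the eigenvalues.
w, V = np.linalg.eigh(Sigma)

# With Q = V.T and Lambda = diag(1/w), Sigma^{-1} = Q^T Lambda Q,
# so A = Lambda^{1/2} Q satisfies x^T Sigma^{-1} x = ||A x||^2.
Q = V.T
A = np.diag(1.0 / np.sqrt(w)) @ Q

x = rng.standard_normal(3)
lhs = x @ np.linalg.inv(Sigma) @ x
rhs = np.linalg.norm(A @ x) ** 2
print(np.isclose(lhs, rhs))
```

So one explicit closed form is $A = \Lambda^{1/2} Q$; note it is not unique, since $UA$ works for any orthogonal $U$.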

What this formula says is that the quadratic form is equivalent to the standard Euclidean norm in a new, transformed space: we first map the vector $x$ through $\Lambda^{1/2} Q$, and then take its ordinary norm.

What does this transformation do to our $x$? It consists of two "sub-transformations": the first is a multiplication by $Q$, which rotates the vector (this is what orthogonal matrices do), preserving its norm; the second is a multiplication by $\Lambda^{1/2}$, which stretches the vector along each axis in proportion to the square root of the corresponding diagonal element of $\Lambda$.

[Figure: a 2D point cloud before and after the transformation $\Lambda^{1/2} Q$]

The picture above shows what happens to your 2D data points when you apply this transformation (for some particular $\Sigma$). You can think of the Mahalanobis distance as a distance that implicitly maps your vectors into a new space, measures the Euclidean norm there, and reports it back to you.
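The "new space" intuition can also be seen in code: applying $A = \Lambda^{1/2} Q$ to points drawn from $\mathcal{N}(0, \Sigma)$ whitens them, so their covariance becomes (approximately) the identity and plain Euclidean distance there equals Mahalanobis distance in the original space. A sketch, with an arbitrary 2D $\Sigma$ chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])

# A = Lambda^{1/2} Q built from the eigendecomposition of Sigma.
w, V = np.linalg.eigh(Sigma)
A = np.diag(1.0 / np.sqrt(w)) @ V.T

# Sample correlated 2D points and map them through A: in the new space
# the cloud becomes (approximately) isotropic.
X = rng.multivariate_normal(np.zeros(2), Sigma, size=100_000)
Y = X @ A.T
print(np.round(np.cov(Y.T), 2))  # close to the 2x2 identity matrix
```

Since $A \Sigma A^T = \Lambda^{-1/2} Q \Sigma Q^T \Lambda^{-1/2} = I$, the sample covariance of the transformed points converges to the identity.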
