PCA – Understanding Principal Components and Major Axis of the Ellipse Corresponding to the Covariance Matrix

covariance-matrix, eigenvalues, pca

Based on my understanding, in PCA we try to find a linear combination of the variables such that the variance in that direction is maximized. If the variables have covariance matrix $\Sigma$, then the first component is in the direction of the eigenvector corresponding to the largest eigenvalue. So the eigenvalue is the variance along that direction, and a larger eigenvalue means that more of the information in the data is captured.
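To make this concrete, here is a minimal numerical sketch of that claim. The matrix `Sigma` and the helper `projected_variance` are hypothetical and chosen only for illustration; the check relies on the fact that the variance of the data projected onto a unit vector $\mathbf w$ is $\mathbf w^T\Sigma\mathbf w$.

```python
import numpy as np

# Hypothetical 2x2 covariance matrix, chosen only for illustration.
Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])

# For symmetric matrices, eigh returns eigenvalues in ascending order,
# with the matching unit eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(Sigma)
lam_max = eigvals[-1]      # largest eigenvalue
v_max = eigvecs[:, -1]     # its eigenvector = first principal direction


def projected_variance(w, cov):
    """Variance of the data projected onto the direction w (normalized here)."""
    w = w / np.linalg.norm(w)
    return float(w @ cov @ w)


# The variance along the first principal direction equals the largest eigenvalue.
var_pc1 = projected_variance(v_max, Sigma)

# Scan unit directions over a half circle: none should exceed lam_max.
angles = np.linspace(0.0, np.pi, 181)
best_over_grid = max(
    projected_variance(np.array([np.cos(a), np.sin(a)]), Sigma) for a in angles
)

print(var_pc1, lam_max, best_over_grid)
```

The grid scan is only a sanity check in two dimensions, not a proof; the general statement follows from the Rayleigh quotient.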

On the other hand, if we just consider the covariance matrix $\Sigma$ and form the ellipse, we have $\mathbf{x}^T\Sigma\mathbf{x}=1$ (the set of points at distance 1 from the origin in the metric defined by $\Sigma$). In this situation, the length of the major semi-axis, along which I would expect the most variability, is $\frac{1}{\sqrt{\lambda_1}}$, where $\lambda_1$ is the smallest eigenvalue $(\lambda_1 < \lambda_2<\cdots <\lambda_n)$. I am confused: why, in this case, does the smallest eigenvalue correspond to the largest variation and the most important direction?

I would be grateful if you could correct me if I have made a mistake.

Best Answer

You are right: the ellipse given by $\mathbf x^T\Sigma\mathbf x = 1$ has its longest axis along the eigenvector of the smallest eigenvalue. So it is, in a sense, the "opposite" of the contour curves of e.g. the normal density. But note that the (zero-mean) normal density uses the inverse of $\Sigma$: $$ p(\mathbf x) = \frac{1}{\sqrt{\det(2\pi\Sigma)}}\exp\left(-\frac{1}{2}\mathbf x^T\Sigma^{-1}\mathbf x\right), $$ so the points of equal density are those satisfying $\mathbf x^T\Sigma^{-1}\mathbf x = c$ for a constant $c$. Since $\Sigma^{-1}$ has the same eigenvectors as $\Sigma$ with eigenvalues $1/\lambda_i$, the semi-axes of this ellipse have lengths $\sqrt{c\,\lambda_i}$, i.e. the direction of the largest eigenvalue of $\Sigma$ is the longest axis of the ellipse.
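A small numerical sketch of the two ellipses may help. The matrix `Sigma` is a hypothetical example, and the variable names are mine. Along a unit eigenvector $\mathbf v_i$, the point $t\,\mathbf v_i$ lies on $\mathbf x^T\Sigma\mathbf x = 1$ when $t = 1/\sqrt{\lambda_i}$, and on $\mathbf x^T\Sigma^{-1}\mathbf x = 1$ when $t = \sqrt{\lambda_i}$.

```python
import numpy as np

# Hypothetical 2x2 covariance matrix, chosen only for illustration.
Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
Sigma_inv = np.linalg.inv(Sigma)

# eigh returns eigenvalues in ascending order: index 0 is the smallest.
eigvals, eigvecs = np.linalg.eigh(Sigma)

# Semi-axis lengths of the two ellipses along each eigenvector.
semi_axes_sigma = 1.0 / np.sqrt(eigvals)   # x^T Sigma x = 1
semi_axes_inverse = np.sqrt(eigvals)       # x^T Sigma^{-1} x = 1

# Check that the scaled eigenvectors really lie on the respective ellipses.
on_sigma_ellipse = []
on_inverse_ellipse = []
for i in range(len(eigvals)):
    v = eigvecs[:, i]
    x1 = semi_axes_sigma[i] * v
    x2 = semi_axes_inverse[i] * v
    on_sigma_ellipse.append(np.isclose(x1 @ Sigma @ x1, 1.0))
    on_inverse_ellipse.append(np.isclose(x2 @ Sigma_inv @ x2, 1.0))

# Which eigen-direction carries the longest axis of each ellipse?
longest_sigma = int(np.argmax(semi_axes_sigma))      # smallest eigenvalue
longest_inverse = int(np.argmax(semi_axes_inverse))  # largest eigenvalue

print(eigvals, semi_axes_sigma, semi_axes_inverse)
```

Here `longest_sigma` picks out the smallest eigenvalue and `longest_inverse` the largest, which is the reversal described above: only the ellipse built from $\Sigma^{-1}$ is elongated along the first principal direction.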