[Math] Intuition for why the eigenvector of the covariance matrix points in the direction of maximum variance

covariance, eigenvalues-eigenvectors, linear algebra

In the context of principal component analysis, I am using the eigenvectors of the covariance matrix to project the data.
I am able to prove that the eigenvector of the covariance matrix with the largest eigenvalue is the direction of greatest variance in the original data. However, I am wondering: is there an intuitive explanation for this fact?

Best Answer

Yes, the way I think about it is this:

Nature has no metric system of its own, so when you measure something, you do it through a superimposed metric that does not, in principle, have any meaning.

However, one could measure things in a "more natural" way by taking the distance from the mean divided by the standard deviation. Let me explain with an example.

Suppose you see a man who is 2.10 meters tall. We would all say that he is a very tall man, not because of the digits "2.10" but because (unconsciously) we know that the average height of a human being is (I'm making this up) 1.80 m and the standard deviation is 8 cm, so this individual is 3.75 standard deviations from the mean.

Now suppose you go to Mars and see an individual who is 6 meters tall, and a scientist tells you that the average height of Martians is 5.30 meters. Would you conclude that this individual is "exceptionally tall"? The answer is: it depends on the variability (i.e. the standard deviation)!

So one natural way to measure things is the so-called Mahalanobis distance.

Let $\Sigma$ be a positive definite matrix (in our case, the covariance matrix). Then

$$d(x,\mu)=\sqrt{(x-\mu)^T\Sigma^{-1}(x-\mu)}$$
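
As a quick check, in one dimension $\Sigma=\sigma^2$, so the quadratic form is just the squared number of standard deviations from the mean. With the made-up height figures from the example above:

$$(x-\mu)^T\Sigma^{-1}(x-\mu)=\frac{(x-\mu)^2}{\sigma^2}=\left(\frac{2.10-1.80}{0.08}\right)^2=3.75^2$$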

This means that the level sets (drawn in the Euclidean representation) of the distance of points $X_i$ from their mean $\mu$ are ellipsoids whose axes point along the eigenvectors of $\Sigma$, and the length of each axis is proportional to the square root of the eigenvalue associated with that eigenvector.
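
One way to see this, sketched under the assumption that $\Sigma$ is symmetric positive definite with orthonormal eigenvectors $v_i$ and eigenvalues $\lambda_i>0$ (symbols introduced here for illustration): write $y_i=v_i^T(x-\mu)$ for the coordinates of $x-\mu$ along the eigenvectors. Then

$$\Sigma=\sum_i \lambda_i v_i v_i^T,\qquad \Sigma^{-1}=\sum_i \frac{1}{\lambda_i} v_i v_i^T,\qquad (x-\mu)^T\Sigma^{-1}(x-\mu)=\sum_i \frac{y_i^2}{\lambda_i}$$

Setting the right-hand side equal to a constant $c$ gives an ellipsoid with semi-axes of length $\sqrt{c\,\lambda_i}$ along $v_i$, so the axis grows with the eigenvalue.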

So a larger eigenvalue corresponds to a longer axis (in Euclidean distance!), which means more variability in that direction.
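
The link to variance can be made explicit. Assuming $X$ is a random vector with covariance $\Sigma$, $u$ is any unit vector, and $v_i$ is a unit eigenvector of $\Sigma$ with eigenvalue $\lambda_i$ (notation introduced here for illustration), the variance of the data projected onto a direction is

$$\operatorname{Var}(u^TX)=u^T\Sigma u,\qquad \operatorname{Var}(v_i^TX)=v_i^T\Sigma v_i=\lambda_i$$

So the variance along each axis of the ellipsoid equals the corresponding eigenvalue, and among all unit vectors $u$ the quantity $u^T\Sigma u$ is largest for the eigenvector with the largest eigenvalue.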

Here is an example for n=2
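
For the $n=2$ case, the claim can also be checked numerically. The following is a minimal sketch, not a definitive implementation: the data-generating coefficients, the seed, and the helper names (`sample_covariance`, `eigen_symmetric_2x2`, `variance_along`) are all assumptions made for illustration. It uses the closed-form eigendecomposition of a symmetric 2x2 matrix and compares the variance of the projected data with the eigenvalues.

```python
import math
import random


def sample_covariance(points):
    """Return (sxx, sxy, syy), the entries of the 2x2 sample covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def eigen_symmetric_2x2(sxx, sxy, syy):
    """Eigenvalues (largest first) and unit eigenvectors of [[sxx, sxy], [sxy, syy]]."""
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    # Angle of the eigenvector that belongs to the larger eigenvalue.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    top = (math.cos(theta), math.sin(theta))
    second = (-math.sin(theta), math.cos(theta))
    return (half_trace + radius, half_trace - radius), (top, second)


def variance_along(points, direction):
    """Sample variance of the points projected onto a unit direction."""
    ux, uy = direction
    projected = [x * ux + y * uy for x, y in points]
    mean = sum(projected) / len(projected)
    return sum((p - mean) ** 2 for p in projected) / (len(projected) - 1)


# Hypothetical correlated 2D data (coefficients chosen only for illustration).
rng = random.Random(0)
points = []
for _ in range(2000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    points.append((2.0 * z1, 0.8 * z1 + 0.6 * z2))

sxx, sxy, syy = sample_covariance(points)
(lam1, lam2), (v1, v2) = eigen_symmetric_2x2(sxx, sxy, syy)

# Variance of the data projected onto each eigenvector.
var_v1 = variance_along(points, v1)
var_v2 = variance_along(points, v2)

# Variance along unit directions at 0, 1, ..., 179 degrees.
scan = [
    variance_along(points, (math.cos(math.radians(a)), math.sin(math.radians(a))))
    for a in range(180)
]

print("eigenvalues:", lam1, lam2)
print("variance along eigenvectors:", var_v1, var_v2)
print("largest variance over scanned directions:", max(scan))
print("relative semi-axis lengths:", math.sqrt(lam1), math.sqrt(lam2))
```

Running it should show the variance along each eigenvector matching its eigenvalue, with no scanned direction exceeding the largest one. The square roots of the eigenvalues give the relative lengths of the ellipse's semi-axes.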
