PCA – How to Show That Linear Combination Has Maximum Variance


Let $x_1, \dots, x_n$ be random variables with covariance matrix $\Sigma$.

  1. $a^T x := \sum_{i=1}^n a_i x_i$

  2. $\sum_{i=1}^n a_i^2 = 1$

How do I show that the linear combination in (1), subject to the constraint in (2), has maximal variance when $a$ is an eigenvector of $\Sigma$ with maximal eigenvalue?

I have been reading all day trying to understand the steps, and I'm sure there is something I'm missing.

Best Answer

There is a good, long explanation of PCA at Making sense of principal component analysis, eigenvectors & eigenvalues, so here I will only show why the eigenvector with the largest eigenvalue maximizes the variance of the linear combination.

Let $\Sigma$ be the variance-covariance matrix of the random variables $X_1, X_2, \dotsc, X_n$, which we take to be the components of the random vector $X$. That is, $$ \DeclareMathOperator{\V}{\mathbb{V}} \DeclareMathOperator{\C}{\mathbb{C}} \C(X_i, X_j) = \Sigma_{ij}. $$

Since $\Sigma$ is a symmetric matrix, we can use the spectral decomposition $$ \Sigma = U \Lambda U^T = \sum_{i=1}^n \lambda_i u_i u_i^T, $$ where $U$ is orthogonal with columns $u_1, \dotsc, u_n$ and $\Lambda$ is diagonal with entries $\lambda_1 \ge \lambda_2 \ge \dotsb \ge \lambda_n \ge 0$ (the eigenvalues of a covariance matrix are nonnegative).

Now let $a$ be an $n$-vector with $a^T a = 1$, defining the linear combination $a^T X$. We want to choose $a$ so as to maximize the variance of this linear combination: $$ \V\left( a^T X \right) = a^T \V\left( X \right) a = a^T \Sigma a \\ = a^T \left( \sum_{i=1}^n \lambda_i u_i u_i^T \right) a \\ = \sum_{i=1}^n \lambda_i (a^T u_i)^2. $$

Since the vectors $u_i$ are mutually orthonormal, the weights $w_i = (a^T u_i)^2$ are nonnegative and sum to $\|U^T a\|^2 = \|a\|^2 = 1$, so the sum above is a weighted average of the values $\lambda_1, \dotsc, \lambda_n$. Such an average is maximized by putting all the weight on the largest value $\lambda_1$, which happens exactly when we choose $a = u_1$, the eigenvector corresponding to the largest eigenvalue; by orthonormality, the weights on the remaining eigenvalues are then zero. That finishes the proof.
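As a sanity check, you can verify the claim numerically. The sketch below (using NumPy; the matrix and the number of trials are arbitrary choices, not from the question) builds a random covariance matrix, computes its top eigenvector, and confirms that $a^T \Sigma a$ equals the largest eigenvalue there and is never exceeded by random unit vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric positive semi-definite covariance matrix Sigma.
n = 5
A = rng.standard_normal((n, n))
Sigma = A @ A.T

# Spectral decomposition: eigh returns eigenvalues in ascending order.
eigvals, U = np.linalg.eigh(Sigma)
u_max = U[:, -1]  # eigenvector for the largest eigenvalue


def variance(a, Sigma):
    """Variance of the linear combination a^T X, i.e. a^T Sigma a."""
    return a @ Sigma @ a


# The top eigenvector attains variance equal to the largest eigenvalue...
print(np.isclose(variance(u_max, Sigma), eigvals[-1]))  # True

# ...and no random unit vector does better (Rayleigh quotient bound).
for _ in range(1000):
    a = rng.standard_normal(n)
    a /= np.linalg.norm(a)
    assert variance(a, Sigma) <= eigvals[-1] + 1e-9
```

The inequality checked in the loop is exactly the weighted-average argument in the proof: for any unit vector $a$, the variance $a^T \Sigma a$ is a convex combination of the eigenvalues, hence at most $\lambda_1$.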