Indeed, there is no guarantee that top principal components (PCs) have more predictive power than the low-variance ones.
Real-world examples can be found where this is not the case, and it is easy to construct an artificial example where e.g. only the smallest PC has any relation to $y$ at all.
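To make the artificial case concrete, here is a minimal numpy sketch (the data are simulated purely for illustration) in which $y$ is driven entirely by the lowest-variance direction, so only the last PC correlates with it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# scores along three orthogonal "true" directions, with very different variances
z = rng.normal(size=(n, 3)) * np.array([10.0, 3.0, 0.1])
# rotate into an arbitrary basis so the structure is not obvious in the raw columns
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
X = z @ Q.T
y = z[:, 2] + 0.01 * rng.normal(size=n)      # y depends only on the smallest direction

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T                           # PC scores, ordered by decreasing variance
for j in range(3):
    r = np.corrcoef(scores[:, j], y)[0, 1]
    print(f"PC{j + 1}: variance {s[j] ** 2 / (n - 1):8.3f}, corr with y {r:+.3f}")
```

Here the first two PCs carry essentially all of the variance of $X$ but are (nearly) uncorrelated with $y$, while the tiny third PC is almost perfectly correlated with it.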
This topic has been discussed a lot on our forum, and in the (unfortunate) absence of one clearly canonical thread, I can only give several links that together provide various real-life as well as artificial examples:
And the same topic, but in the context of classification:
However, in practice, top PCs often do have more predictive power than the low-variance ones; moreover, using only the top PCs can yield better predictive power than using all PCs.
In situations with many predictors $p$ and relatively few data points $n$ (e.g. when $p \approx n$ or even $p>n$), ordinary regression will overfit and needs to be regularized. Principal component regression (PCR) can be seen as one way to regularize the regression and will tend to give results superior to the unregularized fit. Moreover, it is closely related to ridge regression, which is a standard form of shrinkage regularization. While ridge regression is usually a better idea, PCR will often behave reasonably well. See Why does shrinkage work? for a general discussion of the bias-variance tradeoff and of how shrinkage can be beneficial.
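As a rough illustration (scikit-learn, with data simulated just for this purpose rather than any particular real dataset, and with the number of components and the ridge penalty chosen arbitrarily), one can compare plain least squares, ridge, and PCR by cross-validation when $p$ is close to $n$:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# p close to n: plain least squares is prone to overfitting here
X, y = make_regression(n_samples=60, n_features=50, n_informative=10,
                       noise=10.0, random_state=0)

models = {
    "OLS":              LinearRegression(),
    "Ridge (alpha=10)": Ridge(alpha=10.0),
    "PCR (5 PCs)":      make_pipeline(PCA(n_components=5), LinearRegression()),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name:18s} mean CV R^2 = {scores.mean():.2f}")
```

The exact numbers depend on the simulation, but the pattern is the point: both regularized fits (ridge and PCR) tend to generalize better than the unregularized one in this regime.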
In a way, one can say that both ridge regression and PCR assume that most information about $y$ is contained in the large PCs of $X$, and this assumption is often warranted.
See the later answer by @cbeleites (+1) for some discussion about why this assumption is often warranted (and also this newer thread: Is dimensionality reduction almost always useful for classification? for some further comments).
Hastie et al. in The Elements of Statistical Learning (section 3.4.1) comment on this in the context of ridge regression:
[T]he small singular values [...] correspond to directions in the column space of $\mathbf X$ having small variance, and ridge regression shrinks these directions the most. [...] Ridge regression protects against the potentially high variance of gradients estimated in the short directions. The implicit assumption is that the response will tend to vary most in the directions of high variance of the inputs. This is often a reasonable assumption, since predictors are often chosen for study because they vary with the response variable, but need not hold in general.
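This shrinkage pattern is easy to check numerically. Writing the ridge fit through the SVD of the centered $\mathbf X$ (the same decomposition used in that section of ESL), each principal direction $j$ is shrunk by the factor $d_j^2/(d_j^2+\lambda)$, so the short (small-variance) directions are shrunk the most. A small numpy sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, lam = 200, 5, 10.0
X = rng.normal(size=(n, p)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])  # columns of very different scale
X -= X.mean(axis=0)
y = rng.normal(size=n)

U, d, Vt = np.linalg.svd(X, full_matrices=False)
shrink = d**2 / (d**2 + lam)                # shrinkage factor per principal direction
fit_svd = U @ (shrink * (U.T @ y))          # ridge fitted values, SVD form

beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)   # ridge coefficients directly
print(np.allclose(fit_svd, X @ beta))       # True: the two forms agree
print(np.round(shrink, 3))                  # near 1 for long directions, near 0 for short ones
```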
See my answers in the following threads for details:
Bottom line
For high-dimensional problems, pre-processing with PCA (meaning reducing dimensionality and keeping only top PCs) can be seen as one way of regularization and will often improve the results of any subsequent analysis, be it a regression or a classification method. But there is no guarantee that this will work, and there are often better regularization approaches.
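For instance (a scikit-learn sketch on simulated data; the particular classifier, numbers of samples, features, and components are arbitrary choices for illustration), one can compare a classifier with and without a PCA step by cross-validation:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# many features, few samples: a setting where PCA pre-processing may help
X, y = make_classification(n_samples=100, n_features=200, n_informative=10,
                           random_state=0)

plain = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pca10 = make_pipeline(StandardScaler(), PCA(n_components=10),
                      LogisticRegression(max_iter=1000))
for name, clf in [("no PCA", plain), ("PCA(10) first", pca10)]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name:14s} mean CV accuracy = {acc:.2f}")
```

Whether the PCA step helps depends on the data; the point is only that it is one regularization knob among several, to be compared rather than assumed superior.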
I will try to explain how the orthogonality of $a_1$ and $a_2$ ensures that $y_1$ and $y_2$ are uncorrelated. We want $a_1$ to maximize $Var(y_1)=a_1^T \Sigma a_1$. This maximum cannot be attained unless we constrain $a_1$, in this case by requiring $a_1^T a_1=1$. This optimization calls for the use of a Lagrange multiplier (it's not too complicated; read about it on Wikipedia). We thus try to maximize
\begin{equation}
a_1^T \Sigma a_1 - \lambda(a_1^T a_1-1)
\end{equation}
with respect to both $a_1$ and $\lambda$. Notice that differentiation with respect to $\lambda$ and then equating to $0$ gives our constraint $a_1^T a_1=1$. Differentiation with respect to $a_1$ gives
\begin{equation}
\Sigma a_1 -\lambda a_1 =0
\end{equation}
or
\begin{equation}
(\Sigma -\lambda I_p)a_1=0
\end{equation}
so $a_1$ must be an eigenvector of $\Sigma$ with eigenvalue $\lambda$. Since $Var(y_1)=a_1^T \Sigma a_1=\lambda a_1^T a_1=\lambda$, the variance of $y_1$ is maximized by taking $\lambda$ to be the greatest eigenvalue $\lambda_1$, so that $\Sigma a_1=\lambda_1 a_1$. Here comes the part that answers your question. Some elementary calculations using the definition of covariance (and the symmetry of $\Sigma$) show that
\begin{equation}
Cov(y_1,y_2)=Cov(a^T_1 x,a^T_2 x)=a^T_1\Sigma a_2=a^T_2\Sigma a_1=a^T_2\lambda_1 a_1=\lambda_1 a^T_2 a_1
\end{equation}
which equals $0$ if and only if $a^T_2 a_1=0$ (assuming $\lambda_1 \neq 0$).
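A quick numerical check of this conclusion (numpy, with arbitrary simulated data): the eigenvectors of $\Sigma$ are orthogonal, and the resulting scores are indeed uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))   # some correlated variables
Sigma = np.cov(X, rowvar=False)

eigval, A = np.linalg.eigh(Sigma)          # columns of A are orthonormal eigenvectors a_j
scores = (X - X.mean(axis=0)) @ A          # y_j = a_j^T x for every observation

print(np.round(A.T @ A, 10))                        # identity: the a_j are orthogonal
print(np.round(np.cov(scores, rowvar=False), 10))   # diagonal: Cov(y_i, y_j) = 0 for i != j
                                                    # (the diagonal entries match eigval)
```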
Consider what PCA does. Put simply, PCA (as most typically run) creates a new coordinate system by:

1. shifting the origin to the centroid of your data,
2. stretching and/or squishing the axes so they are of equal length, and
3. rotating your axes into a new orientation.
(For more details, see this excellent CV thread: Making sense of principal component analysis, eigenvectors & eigenvalues.) However, it doesn't just rotate your axes any old way. Your new $X_1$ (the first principal component) is oriented in your data's direction of maximal variation. The second principal component is oriented in the direction of the next greatest amount of variation that is orthogonal to the first principal component. The remaining principal components are formed likewise.
With this in mind, let's examine @amoeba's example. Here is a data matrix with two points in a three dimensional space:
$$ X = \bigg[ \begin{array}{ccc} 1 &1 &1 \\ 2 &2 &2 \end{array} \bigg] $$ Let's view these points in a (pseudo) three dimensional scatterplot:
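Here is a minimal matplotlib sketch of that plot (the dashed line through the two points is added just to show the single direction they span):

```python
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter([1, 2], [1, 2], [1, 2], s=60)        # the two rows of X
ax.plot([0, 3], [0, 3], [0, 3], ls="--")        # the line they span (the first PC direction)
ax.set_xlabel("$x_1$"); ax.set_ylabel("$x_2$"); ax.set_zlabel("$x_3$")
plt.show()
```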
So let's follow the steps listed above. (1) The origin of the new coordinate system will be located at $(1.5, 1.5, 1.5)$. (2) The axes are already equal. (3) The first principal component will go diagonally from $(0,0,0)$ to $(3,3,3)$, which is the direction of greatest variation for these data. Now, the second principal component must be orthogonal to the first, and should go in the direction of the greatest remaining variation. But what direction is that? Is it from $(0,0,3)$ to $(3,3,0)$, or from $(0,3,0)$ to $(3,0,3)$, or something else? There is no remaining variation, so there cannot be any more principal components.
With $N=2$ data points, we can fit at most $N-1 = 1$ principal component.
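This is easy to verify with a quick SVD (numpy): after centering, the two-point matrix has exactly one nonzero singular value, i.e. a single principal component carrying all of the variance.

```python
import numpy as np

X = np.array([[1.0, 1.0, 1.0],
              [2.0, 2.0, 2.0]])
Xc = X - X.mean(axis=0)                           # center the two points
singular_values = np.linalg.svd(Xc, compute_uv=False)
print(singular_values)                            # approx [1.225, 0.]: only one nonzero value
print(singular_values**2 / (len(X) - 1))          # variances along the PCs: [1.5, 0.]
```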