PCA loading vectors and importance of original variables


If $X$ is your data set with columns $x_i$ representing all your variables, then principal component analysis (PCA) of $X$ gives $$T=XW,$$ where $T$ represents scores and $W$ represents loadings (columns $w_i$ of $W$ are eigenvectors of the covariance matrix).

Can we say that $w_i$ vector represents the importance of $x_i$? Or is this information only conveyed by the scores? I am confused by the fact that the $w_i$'s live in a totally different space from the $x_i$'s, even though they are obtained by projecting the $x_i$.

Best Answer

Let's start by looking at your equation.

As an example, consider a dataset with $4$ variables and $100$ data points, so that $X$ is of size $100\times 4$ (and centered). PCA constructs the $4\times 4$ covariance matrix and finds its eigenvectors. Suppose we select $2$ eigenvectors to perform the dimensionality reduction. Then $W$ is of size $4 \times 2$. Multiplying $X$ by $W$, we get a $100\times 2$ matrix of scores: $$T=XW,$$ or spelled out:

$$\underbrace{\left(\begin{array}{cc} |&|\\|&|\\t_1&t_2\\|&|\\|&|\end{array}\right)}_T=\underbrace{\left(\begin{array}{cccc} |&|&|&|\\|&|&|&|\\x_1&x_2&x_3&x_4\\|&|&|&|\\|&|&|&|\end{array}\right)}_X\cdot \underbrace{\left(\begin{array}{cc} |&|\\w_1&w_2\\|&|\end{array}\right)}_W.$$
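
The shapes in this example can be checked numerically. The following is only a minimal NumPy sketch under assumptions that are not part of the example itself: the data are random numbers, the seed is arbitrary, and the loadings are stored as the columns of a $4\times 2$ array `W`, so that the scores are computed as `X @ W`.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility only
X = rng.normal(size=(100, 4))           # synthetic data: 100 points, 4 variables
X = X - X.mean(axis=0)                  # center each column

C = np.cov(X, rowvar=False)             # 4 x 4 covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]       # indices sorted by decreasing variance
W = eigvecs[:, order[:2]]               # keep 2 eigenvectors as columns: 4 x 2

T = X @ W                               # scores: 100 x 2

print(C.shape, W.shape, T.shape)        # (4, 4) (4, 2) (100, 2)
```

Each column of `W` has four entries, one per original variable, and each column of `T` is one principal component evaluated on all 100 points.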

Can we say that $w_i$ vector represents the importance of $x_i$?

Absolutely not! In my example, there are four $x_i$ variables, but only two $w$ vectors. There is no correspondence between a particular $x_k$ and $w_k$ at all.
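
To make this concrete for the example sizes ($W$ of size $4\times 2$, with its entries written as $w_{ji}$, a notation introduced here only for illustration), each column of scores is a linear combination of all four variables:

$$t_i = Xw_i = w_{1i}\,x_1 + w_{2i}\,x_2 + w_{3i}\,x_3 + w_{4i}\,x_4,\qquad i=1,2.$$

So it is the $j$-th element of a loading vector, not the $j$-th loading vector, that is attached to the variable $x_j$.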

Or is this information only conveyed by the scores?

No! Scores $T$ don't tell you anything about the importance of the original variables.


In fact, nothing in PCA tells you about the "importance" of the original variables.

PCA is sometimes used for feature selection (see "Using principal component analysis (PCA) for feature selection"). This is based on the assumption that the variables contributing most to PC1 are the most "important", i.e. that it is the elements of $w_1$ that reflect the "importance" of the original variables. However, there is no guarantee that this assumption is reasonable for a given dataset.
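
One way this assumption can break down is through the units of measurement. The sketch below is a synthetic illustration under assumptions of its own: four independent random variables, one of which is multiplied by 100 (as if it were recorded in smaller units), and a hypothetical helper `first_loading` that is not a library function. It compares $w_1$ computed from the raw data with $w_1$ computed after rescaling every column to unit variance.

```python
import numpy as np

rng = np.random.default_rng(0)              # arbitrary seed
X = rng.normal(size=(1000, 4))              # four independent, unit-variance variables
X[:, 2] *= 100.0                            # third variable expressed in 100x smaller units
X = X - X.mean(axis=0)                      # center each column

def first_loading(data):
    """Hypothetical helper: eigenvector of the covariance matrix
    belonging to the largest eigenvalue, i.e. w_1."""
    eigvals, eigvecs = np.linalg.eigh(np.cov(data, rowvar=False))
    return eigvecs[:, np.argmax(eigvals)]

w1_raw = first_loading(X)                   # PCA on the covariance matrix
w1_std = first_loading(X / X.std(axis=0))   # same data, columns rescaled to unit variance

print(np.round(np.abs(w1_raw), 3))          # absolute loadings, raw units
print(np.round(np.abs(w1_std), 3))          # absolute loadings, standardized
```

On the raw data, $w_1$ points almost entirely along the rescaled variable, purely because of its units; after standardization that dominance disappears. A large element of $w_1$ can therefore reflect scale rather than any meaningful notion of importance.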
