[Math] How do the rows of a change of basis matrix form a basis for expressing columns

linear algebra, linear-transformations, machine learning, matrices, svd

I am reading this article on Principal Component Analysis (PCA), and section III-B (page 3) contains a strange definition that I don't understand.

In the toy example $\mathbf{X}$ is an $m \times n$ matrix…. Let $\mathbf{Y}$ be another $m \times n$ matrix related by a linear transformation $\mathbf{P}$. $\mathbf{X}$ is the original recorded data set and $\mathbf{Y}$ is a re-representation of that data set.

$$\mathbf{P} \mathbf{X} = \mathbf{Y} \tag{1}$$

Also let us define the following quantities.

  • $\mathbf{p}_i$ are the rows of $\mathbf{P}$.
  • $\mathbf{x}_i$ are the columns of $\mathbf{X}$ (or individual $\vec{X}$).
  • $\mathbf{y}_i$ are the columns of $\mathbf{Y}$.

Equation 1 represents a change of basis and thus can have many interpretations.

  1. $\mathbf{P}$ is a matrix that transforms $\mathbf{X}$ into $\mathbf{Y}$.
  2. Geometrically, $\mathbf{P}$ is a rotation and a stretch which again transforms $\mathbf{X}$ into $\mathbf{Y}$.
  3. The rows of $\mathbf{P}$, $\{ \mathbf{p}_1, \ldots , \mathbf{p}_m \}$, are a set of new basis vectors for expressing the columns of $\mathbf{X}$.

I do not understand this last part: how are the rows $\mathbf{p}_i$ of $\mathbf{P}$ a set of new basis vectors for expressing the columns of $\mathbf{X}$?

The reason I don't understand it is that a change of basis matrix usually has the basis vectors in its columns, not its rows. Multiplying such a matrix by a column vector on the right then gives a linear combination of the matrix's columns, which is exactly how a vector is represented in the new basis.

So I would expect the new basis to be in the columns of $\mathbf{P}$, not the rows of $\mathbf{P}$. What am I missing here?

Best Answer

This part is poorly explained in the article, but after some confusion I eventually came to the following conclusions:

1) In the case of PCA we assume that the initial basis in which our data $X$ is expressed is the standard basis, i.e. the columns of the identity matrix $I$ (which is orthonormal).

2) Then we apply the following change of basis theorem (I am not going to prove it):

If $P$ is a transition matrix from an orthonormal basis to another orthonormal basis, then $P$ is orthogonal.

3) In PCA we assume that the new basis in which we will express our data is also orthonormal.

4) Next theorem (easy to prove):

If $A$ is the standard (identity) basis and $B$ is the matrix whose columns are the new basis vectors, then the change of basis (transition) matrix $P$ from $A$ to $B$ is $P=B$.

5) As a result of the above theorems, if we want to change from the orthonormal standard basis $A$ to some other orthonormal basis $B$, the following properties hold: $$P=B$$ $$P^{-1}=P^{T}$$ $$B^{T}=P^{T}$$

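6) Coordinates transform with the inverse of the transition matrix, so the matrix that actually re-expresses the data from $A$ in $B$ is $P^{-1}=B^{T}$. The rows of this matrix are the columns (basis vectors) of our new orthonormal basis $B$. This $P^{-1}$ is the matrix that the PCA tutorial calls $\mathbf{P}$, so the result is fully consistent with the article.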
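
A small numerical check may make this concrete. The sketch below is only an illustration, not something taken from the article: the 30-degree rotation, the sample point `x`, and all names (`B`, `P_tutorial`, `matvec`, and so on) are assumptions I made up for the example. It builds `B` with the new basis vectors as columns, takes `P_tutorial = B^T` as the tutorial's $\mathbf{P}$, and checks that each coordinate of $\mathbf{P}\mathbf{x}$ is the dot product of $\mathbf{x}$ with one row of $\mathbf{P}$.

```python
import math

# Hypothetical new orthonormal basis: the standard basis rotated by 30 degrees.
theta = math.radians(30)
b1 = [math.cos(theta), math.sin(theta)]
b2 = [-math.sin(theta), math.cos(theta)]

# B holds the new basis vectors as its COLUMNS (the transition matrix of step 4).
B = [[b1[0], b2[0]],
     [b1[1], b2[1]]]


def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    """Multiply matrix M (list of rows) by column vector v."""
    return [dot(row, v) for row in M]


# The tutorial's P is B^T, so its ROWS are the new basis vectors b1, b2.
P_tutorial = transpose(B)

# One data point (a column of X), given in the standard basis.
x = [2.0, 1.0]

# y = P x: the coordinates of x in the new basis.
y = matvec(P_tutorial, x)

# The same coordinates, computed as projections of x onto each basis vector.
projections = [dot(b1, x), dot(b2, x)]

# Going back: B y recombines the basis vectors (columns of B) into x.
x_back = matvec(B, y)

print("y           =", y)
print("projections =", projections)
print("x_back      =", x_back)
```

Here `y` and `projections` agree, which is the "rows of $\mathbf{P}$ are the new basis vectors" statement, while `x_back` recovers `x` as a combination of the columns of `B`, which is the column picture from the question. Both pictures describe the same basis; they differ only in the direction of the conversion.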
