[Math] Change of basis, transition matrices, and representation matrices

change-of-basis, linear-algebra, linear-transformations

The following four questions were presented to me on an introductory-level linear algebra study guide. Before diving into some conceptual questions I have, here are the questions and my attempts at solving them:

$T$ is the projection onto the vector $\vec{w}=(3,1)$ in $\mathbb{R}^2$, given by $$T(\vec{x})=\frac{\vec{x}\cdot\vec{w}}{\vec{w}\cdot \vec{w}}\vec{w}$$

For simplicity, I will reduce $T(\vec{x})$ to $$T(\vec{x})=\frac{\vec{x}\cdot(3,1)}{10}(3,1)$$

(a) Find the standard representation matrix $A$ corresponding to the linear transformation above.

$$T[(1,0)]=\frac{(1,0) \cdot (3,1)}{10}(3,1)=(\frac{9}{10},\frac{3}{10})$$
$$T[(0,1)]=\frac{(0,1) \cdot (3,1)}{10}(3,1)=(\frac{3}{10},\frac{1}{10})$$

Thus the standard representation matrix $A$ is

$$A=
\begin{bmatrix}
\frac{9}{10} & \frac{3}{10} \\
\frac{3}{10} & \frac{1}{10} \\
\end{bmatrix}$$

(b) Find the transition matrix $P$ from the non-standard basis $B=\{(1,-1),(0,1)\}$ to the standard basis.

Choosing $B'=\{(1,0),(0,1)\}$ to represent the standard basis, the transition matrix $P$ from $B$ to $B'$ can be found by row reducing $[B'\space B]$ to $[I_2\space P]$.

$$\begin{bmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & -1 & 1 \\
\end{bmatrix}$$

$$P=
\begin{bmatrix}
1 & 0 \\
-1 & 1 \\
\end{bmatrix}$$

(c) By using the formula $A'=P^{-1}AP$, find the representation matrix $A'$ that corresponds to the linear transformation above, relative to the basis $B$.

If $[B'\space B]$ reduces to $[I_2\space P]$, then $[B\space B']$ should reduce to $[I_2\space P^{-1}]$.

$$\begin{bmatrix}
1 & 0 & 1 & 0 \\
-1 & 1 & 0 & 1 \\
\end{bmatrix}\rightarrow
\begin{bmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 \\
\end{bmatrix}$$

$$P^{-1}=
\begin{bmatrix}
1 & 0 \\
1 & 1 \\
\end{bmatrix}$$

And finally $$A'=P^{-1}AP=\begin{bmatrix} 1 & 0 \\ 1 & 1 \\ \end{bmatrix}
\begin{bmatrix} \frac{9}{10} & \frac{3}{10} \\ \frac{3}{10} & \frac{1}{10} \\ \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -1 & 1 \\ \end{bmatrix}=
\begin{bmatrix} \frac{6}{10} & \frac{3}{10} \\ \frac{8}{10} & \frac{4}{10} \\ \end{bmatrix}$$

(d) Find $T(\vec{v})$ relative to basis $B$ where $\vec{v}=(7,4)$ by using $A'$.

While normally we might use the equation $[T(\vec{v})]_{B}=A'[\vec{v}]_{B'}$, since $B'$ is the $2\times 2$ Identity Matrix, we can simplify it to

$$[T(\vec{v})]_{B}=A'\vec{v}=
\begin{bmatrix} \frac{6}{10} & \frac{3}{10} \\ \frac{8}{10} & \frac{4}{10} \\ \end{bmatrix}
\begin{bmatrix} 7 \\ 4 \\ \end{bmatrix}=
\begin{bmatrix} \frac{27}{5} \\ \frac{36}{5} \\ \end{bmatrix}$$


Firstly, what I am curious about is the choice of $P$ and $P^{-1}$. Specifically, will $P^{-1}$ always represent the transition matrix from some basis to the basis we are trying to solve $T(\vec{v})$ with respect to? Order matters when multiplying matrices (i.e. $AB$ is not necessarily equal to $BA$) so what I am getting at is that there must be a reason we picked $P$ to be the transition matrix from $B$ to $B'$.

My next question is about why I am able to get two different values for $A'$. I had the impression that I solved (d) incorrectly, so I tried it another way, where $A'=[[T[(1,0)]]_{B} | [T[(0,1)]]_{B}]$.

$T[(1,0)]=(\frac{9}{10}, \frac{3}{10})$ and $T[(0,1)]=(\frac{3}{10}, \frac{1}{10})$ so

$$\begin{bmatrix} 1 & 0 \\ -1 & 1 \\ \end{bmatrix}[T[(1,0)]]_{B}=\begin{bmatrix} \frac{9}{10} \\ \frac{3}{10} \\ \end{bmatrix}\rightarrow [T[(1,0)]]_{B}=\begin{bmatrix} \frac{9}{10} \\ \frac{12}{10} \\ \end{bmatrix}$$
$$\begin{bmatrix} 1 & 0 \\ -1 & 1 \\ \end{bmatrix}[T[(0,1)]]_{B}=\begin{bmatrix} \frac{3}{10} \\ \frac{1}{10} \\ \end{bmatrix}\rightarrow [T[(0,1)]]_{B}=\begin{bmatrix} \frac{3}{10} \\ \frac{4}{10} \\ \end{bmatrix}$$

therefore

$$A'=
\begin{bmatrix}
\frac{9}{10} & \frac{3}{10} \\
\frac{12}{10} & \frac{4}{10} \\
\end{bmatrix}$$

which, interestingly enough, is the same as $P^{-1}A$. Of course, if I plug this into the equation I used in part (d), I get a completely different answer. However, I am almost certain that, at least for the method of obtaining $A'$ shown right above, the equation in (d) should work for finding $[T(\vec{v})]_{B}$.

What I am thinking now is that I am either solving the wrong equation for part (d) or that I messed up somewhere in finding my transition matrices. Any clarification would be greatly appreciated.

Best Answer

I haven't checked your calculations, but your second method of computing $A'$ is wrong. By definition,

$$ A' = [T]_{\mathcal{B}}^{\mathcal{B}} = [[T(1,-1)]_{\mathcal{B}} \, | \, [T(0,1)]_{\mathcal{B}}] $$

and not what you wrote. You take each element of the basis $\mathcal{B}$, apply $T$ to it, represent the result with respect to the same basis $\mathcal{B}$, and put it in a column. Instead, you have taken the standard basis, applied $T$ to it, and then represented the result with respect to a different basis.
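The column-by-column definition can be checked numerically. The following is only a minimal sketch using exact fractions from the Python standard library; the helper names `T`, `coords_in_B`, `basis_B`, and `A_prime` are made up for this illustration, and `coords_in_B` hard-codes the solution of $c_1(1,-1)+c_2(0,1)=\vec{x}$ for this particular basis.

```python
from fractions import Fraction as F

def T(x):
    """Projection onto w = (3, 1), as defined in the question."""
    w = (F(3), F(1))
    scale = (x[0] * w[0] + x[1] * w[1]) / (w[0] * w[0] + w[1] * w[1])
    return (scale * w[0], scale * w[1])

def coords_in_B(x):
    """Coordinates of x relative to B = {(1,-1), (0,1)}.

    Solving c1*(1,-1) + c2*(0,1) = x gives c1 = x1 and c2 = x1 + x2.
    """
    return (x[0], x[0] + x[1])

basis_B = [(F(1), F(-1)), (F(0), F(1))]

# Each column of A' is [T(b)]_B for a basis vector b of B.
columns = [coords_in_B(T(b)) for b in basis_B]

# Rearrange the list of columns into a row-major 2x2 matrix.
A_prime = [[columns[j][i] for j in range(2)] for i in range(2)]
print(A_prime)
```

The entries come out as $\frac{3}{5}, \frac{3}{10}$ in the first row and $\frac{4}{5}, \frac{2}{5}$ in the second, which is the matrix $P^{-1}AP$ from part (c) in lowest terms.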


Regarding your question about $P$, there is a notorious ambiguity between different authors regarding what is called "a change of basis matrix from $\mathcal{B}$ to $\mathcal{C}$". What some authors call $P$ might be called $P^{-1}$ by other authors and the formula for one author will look like $A' = P^{-1}AP$ while for another it will look like $A' = PAP^{-1}$. In my opinion, the change of basis formula should be written as

$$ [T]_{\mathcal{B}}^{\mathcal{B}} = [\operatorname{id}]_{\mathcal{B}}^{\mathcal{B}'} [T]_{\mathcal{B}'}^{\mathcal{B}'} [\operatorname{id}]_{\mathcal{B}'}^{\mathcal{B}}. $$

Notice the "cancellation across diagonals" property, which makes the formula easy to remember. Here $P = [\operatorname{id}]_{\mathcal{B}'}^{\mathcal{B}}$ is the matrix whose columns are the elements of the basis $\mathcal{B}$, represented in the basis $\mathcal{B}'$. By the formula $[\operatorname{id}]_{\mathcal{B}'}^{\mathcal{B}} [v]_{\mathcal{B}} = [v]_{\mathcal{B}'}$, this matrix converts the representation of a vector in the basis $\mathcal{B}$ into the representation of the same vector in the basis $\mathcal{B}'$: multiply $[v]_{\mathcal{B}}$ on the left by $P$. This point of view makes it harder to confuse $P$ and $P^{-1}$.
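As a worked check on the numbers from the question (a sketch, taking $P$, $P^{-1}$ and $A'$ as computed in parts (b) and (c), with $\mathcal{B}'$ the standard basis), the formula $P[v]_{\mathcal{B}} = [v]_{\mathcal{B}'}$ says that $\vec{v}=(7,4)$ should first be converted to $\mathcal{B}$-coordinates with $P^{-1}$, and only then multiplied by $A'$:

$$[\vec{v}]_{\mathcal{B}} = P^{-1}[\vec{v}]_{\mathcal{B}'} =
\begin{bmatrix} 1 & 0 \\ 1 & 1 \\ \end{bmatrix}
\begin{bmatrix} 7 \\ 4 \\ \end{bmatrix} =
\begin{bmatrix} 7 \\ 11 \\ \end{bmatrix},
\qquad
[T(\vec{v})]_{\mathcal{B}} = A'[\vec{v}]_{\mathcal{B}} =
\begin{bmatrix} \frac{6}{10} & \frac{3}{10} \\ \frac{8}{10} & \frac{4}{10} \\ \end{bmatrix}
\begin{bmatrix} 7 \\ 11 \\ \end{bmatrix} =
\begin{bmatrix} \frac{15}{2} \\ 10 \\ \end{bmatrix}$$

Converting back gives $P[T(\vec{v})]_{\mathcal{B}} = \frac{15}{2}(1,-1) + 10\,(0,1) = (\frac{15}{2}, \frac{5}{2})$, which agrees with the direct computation $T(\vec{v}) = \frac{25}{10}(3,1)$.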