Linear Algebra – Why is the ‘Change-of-Basis Matrix’ Called Such


"Let $P$ be the change-of-basis matrix
from a basis $S$ to a basis $S'$ in a
vector space $V$. Then, for any vector
$v \in V$, we have $$P[v]_{S'}=[v]_{S}
\text{ and hence, } P^{-1}[v]_{S} =
[v]_{S'}$$

Namely, if we multiply the coordinates
of $v$ in the original basis $S$ by
$P^{-1}$, we get the coordinates of
$v$ in the new basis $S'$." – Schaum's
Outlines: Linear Algebra. 4th Ed.

I am having a lot of difficulty keeping these matrices straight. Could someone please help me understand the reasoning behind (what appears to me as) the counter-intuitive naming of $P$ as the change of basis matrix from $S$ to $S'$? It seems like $P^{-1}$ is the matrix which actually changes a coordinate vector in terms of the 'old' basis $S$ to a coordinate vector in terms of the 'new' basis $S'$…

Added:

"Consider a basis $S =
\{u_1,u_2,…,u_n\}$ of a vector space
$V$ over a field $K$. For any vector
$v\in V$, suppose $v = a_1u_1
+a_2u_2+…+a_nu_n$

Then the coordinate vector of $v$
relative to the basis $S$, which we
assume to be a column vector (unless
otherwise stated or implied), is
denoted and defined by $[v]_S =
[a_1,a_2,…,a_n]^{T}$. "
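In concrete terms, finding $[v]_S$ means solving a linear system whose coefficient matrix has the vectors of $S$ as columns. Below is a minimal NumPy sketch of that step; it borrows the basis $S=\{(1,-2),(3,-4)\}$ from the example further down purely as an illustration.

```python
import numpy as np

# Basis S = {u1, u2}; these particular vectors are taken from the
# R^2 example given later in the question.
U = np.array([[1.0, 3.0],
              [-2.0, -4.0]])   # columns are u1 = (1, -2) and u2 = (3, -4)

v = np.array([1.0, 3.0])       # a vector written in the standard basis

# [v]_S is the solution of U @ [v]_S = v, i.e. a1*u1 + a2*u2 = v.
coords = np.linalg.solve(U, v)
print(coords)                  # [-6.5  2.5], i.e. [v]_S = (-13/2, 5/2)
```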

"Let $S = \{ u_1,u_2,…,u_n\}$ be a
basis of a vector space $V$, and let
$S'=\{v_1,v_2,…,v_n\}$ be another
basis. (For reference, we will call
$S$ the 'old' basis and $S'$ the 'new'
basis.) Because $S$ is a basis, each
vector in the 'new' basis $S'$ can be
written uniquely as a linear
combination of the vectors in S; say,

$\begin{array}{c} v_1 = a_{11}u_1 +
a_{12}u_2 + \cdots +a_{1n}u_n \\ v_2 =
a_{21}u_1 + a_{22}u_2 + \cdots
+a_{2n}u_n \\ \cdots \cdots \cdots \\ v_n = a_{n1}u_1 + a_{n2}u_2 + \cdots
+a_{nn}u_n \end{array}$

Let $P$ be the transpose of the above matrix of coefficients; that is, let $P = [p_{ij}]$, where $p_{ij} = a_{ji}$. Then $P$ is called the change-of-basis matrix from the 'old' basis $S$ to the 'new' basis $S'$." – Schaum's Outline: Linear Algebra, 4th Ed.

I am trying to understand the above definitions with this example:

Basis vectors of $\mathbb{R}^{2}$: $S= \{u_1,u_2\}=\{(1,-2),(3,-4)\}$ and $S' = \{v_1,v_2\}= \{(1,3), (3,8)\}$; the change-of-basis matrix from $S$ to $S'$ is $P = \left( \begin{array}{cc} -\frac{13}{2} & -18 \\ \frac{5}{2} & 7 \end{array} \right)$.

My current understanding is the following: normally vectors such as $u_1, u_2$ are written under the assumption of the usual basis that is $u_1 = (1,-2) = e_1 – 2e_2 = [u_1]_E$. So actually $[u_1]_S = (1,0)$ and I guess this would be true in general… But I am not really understanding what effect if any $P$ is supposed to have on the basis vectors themselves (I think I understand the effect on the coordinates relative to a basis). I guess I could calculate a matrix $P'$ which has the effect $P'u_1, P'u_2,…,P'u_n = v_1, v_2,…, v_n$ but would this be anything?
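As a sanity check on the example, the $j$-th column of $P$ is just $[v_j]_S$, so $P$ can be computed numerically by solving one linear system per new basis vector. A minimal NumPy sketch, using the bases exactly as given above:

```python
import numpy as np

U = np.array([[1.0, 3.0],
              [-2.0, -4.0]])   # columns: u1 = (1, -2), u2 = (3, -4)  (old basis S)
V = np.array([[1.0, 3.0],
              [3.0, 8.0]])     # columns: v1 = (1, 3),  v2 = (3, 8)   (new basis S')

# Column j of P holds the coordinates of v_j relative to S,
# so P is the solution of U @ P = V.
P = np.linalg.solve(U, V)
print(P)
# [[ -6.5 -18. ]
#  [  2.5   7. ]]   <- matches the matrix quoted from Schaum's
```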

Best Answer

The situation here is closely related to the following situation: say you have some real function $f(x)$ and you want to shift its graph to the right by a positive constant $a$. Then the correct thing to do to the function is to shift $x$ over to the left; that is, the new function is $f(x - a)$. In essence you have shifted the graph to the right by shifting the coordinate axes to the left.
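To see that analogy numerically (with an arbitrary illustrative choice of $f(x)=x^2$ and $a=2$, not taken from the answer): the shifted function $g(x)=f(x-a)$ attains its minimum at $x=a$ instead of $x=0$, so the graph has moved to the right even though the argument was shifted to the left.

```python
import numpy as np

def f(x):
    return x ** 2            # illustrative choice of function (an assumption)

a = 2.0                      # shift the graph to the right by a

def g(x):
    return f(x - a)          # shifting the graph right = shifting the argument left

xs = np.linspace(-5.0, 5.0, 1001)
# f is minimised at x = 0; the shifted graph g is minimised at x = a = 2
print(xs[np.argmin(f(xs))], xs[np.argmin(g(xs))])   # prints ~0.0 and ~2.0
```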

In this situation, if you have a vector $v$ expressed in some basis $e_1, \ldots, e_n$, and you want to express it in a new basis $Pe_1, \ldots, Pe_n$ (this is why $P$ is called the change-of-basis matrix), then you multiply the numerical vector of coordinates of $v$ by $P^{-1}$. You should carefully work through some numerical examples to convince yourself that this is correct. Consider, for example, the simple case where $P$ is multiplication by a scalar.
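Here is one such numerical check, a small sketch reusing the bases from the question together with an arbitrary test vector (both choices are just for illustration): compute $[v]_S$ and $[v]_{S'}$ directly, and confirm that $P^{-1}[v]_S = [v]_{S'}$.

```python
import numpy as np

U = np.array([[1.0, 3.0], [-2.0, -4.0]])   # old basis S as columns
V = np.array([[1.0, 3.0], [3.0, 8.0]])     # new basis S' as columns
P = np.linalg.solve(U, V)                  # change-of-basis matrix from S to S'

v = np.array([2.0, 5.0])                   # an arbitrary test vector

v_S  = np.linalg.solve(U, v)               # coordinates of v relative to S
v_Sp = np.linalg.solve(V, v)               # coordinates of v relative to S'

# P carries new coordinates to old ones, so P^{-1} carries old to new.
print(np.allclose(P @ v_Sp, v_S))                   # True
print(np.allclose(np.linalg.solve(P, v_S), v_Sp))   # True
```

In the scalar case the answer mentions ($P = cI$, so each new basis vector is $c$ times the old one), the coordinates of a fixed vector get divided by $c$, which is exactly the contravariant behaviour described below.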

The lesson here is that one must carefully distinguish between vectors and the components used to express a vector in a particular basis. Vectors transform covariantly, but their components transform contravariantly.