Understanding change of basis in depth

change-of-basis, geometry, linear-algebra, linear-transformations, matrices

Good morning everyone,

I'm writing to understand the process of changing basis, which is fundamental in my linear algebra course, even for specific equivalence relations and quotients.

I saw there are some down-voted questions about this, but I'd like to go through the whole process with you to examine it in detail.

One prerequisite is the coordinate isomorphism, which, given a vector space $V$ of dimension $n$, sends a vector $v$ to the column of the $n$ scalars that 'compose' that vector,

In the sense that if $v = a_{1}v_{1}+ \cdots + a_{n}v_{n}$, it sends $v \longmapsto \begin{pmatrix} a_{1}\\ \vdots \\ a_{n} \end{pmatrix}$

So $$[\,\cdot\,]_{B} : V \longrightarrow \mathbb{K}^{n}$$ defined as above is our isomorphism.
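For instance (a minimal numeric sketch in Python with NumPy; the basis and vector below are invented just for illustration), computing $[v]_{B}$ amounts to solving the linear system $a_{1}v_{1} + \cdots + a_{n}v_{n} = v$:

```python
import numpy as np

# Illustrative basis B = {v1, v2} of R^2 (invented for this example)
v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])
P = np.column_stack([v1, v2])  # columns are the basis vectors

v = np.array([3.0, 1.0])

# [v]_B solves P @ a = v, i.e. a1*v1 + a2*v2 = v
a = np.linalg.solve(P, v)
print(a)  # [2. 1.]  ->  v = 2*v1 + 1*v2
```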

So now we can consider $f : V \longrightarrow W$, $f \in \mathrm{Hom}(V,W)$, and we suppose we fix a basis $B = \{v_{1}, \dots, v_{n} \}$ of $V$ and another $D = \{w_{1}, \dots, w_{m} \}$ of $W$.

Observe that in this way we have implicitly said that $\dim V = n$ and $\dim W = m$.

Through the coordinate isomorphisms we have the following diagram:

$\require{AMScd}$
$$\begin{CD}
V @>{f}>> W\\
@V{[\,\cdot\,]_{B}}VV @VV{[\,\cdot\,]_{D}}V\\
\mathbb{K}^{n} @>{M_{D}^{B}}>> \mathbb{K}^{m}
\end{CD}$$

Where $M_{D}^{B}$ denotes the matrix of $f$ with respect to the bases $B$ and $D$.
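To make the diagram concrete (a hedged sketch; $V=\mathbb{R}^{2}$, $W=\mathbb{R}^{3}$, the bases and the matrix `F` standing for $f$ are all invented for illustration), one can check numerically that both paths around the square give the same column:

```python
import numpy as np

# Invented data: V = R^2 with basis B, W = R^3 with basis D,
# f acts on standard coordinates via the matrix F.
B = np.column_stack([[1.0, 1.0], [1.0, -1.0]])                # basis of V
D = np.column_stack([[1.0, 0, 0], [1.0, 1, 0], [1.0, 1, 1]])  # basis of W
F = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]])            # f in std coords

coords = lambda basis, x: np.linalg.solve(basis, x)           # x -> [x]_basis

# M_D^B: its i-th column is [f(v_i)]_D
M = np.column_stack([coords(D, F @ B[:, i]) for i in range(B.shape[1])])

v = np.array([2.0, 5.0])
lhs = coords(D, F @ v)        # top-right path:   v -> f(v) -> [f(v)]_D
rhs = M @ coords(B, v)        # left-bottom path: v -> [v]_B -> M [v]_B
print(np.allclose(lhs, rhs))  # True: the diagram commutes
```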

At this point, knowing that the composition of linear functions is still linear, we know that $$[\,\cdot\,]_{D} \circ f \circ [\,\cdot\,]_{B}^{-1} \in \mathrm{Hom}(\mathbb{K}^{n},\mathbb{K}^{m})$$

And we identify $[\,\cdot\,]_{D} \circ f \circ [\,\cdot\,]_{B}^{-1}$ with $M_{D}^{B}$, and here's my question.

Observe that $[\,\cdot\,]_{B}^{-1}$ exists because $[\,\cdot\,]_{B}$ is an isomorphism.

The proof continues by saying that we know everything about this matrix, in the sense that we know the link between this $M$ and the map, because:

$\forall v \in V:\ [f(v)]_{D} = M_{D}^{B}(f)\,[v]_{B}$, and so the matrix is of the form: $$ \begin{pmatrix} [f(v_{1})]_{D} & \cdots & [f(v_{n})]_{D} \end{pmatrix}$$
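One observation that may help, shown here as a sketch (the matrix below is arbitrary): $[v_{i}]_{B}$ is the $i$-th standard basis vector $e_{i}$ of $\mathbb{K}^{n}$, and multiplying any matrix by $e_{i}$ returns exactly its $i$-th column, which is what forces the $i$-th column to be $[f(v_{i})]_{D}$:

```python
import numpy as np

# Any matrix works for this point; M is invented for illustration.
M = np.array([[1.0, 4.0],
              [2.0, 0.0],
              [0.5, 3.0]])

n = M.shape[1]
for i in range(n):
    e_i = np.zeros(n); e_i[i] = 1.0       # e_i = [v_i]_B
    assert np.allclose(M @ e_i, M[:, i])  # M e_i = i-th column of M
print("M @ e_i picks out column i for every i")
```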

I don't get this passage, and so why the matrix has to be of that form. I know we're building by hand a matrix that respects the relations, but I still don't see clearly the logic that forces my matrix to be of that form.

Seeing the problem in a more 'hands-on' way, we could:

  • Consider the first vector of the basis of $V$: $v_{1}$.
  • Calculate its image through $f$: $f(v_{1}) \in W$.
  • Calculate the coordinates of $f(v_{1})$ with respect to the basis $D$.

And doing this for every $v_{i}$ gives, column by column, the matrix; a sketch of this recipe follows below.
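A sketch of this recipe in Python/NumPy (the map and bases are invented, and `matrix_of_f` is a hypothetical helper written for this example, not a library function):

```python
import numpy as np

def matrix_of_f(B, D, f):
    """Build the matrix column by column, following the recipe above."""
    cols = []
    for i in range(B.shape[1]):
        v_i = B[:, i]                       # 1. take the basis vector v_i
        w = f(v_i)                          # 2. compute its image f(v_i)
        cols.append(np.linalg.solve(D, w))  # 3. coordinates of f(v_i) in D
    return np.column_stack(cols)

# Invented example: f is "apply the matrix F in standard coordinates"
B = np.column_stack([[1.0, 1.0], [1.0, -1.0]])
D = np.eye(3)
F = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]])
print(matrix_of_f(B, D, lambda x: F @ x))
```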

So $A^{i} = [f(v_{i})]_{D}$ and we can write $f(v_{i}) = A_{1}^{i} w_{1} + \cdots + A_{m}^{i} w_{m}$.

(I think this is because every vector in $W$ can be written in terms of the basis of $W$, but again I don't get the relation with $A^{i}$, or the form $A_{j}^{i}$ for $j = 1, \dots, m$.)

Then we take $v \in V$, with $$v = a_{1}v_{1}+ \cdots + a_{n}v_{n}$$

Applying $f \in \mathrm{Hom}(V,W)$: $$f(v) = a_{1}f(v_{1})+ \cdots + a_{n}f(v_{n})$$

The crucial passage starts now, when we equate this to: $$a_{1}(A_{1}^{1} w_{1} + \cdots + A_{m}^{1} w_{m}) + \cdots + a_{n}(A_{1}^{n} w_{1} + \cdots + A_{m}^{n} w_{m})$$

I think this all follows from $f(v_{i}) = A_{1}^{i} w_{1} + \cdots + A_{m}^{i} w_{m}$, which is the form I am unable to understand.

This rewriting finishes by saying that: $$a_{1}(A_{1}^{1} w_{1} + \cdots + A_{m}^{1} w_{m}) + \cdots + a_{n}(A_{1}^{n} w_{1} + \cdots + A_{m}^{n} w_{m}) = (a_{1}A_{1}^{1} + \cdots + a_{n}A_{1}^{n}) w_{1}+ \cdots + (a_{1}A_{m}^{1} + \cdots + a_{n}A_{m}^{n}) w_{m}$$

(By the way, are all the indices correct?)

And consequently $$[f(v)]_{D} = \begin{pmatrix} a_{1}A_{1}^{1} + \cdots + a_{n}A_{1}^{n} \\ \vdots \\ a_{1}A_{m}^{1} + \cdots + a_{n}A_{m}^{n} \end{pmatrix}$$
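A small numeric check of this formula (numbers invented): the column $[f(v)]_{D}$ is $a_{1}A^{1} + \cdots + a_{n}A^{n}$, which is exactly the matrix-vector product $A\,[v]_{B}$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0],
              [1.0, 1.0]])   # columns A^i = [f(v_i)]_D (invented)
a = np.array([4.0, -2.0])    # [v]_B = (a_1, ..., a_n) (invented)

by_columns = a[0] * A[:, 0] + a[1] * A[:, 1]  # a_1 A^1 + a_2 A^2
print(np.allclose(by_columns, A @ a))         # True: same vector
```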

If all these indices are correctly placed and what I wrote is right, could you explain to me in detail how it works? Maybe with an appropriate example.

Theory is good too, but the problem might be easier to understand with an example.

Any tips for approaching and dealing with this kind of problem are also welcome. Thanks a lot!

Best Answer

Along the proof we will use the following notation: $\begin{pmatrix}\mu_1\\\vdots\\\mu_r\end{pmatrix}_B:=\mu_1b_1+\dots+\mu_rb_r$, where $B:=\{b_1,\dots,b_r\}$ is a basis of some vector space $E$ over $\mathbb{K}$ and $\mu_i\in\mathbb{K}$ for all $i\in\{1,\dots,r\}$.

Let $f$ be a linear transformation between two vector spaces $E_1$ and $E_2$ (both of them over the same field $\mathbb{K}$), and let $B_1:=\{e_1,\dots,e_n\}\subset E_1$ and $B_2:=\{e_1',\dots,e_m'\}\subset E_2$ be two bases.

First of all, we will see that $f$ is completely determined by the images of the vectors of $B_1$; that is, if there is another linear transformation $g$ such that $g(e_i)=f(e_i)$ for all $i\in \{1,\dots,n\}$, then $f=g$. To prove this, we must remember that, given a vector $x\in E_1$, its coordinates with respect to $B_1$ are unique.

Here we go with the proof: let's suppose $x=\lambda_1e_1+\dots+\lambda_ne_n$, so \begin{equation} f(x)=f(\lambda_1e_1+\dots+\lambda_ne_n)=\lambda_1f(e_1)+\dots+\lambda_nf(e_n) \end{equation} So, if we have a linear transformation $g$ such that $g(e_i)=f(e_i)$ for all $i\in \{1,\dots,n\}$, then $f(x)=g(x)$ for all $x\in E_1$; in other words, $f=g$.
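A minimal sketch of this determination argument (data invented; the matrix `F` plays the role of $f$ in standard coordinates): rebuilding a map from the images of a basis alone reproduces $f$ everywhere:

```python
import numpy as np

F = np.array([[1.0, 2.0], [3.0, 4.0]])         # "forgotten" map f
E = np.column_stack([[1.0, 0.0], [1.0, 1.0]])  # basis e_1, e_2 of R^2
images = F @ E                                 # keep only f(e_1), f(e_2)

def g(x):
    lam = np.linalg.solve(E, x)  # unique coordinates lambda_i of x
    return images @ lam          # lambda_1 f(e_1) + lambda_2 f(e_2)

x = np.random.default_rng(0).standard_normal(2)
print(np.allclose(F @ x, g(x)))  # True: g agrees with f everywhere
```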

Let's go further and write $f(x)\in E_2$ in coordinates with respect to $B_2$; for this, let's suppose $f(e_i)=\alpha_{1i}e_1'+\dots+\alpha_{mi}e_m'$ for all $i\in\{1,\dots,n\}$. \begin{equation} \begin{aligned} f(x)&=\lambda_1f(e_1)+\dots+\lambda_nf(e_n)\\ &=\lambda_1(\alpha_{11}e_1'+\dots+\alpha_{m1}e_m')+\dots+\lambda_n(\alpha_{1n}e_1'+\dots+\alpha_{mn}e_m')\\ &=(\lambda_1\alpha_{11}e_1'+\dots+\lambda_1\alpha_{m1}e_m')+\dots+(\lambda_n\alpha_{1n}e_1'+\dots+\lambda_n\alpha_{mn}e_m')\\ &=(\lambda_1\alpha_{11}+\dots+\lambda_n\alpha_{1n})e_1'+\dots+(\lambda_1\alpha_{m1}+\dots+\lambda_n\alpha_{mn})e_m' \end{aligned} \end{equation} We can write this with our notation as \begin{equation} f(x)=\begin{pmatrix}\lambda_1\alpha_{11}+\dots+\lambda_n\alpha_{1n}\\\vdots\\\lambda_1\alpha_{m1}+\dots+\lambda_n\alpha_{mn}\end{pmatrix}_{B_2}=\begin{pmatrix}\begin{pmatrix}\alpha_{11}& \cdots & \alpha_{1n}\\\vdots&\ddots&\vdots\\\alpha_{m1}&\cdots&\alpha_{mn}\end{pmatrix}\begin{pmatrix}\lambda_1\\\vdots\\\lambda_n\end{pmatrix}\end{pmatrix}_{B_2} \end{equation} Let's call $A:=\begin{pmatrix}\alpha_{11}& \cdots & \alpha_{1n}\\\vdots&\ddots&\vdots\\\alpha_{m1}&\cdots&\alpha_{mn}\end{pmatrix}$. Notice that the columns of this matrix are the coordinates with respect to $B_2$ of the $f(e_i)$, which are unique. Let's also call $X:=\begin{pmatrix}\lambda_1\\\vdots\\\lambda_n\end{pmatrix}$.
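As a numeric check (all arrays invented for illustration): building $A$ column by column from the $B_2$-coordinates of the $f(e_i)$, one can verify that converting $AX$ back from $B_2$-coordinates returns $f(x)$, i.e. $f(x)=\begin{pmatrix}AX\end{pmatrix}_{B_2}$:

```python
import numpy as np

E1 = np.column_stack([[1.0, 0.0], [1.0, 1.0]])                 # basis B_1
E2 = np.column_stack([[1.0, 0, 0], [0, 2.0, 0], [0, 0, 1.0]])  # basis B_2
F = np.array([[1.0, 1.0], [0.0, 2.0], [1.0, 0.0]])             # f in std coords

# Columns of A are the B_2-coordinates of f(e_i)
A = np.column_stack([np.linalg.solve(E2, F @ E1[:, i]) for i in range(2)])

x = np.array([3.0, 4.0])
X = np.linalg.solve(E1, x)               # X = coordinates of x w.r.t. B_1
print(np.allclose(F @ x, E2 @ (A @ X)))  # True: (AX)_{B_2} really is f(x)
```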

So, to sum up, we have found that $f(x)=\begin{pmatrix}AX\end{pmatrix}_{B_2}$ for all $x\in E_1$. If there were another way to do so, that is, $f(x)=\begin{pmatrix}CX\end{pmatrix}_{B_2}$ for all $x\in E_1$, then $\begin{pmatrix}AX\end{pmatrix}_{B_2}=\begin{pmatrix}CX\end{pmatrix}_{B_2}$, and by uniqueness of the coordinates that is equivalent to $AX=CX\iff AX-CX=0\iff (A-C)X=0$ for all $X$. Calling $D:=A-C$, let's prove $D=0$.

We know that $\begin{pmatrix}DX\end{pmatrix}_{B_2}=0$ for all $X$, so expanding the matrix product we get \begin{equation} \lambda_1\begin{pmatrix}d_{11}\\\vdots\\d_{m1}\end{pmatrix}_{B_2}+\cdots+\lambda_n\begin{pmatrix}d_{1n}\\\vdots\\d_{mn}\end{pmatrix}_{B_2}=0 \end{equation} which means that every linear combination of the vectors represented by the columns of $D$ equals zero, and this can only happen if all of them are the zero vector. Since the (unique) coordinates of the zero vector in any basis are all zeros, $D=0$, so $A-C=0$, and then $A=C$, as we wanted to prove.
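To see the last step concretely (a sketch with an invented nonzero $D$): testing $DX=0$ against the standard basis vectors $X=e_i$ exposes the columns of $D$ one at a time, so the condition can hold for every $X$ only if every column, and hence $D$ itself, is zero:

```python
import numpy as np

D = np.array([[0.0, 1.0],
              [0.0, 0.0]])  # stands for A - C; nonzero on purpose

for i in range(2):
    e_i = np.zeros(2); e_i[i] = 1.0
    print(f"D @ e_{i + 1} =", D @ e_i)  # reveals column i of D

# D @ e_2 != 0, so "D X = 0 for all X" fails here; conversely, if it
# held for every e_i, every column (hence D itself) would be zero.
```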
