Interpret the change of basis matrix

change-of-basis, matrices, rotations

Given a vector v whose coordinates in a basis A are (x, y), suppose we want its coordinates (x', y') in a new basis A' obtained by rotating A by an angle θ. It is sometimes said that you must proceed as follows:

a) find the coordinates in the original basis A of the unit vectors of the new basis (say e’x and e’y), which happen to be (cosθ, sinθ) and (-sin θ, cos θ), respectively;

b) then x' in the new basis A' is the dot product of v (with coordinates x and y in A) and e’x (also expressed in A's coordinates): x' = x cosθ + y sinθ;

c) whereas y' is the dot product of v and e’y, both again expressed in A's coordinates: y' = x(-sinθ) + y cosθ.
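Steps (a) to (c) can be sketched in a few lines of Python. This is only an illustration of the procedure as described; the function name `rotate_basis_coords` and the test vector are assumptions made for the example, not part of the original question.

```python
import math

def rotate_basis_coords(x, y, theta):
    """Coordinates of v = (x, y) in the basis A' obtained by rotating A by theta."""
    # Step (a): unit vectors of A', written in A's coordinates.
    e_x_new = (math.cos(theta), math.sin(theta))
    e_y_new = (-math.sin(theta), math.cos(theta))
    # Steps (b) and (c): each new coordinate is a dot product with v.
    x_new = x * e_x_new[0] + y * e_x_new[1]
    y_new = x * e_y_new[0] + y * e_y_new[1]
    return x_new, y_new

# v lies along the first axis of A; the basis is rotated by 90 degrees.
x_new, y_new = rotate_basis_coords(1.0, 0.0, math.pi / 2)
print(round(x_new, 12), round(y_new, 12))
```

With the basis turned by 90 degrees, the vector that pointed along the old first axis ends up (up to rounding) at (0, -1) in the new basis: it now points along the negative second axis of A'.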

In matrix notation, the coordinates in the original basis A of the unit vectors of the new basis form the following matrix:
$$\begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}$$

But that is actually the transformation matrix from A’ to A, not from A to A’. To do the conversion we are interested in (from A to A’), we need the inverse matrix, which gives:
$$\begin{pmatrix}x'\\y'\end{pmatrix}=\begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$$
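As a quick numerical check that the second matrix really is the inverse of the first (for a rotation, the inverse coincides with the transpose), one can multiply the two. The helper names `rotation`, `transpose` and `matmul` and the angle 0.7 are arbitrary choices for this sketch.

```python
import math

def rotation(theta):
    """Matrix whose columns are e'_x and e'_y in A's coordinates (maps A' to A)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    """Transpose of a 2x2 matrix given as a list of rows."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta = 0.7                  # arbitrary angle, chosen only for illustration
to_old = rotation(theta)     # A' -> A
to_new = transpose(to_old)   # A -> A'
product = matmul(to_new, to_old)
print(product)               # entries are close to the identity matrix
```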

How should this latter matrix be interpreted? My impression is that the first row holds the X’ values (in A’) of the unit vectors ex and ey of A, and the second row holds the Y’ values of those same vectors. This picture tries to express the idea:

[Figure: Change of basis]

If this were true, I would like this approach more, because it is more pedagogical. It is like saying that this matrix represents how basis A’ (through its unit vectors) “sees” basis A (through its unit vectors), and so it provides the correcting lens for reading A values in A’ terms. The peculiarity is that there is a different lens for each “dimension” (this may not be the technical term), one for X’ and one for Y’, and each lens adds up what you see through two sub-lenses. There is one X’ sub-lens (cosθ) to read the X’ value of x and another X’ sub-lens (sinθ) to read the X’ value of y; similarly, there is one Y’ sub-lens (-sinθ) to read the Y’ value of x and another Y’ sub-lens (cosθ) to read the Y’ value of y.

I wonder: first, whether I have misunderstood anything, and second, whether this can be generalized or whether things just fit by chance in this particular example.

Best Answer

The description you give in (a)-(c) of the process for computing the coordinates of $\mathbf{v}$ in the new basis $A^\prime$ is correct. To see why this process is implemented using the inverse of the rotation matrix instead of the rotation matrix itself, you just need to think about the definition of matrix multiplication. If the matrix $M$ has rows $\mathbf{b}_1$ and $\mathbf{b}_2$, then

$$M\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}\mathbf{b}_1\cdot(x,y)\\\mathbf{b}_2\cdot(x,y)\end{pmatrix}.$$

Using your description (and the notation) in (a)-(c), this means that the matrix for changing $A$-coordinates to $A^\prime$-coordinates must have rows $\mathbf{e}_x^\prime$ and $\mathbf{e}_y^\prime$. The rotation matrix itself has these vectors as its columns, whereas the inverse of the rotation matrix has these vectors as its rows. So you want the inverse of the rotation matrix.
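The row and column readings of this matrix can be checked together in a short sketch. Applying the matrix to the old unit vectors $(1,0)$ and $(0,1)$ returns its columns, which are the $A^\prime$-coordinates of $\mathbf{e}_x$ and $\mathbf{e}_y$; collecting their first components recovers the first row, which matches the reading proposed in the question. The names `M`, `apply` and the angle $\pi/6$ are assumptions made for illustration.

```python
import math

theta = math.pi / 6                  # illustrative angle
c, s = math.cos(theta), math.sin(theta)
M = [[c, s], [-s, c]]                # matrix taking A-coordinates to A'-coordinates

def apply(m, v):
    """Matrix-vector product: each component is the dot product of a row of m with v."""
    return [sum(row[j] * v[j] for j in range(2)) for row in m]

# Columns of M: the old unit vectors e_x and e_y expressed in A'.
e_x_in_new = apply(M, [1.0, 0.0])
e_y_in_new = apply(M, [0.0, 1.0])

# Row 1 collects the x' values of e_x and e_y; row 2 collects their y' values.
first_row = [e_x_in_new[0], e_y_in_new[0]]
second_row = [e_x_in_new[1], e_y_in_new[1]]
print(first_row, second_row)
```

The column reading (columns are the old basis vectors written in the new basis) holds for any change of basis matrix; the row reading as dot products with $\mathbf{e}_x^\prime$ and $\mathbf{e}_y^\prime$ is the part that relies on orthonormality.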

The validity of your procedure depends crucially on the fact that the bases $A$ and $A^\prime$ are orthogonal, i.e., that they consist of mutually orthogonal vectors (you are actually using orthonormal bases, i.e., bases consisting of mutually orthogonal unit vectors, but the fact that the vectors have norm $1$ is not so important). So while your procedure works for any two orthonormal bases (and with a minor modification, for any two orthogonal bases) of an $n$-dimensional inner product space, things will not be so simple in general. The key point is that for an orthonormal basis $\{\mathbf{v}_1,\ldots,\mathbf{v}_n\}$ of an $n$-dimensional inner product space $V$, the coordinates of a vector $\mathbf{v}$ with respect to the basis are given by the dot products $\mathbf{v}_1\cdot\mathbf{v},\ldots,\mathbf{v}_n\cdot\mathbf{v}$. A general abstract vector space does not have an inner product, so the procedure cannot possibly generalize in a completely straightforward manner to this broader context. It is possible to write down formulas in the general setting, but they are of little practical value.
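To illustrate why orthogonality matters, here is a small sketch with a basis of $\mathbb{R}^2$ that is not orthogonal. The basis $\mathbf{b}_1=(1,0)$, $\mathbf{b}_2=(1,1)$, the vector $(2,1)$ and both function names are hypothetical choices for the example. Taking dot products gives numbers that do not reconstruct the vector, whereas solving the linear system $a\,\mathbf{b}_1+b\,\mathbf{b}_2=\mathbf{v}$ (here with Cramer's rule) does.

```python
def coords_by_dot(v, b1, b2):
    """Dot products of v with each basis vector (valid only for orthonormal bases)."""
    return (v[0] * b1[0] + v[1] * b1[1], v[0] * b2[0] + v[1] * b2[1])

def coords_by_solving(v, b1, b2):
    """Solve a*b1 + b*b2 = v for (a, b) using Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    a = (v[0] * b2[1] - b2[0] * v[1]) / det
    b = (b1[0] * v[1] - v[0] * b1[1]) / det
    return (a, b)

b1, b2 = (1.0, 0.0), (1.0, 1.0)   # linearly independent but not orthogonal
v = (2.0, 1.0)

dot_coords = coords_by_dot(v, b1, b2)
true_coords = coords_by_solving(v, b1, b2)
print(dot_coords, true_coords)

# Rebuild v from the solved coordinates to confirm they are the right ones.
a, b = true_coords
rebuilt = (a * b1[0] + b * b2[0], a * b1[1] + b * b2[1])
print(rebuilt)
```

Here the dot products are $(2,3)$ while the actual coordinates are $(1,1)$, since $1\cdot(1,0)+1\cdot(1,1)=(2,1)$.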