[Math] Change of basis – is the dot product method correct

linear algebra

The traditional way of expressing a vector in a different basis relies on a change-of-basis matrix (see here)

I'm not sure why almost no book or online text mentions this, but there is another method. Is it correct? Suppose I have a vector expressed in the standard basis, $x=(x_1,x_2,x_3,\dots,x_n)$, and a set of basis vectors $B'=\lbrace b'_1, b'_2,\dots, b'_n\rbrace$. Then the new coordinates are:

$(x'_1, x'_2,\dots,x'_n)=(b'_1\cdot x, b'_2\cdot x,\dots,b'_n\cdot x)$.

So every new coordinate is the dot product of the original vector with the respective basis vector of the new basis. This is used, for example, in this paper (see page 3, upper right part).

How can we derive this method from the traditional one, based on the change-of-basis matrix? Intuitively, the method makes sense: we have a bunch of vectors forming a basis of our space, we take a vector in the old basis and project it onto the new basis vectors (that's the dot product), and the respective results are the coordinates in the new basis.
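
For concreteness, here is a small toy example of the method in $\mathbb{R}^2$ (my own, not taken from the paper). Take the orthonormal basis obtained by rotating the standard basis by $45$ degrees, $b'_1=\frac{1}{\sqrt{2}}(1,1)$ and $b'_2=\frac{1}{\sqrt{2}}(-1,1)$, and let $x=(2,0)$. Then

$x'_1=b'_1\cdot x=\sqrt{2}, \quad x'_2=b'_2\cdot x=-\sqrt{2},$

and indeed $\sqrt{2}\,b'_1-\sqrt{2}\,b'_2=(1,1)-(-1,1)=(2,0)=x$.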

However, I'm not that sure if this is correct. Let
$B=\lbrace b_1, b_2,\dots, b_n\rbrace$ be the old basis,
$B'=\lbrace b'_1, b'_2,\dots, b'_n\rbrace$ be the new basis. We have a vector $x$ whose coordinates in basis $B$ are $[x]_B=(x_1,x_2,x_3,\dots,x_n)$.

It means that $x=x_1 b_1 + x_2 b_2 + \dots + x_n b_n$.

To simplify, consider the $\mathbb{R}^3$ example.

$[x]_B=(x_1,x_2,x_3), B=\lbrace b_1, b_2, b_3\rbrace, B'=\lbrace b'_1, b'_2, b'_3 \rbrace$.

$x=x_1 b_1 + x_2 b_2 + x_3 b_3$

$b_1=\alpha_1 b'_1 + \alpha_2 b'_2 + \alpha_3 b'_3$

$b_2=\alpha_4 b'_1 + \alpha_5 b'_2 + \alpha_6 b'_3$

$b_3=\alpha_7 b'_1 + \alpha_8 b'_2 + \alpha_9 b'_3$

Substituting, we get that $x=(x_1 \alpha_1 + x_2 \alpha_4 + x_3 \alpha_7)b'_1 + (x_1\alpha_2 + x_2 \alpha_5 + x_3 \alpha_8)b'_2 + (x_1\alpha_3 + x_2 \alpha_6 + x_3 \alpha_9)b'_3$.

So the coordinates of $x$ in the new basis $B'$ are:

$[x]_{B'}=\left[\begin{array}{ccc}\alpha_1&\alpha_4&\alpha_7\\\alpha_2&\alpha_5&\alpha_8\\\alpha_3&\alpha_6&\alpha_9\end{array}\right]\left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right]$

Notice that above, the old basis vectors (their coordinates in $B'$) end up as the COLUMNS of the matrix, not the ROWS! So this looks nothing like taking dot products of the original vector $x$ with the vectors of the new basis. How, then, do we get the dot product method from the change-of-basis matrix?

I've tested a few examples and they worked (only when the new basis vectors were orthonormal, of course)…
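
For reference, here is a minimal sketch of the kind of numerical check I ran (using NumPy; the rotation matrix and the test vector are just arbitrary choices):

```python
import numpy as np

# New basis: an orthonormal basis of R^3 (a rotation of the standard basis).
# The columns of P are the new basis vectors in standard coordinates.
theta = 0.7
P = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

x = np.array([1.0, 2.0, 3.0])          # vector in standard coordinates

coords_matrix = np.linalg.solve(P, x)  # traditional method: solve P [x]_{B'} = x
coords_dot    = P.T @ x                # dot product method: x'_i = b'_i . x

print(np.allclose(coords_matrix, coords_dot))  # True, because P is orthogonal
```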

Best Answer

EDIT: In the following, I assumed the new basis was an orthogonal transformation of the old. This is evidently not the case in the document cited in the question, which describes its transformation as a "rotation and stretch." Various nice properties of the orthogonal transformation (such as "the columns of the matrix are the rows of its inverse") are not preserved then, so the discussion after the dividing line (below) only partly applies.

In the linked reference, part B ("Change of Basis", page 3) states that the matrix $P$ represents a "rotation and stretch", which seems to me to imply that the columns of $P$ are pairwise orthogonal but not necessarily orthonormal. At the end of that section it says that, in the matrix product $PX$, the dot products of the rows of $P$ with the columns of $X$ represent the projection of each column of $X$ onto the rows of $P$. If there is any "stretch", this is not the usual kind of projection, and the resulting column of $Y$ does not give the coefficients of a linear combination of the rows of $P$ equal to the original column of $X$. So I question the statement that the rows of $P$ are a new set of basis vectors for expressing the columns of $X$; they do not seem to be so in the usual sense.

---
The dot product method is valid when the new basis is orthonormal.

There are two obvious ways to convert coordinates under an orthogonal change of basis. One looks like a rotation of all the vectors around the origin to arrive at new coordinates. The other looks like the coordinate system itself rotating while all the vectors stay in place.

An important property of these two rotations is that each is exactly opposite the other, that is, they are inverse transformations of each other. A simple example is if you have a piece of paper in front of you with writing on it, but the lines of writing are all tilted so they go from the upper left to lower right along a $45$-degree slope instead of just left to right. To make it so that you can see the text right-side up, you can either rotate the paper $45$ degrees counterclockwise, or you can tilt your head $45$ degrees clockwise--same amount of rotation around the same axis, but in opposite directions.

So if matrix $M$ would accomplish one of the methods of change of basis, you need matrix $M^{-1}$ to do the other.

Another interesting fact about rotations is that the matrix $M$ that performs the rotation is orthogonal, and it has the property that $M^{-1} = M^T$. That is, the inverse of $M$ is just its transpose.
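
In one line: if the columns $m_1,\dots,m_n$ of $M$ are orthonormal, then $(M^T M)_{ij}=m_i\cdot m_j=\delta_{ij}$, so $M^T M=I$ and hence $M^{-1}=M^T$.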

So that's how one matrix can put the basis vectors in columns and the other can put them in rows. One matrix corresponds to turning the paper, the other corresponds to tilting your head. They are inverses of each other, so the columns of one are the rows of the other.
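
To connect this back to the matrix computation in the question, here is the derivation spelled out, under the assumptions that the old basis is the standard one and the new basis $B'$ is orthonormal. Write the new basis vectors as the columns of $P=\left[\begin{array}{cccc}b'_1 & b'_2 & \cdots & b'_n\end{array}\right]$. The traditional method says $x = P\,[x]_{B'}$, i.e. $[x]_{B'}=P^{-1}x$. Since $B'$ is orthonormal, $P^{-1}=P^T$, and the $i$-th row of $P^T$ is $b'_i$, so

$x'_i=(P^T x)_i=b'_i\cdot x,$

which is exactly the dot product method. If $B'$ is not orthonormal, then $P^{-1}\ne P^T$ in general and the dot products no longer give the coordinates.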
