Quaternions – Motivation for Construction of Cross-Product

Tags: cross-product, motivation, quaternions

I'm trying to present a narrative that brings the (3D) Cross Product into existence.

"Given two vectors $\mathbf u$, $\mathbf v$, how to construct a vector perpendicular to both?"

… looks like a good motivating problem.

So, given two vectors $\mathbf u$, $\mathbf v$, we must create a third vector $\mathbf w$ such that $\mathbf w \cdot \mathbf u = \mathbf w \cdot \mathbf v = 0$. This gives:

$$\mathbf w = \lambda\,(u_2 v_3 - u_3 v_2,\; u_3 v_1 - u_1 v_3,\; u_1 v_2 - u_2 v_1)$$

… and, taking $\lambda=1$, to establish the usual cross-product rule we would then need to show that $|\mathbf w|$ gives the area of the parallelogram formed by $\mathbf u$ and $\mathbf v$.
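(The perpendicularity itself is a one-line check, since every term cancels in pairs:

$$\mathbf w \cdot \mathbf u = u_1(u_2 v_3 - u_3 v_2) + u_2(u_3 v_1 - u_1 v_3) + u_3(u_1 v_2 - u_2 v_1) = 0,$$

and likewise $\mathbf w \cdot \mathbf v = 0$.)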

But this approach seems to require several lines of moderately dense algebra.

I tried looking for a geometrical approach and found a handful of links:

Cross product as result of projections
Explanation of a cross product result

However, I'm struggling to see it. (EDIT: nearly there: Geometric understanding of the Cross Product)

Another potentially interesting approach is Quaternions. I found a very interesting article here: http://www.johndcook.com/blog/2012/02/15/dot-cross-and-quaternion-products/

The author observes that if $i,j,k$ are defined so that $i^2=j^2=k^2=ijk=-1$, then multiplying two pure quaternions $\mathbf u=0+bi+cj+dk$ and $\mathbf v=0+fi+gj+hk$ gives:

$$\mathbf u \mathbf v = -(\mathbf u \cdot \mathbf v) + \mathbf u \times \mathbf v$$
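If it helps, this identity is easy to verify symbolically; here is a minimal sketch using SymPy's `Quaternion` class (the variable names are mine):

```python
from sympy import symbols, expand, Matrix
from sympy.algebras.quaternion import Quaternion

b, c, d, f, g, h = symbols('b c d f g h', real=True)

u = Quaternion(0, b, c, d)   # pure quaternion u = b*i + c*j + d*k
v = Quaternion(0, f, g, h)   # pure quaternion v = f*i + g*j + h*k
p = u * v                    # Hamilton product

dot   = b*f + c*g + d*h                            # u . v
cross = Matrix([c*h - d*g, d*f - b*h, b*g - c*f])  # u x v

# Scalar part of uv is -(u . v), vector part is u x v:
print(expand(p.a + dot))                        # 0
print(expand(Matrix([p.b, p.c, p.d]) - cross))  # Matrix([[0], [0], [0]])
```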

So my question is: On account of this succinctness, should the Quaternion approach be considered the origin of the Cross Product? Can anyone present an elegant narrative that brings the Cross Product into existence?

PS Links to investigate:
http://math.oregonstate.edu/bridge/papers/dot+cross.pdf
https://en.wikiversity.org/wiki/Cross_product
https://www.physicsforums.com/threads/explanation-of-the-cross-product.513287/
http://behindtheguesses.blogspot.co.uk/2009/04/dot-and-cross-products.html

Best Answer

Before meeting the cross product, students will have already met the dot product. This is a form of multiplication that takes two vectors and gives you a scalar. It's natural to wonder whether there's also some kind of multiplication that takes two vectors and gives you another vector.

Obviously we could just write down any old function and call it "the cross product". But in order for it to actually be a nice form of multiplication there are a few properties that we would expect it to have:

$$\begin{align*} (\lambda\mathbf u) \times \mathbf v &= \lambda(\mathbf u \times \mathbf v)=\mathbf u \times (\lambda\mathbf v)\\ \mathbf u \times (\mathbf v+\mathbf w) &= \mathbf u \times \mathbf v+\mathbf u \times \mathbf w\\ (\mathbf u + \mathbf v)\times\mathbf w &= \mathbf u \times \mathbf w+\mathbf v \times \mathbf w \end{align*}$$

(these properties are called "being bilinear").

There's one more property that we require in order for the cross product to make sense geometrically, which is that it shouldn't depend on which way you are looking at your problem. If we rotate our vectors and then take their cross product we should get the same answer as if we take their cross product and then rotate. Otherwise people looking at the same problem from different angles would get different answers! In symbols we represent the rotation by some orthogonal matrix $M$ with determinant $1$, and say that for each such matrix we want the following property:

$$M(\mathbf u\times\mathbf v)=(M\mathbf u)\times(M\mathbf v)$$

(This is called "invariance" or sometimes "covariance". Notice that the dot product also makes geometric sense in this way: if you rotate two vectors and then take their dot product, you get the same answer as you would have gotten before the rotation. In other words, $(M\mathbf u)\cdot(M\mathbf v)=\mathbf u\cdot\mathbf v$.)
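As an aside, this invariance is easy to spot-check numerically; here is a minimal sketch with NumPy (the particular rotation and vectors are arbitrary choices of mine):

```python
import numpy as np

# A proper rotation: angle theta about the z-axis (any rotation would do).
theta = 0.7
M = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

u = np.array([1.0, 2.0, 3.0])
v = np.array([-4.0, 0.5, 2.0])

# Rotate-then-cross equals cross-then-rotate:
print(np.allclose(M @ np.cross(u, v), np.cross(M @ u, M @ v)))  # True

# The dot product is likewise unchanged by rotation:
print(np.isclose((M @ u) @ (M @ v), u @ v))                     # True
```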

Now here's the clever bit: I claim that the cross product is the only function with these properties. This explains why the cross product is interesting: it's the only form of multiplication that makes any sense at all! (Actually there are also the functions like $\lambda\mathbf u\times\mathbf v$ that are just scalings of the cross product by some factor, but each of these functions can be written in terms of any of the others and so we can just pick the usual cross product to be our favourite, and work in terms of that.)


Proof

I'll show that if we have a function, $\times$, that is bilinear and invariant (i.e. it obeys the four equations listed above) then it is in fact the cross product. We'll work in terms of the usual basis vectors $\mathbf i$, $\mathbf j$ and $\mathbf k$.

First we'll try to work out what $\mathbf i \times \mathbf j$ is. Let $M$ be the rotation of $180^\circ$ about the $\mathbf k$-axis. Then $M\mathbf i=-\mathbf i$ and $M\mathbf j=-\mathbf j$. So we have

$$M(\mathbf i \times \mathbf j)=(M\mathbf i)\times(M\mathbf j)=(-\mathbf i)\times(-\mathbf j)=\mathbf i\times\mathbf j$$

(the last step used bilinearity to move the minus signs out so they could cancel). This means that $\mathbf i \times \mathbf j$ is fixed by the rotation $M$, and so must lie on the $\mathbf k$-axis. As I said before, we're going to allow ourselves to pick our favourite scaling, so since we know $\mathbf i \times \mathbf j$ is on the $\mathbf k$-axis we might as well assume that $\mathbf i \times \mathbf j=\mathbf k$.
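(For concreteness, in coordinates this $M$ is

$$M=\begin{pmatrix}-1&0&0\\0&-1&0\\0&0&1\end{pmatrix},\qquad M\mathbf i=-\mathbf i,\quad M\mathbf j=-\mathbf j,\quad M\mathbf k=\mathbf k,$$

and the only vectors it fixes are the scalar multiples of $\mathbf k$.)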

There's a rotation (of $120^\circ$ about $\mathbf i+\mathbf j+\mathbf k$) that takes $\mathbf i$ to $\mathbf j$, $\mathbf j$ to $\mathbf k$, and $\mathbf k$ to $\mathbf i$. Applying invariance under this matrix to our equation $\mathbf i \times \mathbf j=\mathbf k$ gives us $\mathbf j \times \mathbf k=\mathbf i$. Applying it again gives $\mathbf k \times \mathbf i=\mathbf j$.
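(Concretely, this rotation is the cyclic permutation matrix

$$R=\begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix},\qquad R\mathbf i=\mathbf j,\quad R\mathbf j=\mathbf k,\quad R\mathbf k=\mathbf i,$$

which has determinant $1$ and so is indeed a rotation.)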

There's also a rotation (of $180^\circ$ about $\mathbf i+\mathbf j$) that takes $\mathbf i$ to $\mathbf j$, $\mathbf j$ to $\mathbf i$, and $\mathbf k$ to $-\mathbf k$. Applying invariance under this matrix to our equation $\mathbf i \times \mathbf j=\mathbf k$ gives us $\mathbf j \times \mathbf i=-\mathbf k$. Similarly we have $\mathbf k \times \mathbf j=-\mathbf i$ and $\mathbf i \times \mathbf k=-\mathbf j$.

Finally we want to know what $\mathbf i \times \mathbf i$ is. Let $M$ be the $180^\circ$ rotation about the $\mathbf k$-axis, as before. Then

$$M(\mathbf i \times \mathbf i)=(M\mathbf i)\times(M\mathbf i)=(-\mathbf i)\times(-\mathbf i)=\mathbf i\times\mathbf i$$

so $\mathbf i \times \mathbf i$ is fixed by $M$ and therefore lies on the $\mathbf k$-axis. But the same argument applied with a rotation about the $\mathbf j$-axis shows that $\mathbf i \times \mathbf i$ lies on the $\mathbf j$-axis too! These two axes only intersect at $\mathbf 0$. So $\mathbf i \times \mathbf i=\mathbf 0$ and by the same argument $\mathbf j \times \mathbf j=\mathbf 0$ and $\mathbf k \times \mathbf k=\mathbf 0$.

Now, since we know how to take the cross product of any two basis vectors, we can calculate the cross product of any two vectors by multiplying out (using bilinearity):

$$\begin{align*}
(u_i\mathbf i+u_j\mathbf j+u_k\mathbf k)\times(v_i\mathbf i+v_j\mathbf j+v_k\mathbf k)
&= u_iv_i\,\mathbf i\times\mathbf i+u_iv_j\,\mathbf i\times\mathbf j+u_iv_k\,\mathbf i\times\mathbf k\\
&\quad+u_jv_i\,\mathbf j\times\mathbf i+u_jv_j\,\mathbf j\times\mathbf j+u_jv_k\,\mathbf j\times\mathbf k\\
&\quad+u_kv_i\,\mathbf k\times\mathbf i+u_kv_j\,\mathbf k\times\mathbf j+u_kv_k\,\mathbf k\times\mathbf k\\
&=(u_jv_k-u_kv_j)\,\mathbf i+(u_kv_i-u_iv_k)\,\mathbf j+(u_iv_j-u_jv_i)\,\mathbf k
\end{align*}$$

This is the formula for the cross product.
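As a quick sanity check (my own addition), the derived components agree with the library cross product for arbitrary inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
u, v = rng.standard_normal(3), rng.standard_normal(3)

# Components from the formula derived above:
w = np.array([u[1]*v[2] - u[2]*v[1],
              u[2]*v[0] - u[0]*v[2],
              u[0]*v[1] - u[1]*v[0]])

print(np.allclose(w, np.cross(u, v)))  # True
```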


The above was a rewrite of my original answer which said more or less the same thing as above but in more formal terms. I'll put my original answer here because I think some people reading this might like to see the technical details:

Given a $3$-dimensional oriented real inner-product space $V$ the group of symmetries preserving the inner-product and orientation is $\mathrm{SO}(V)$. The invariant tensors under $\mathrm{SO}(V)$ are $\delta_{ij}$, $\delta^{ij}$, and $\varepsilon_{ijk}$, along with the things they generate like $\delta^{ij}\varepsilon_{klm}$ and so on.

The tensors $\delta_{ij}$ and $\delta^{ij}$ are the inner-product and the inner-product induced on the dual space. These aren't very interesting because we defined $\mathrm{SO}(V)$ to preserve these, so we already knew that we were going to get them. But the tensor $\varepsilon_{ijk}$ is in some sense new. Therefore we are motivated to investigate $\varepsilon_{ijk}$ or equivalently the bilinear map $V\times V\rightarrow V$ given by $(v\times w)^i=\delta^{ij}\varepsilon_{jkl}v^kw^l$. This is the cross product.
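(Unwinding the indices in an orthonormal basis, where $\delta^{ij}$ is just the identity, recovers the familiar components: for example $(\mathbf v\times\mathbf w)^1=\varepsilon_{1kl}v^kw^l=v^2w^3-v^3w^2$.)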
