Proving the dot product cosine identity for dimensions $> 2$

Tags: geometry, inner-products, linear-algebra, trigonometry

In $\mathbb{R}^n$, define the dot product $u \cdot v := \sum_{i = 1}^n u_i v_i$. I understand two proofs that
$$u \cdot v = \|u\| \; \|v\| \cos \theta$$
for $n = 2$, where $\|u\| := \sqrt{u \cdot u}$ is the Euclidean norm and $\theta$ is the counterclockwise rotation from $u$ to $v$ (or the clockwise rotation, since $\cos(x) = \cos(2\pi - x)$), but I don't know how to generalize them to $n > 2$.

The first proof is that the identity holds for $u = (1, 0)$ and $v = (\cos \theta, \sin \theta)$ and is invariant under rotation and scaling in $\mathbb{R}^2$. One author says "by an appropriate choice of coordinates we may assume we are working in 2 dimensions," and the other concludes "For higher dimensions, just notice that the two vectors $a$, $b$ span a two-dimensional subspace, for which the argument above applies."
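For concreteness, the invariance step can be written out (a minimal sketch): if $R$ is a rotation matrix, then $R^\top R = I$, so
$$(Ru)\cdot(Rv) = u^\top R^\top R\, v = u^\top v = u \cdot v,$$
and neither the norms nor the angle change; likewise, scaling $u \mapsto cu$ with $c > 0$ leaves the angle alone and multiplies both sides of the identity by $c$.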

The second proof is to write the vectors in polar form and apply the angle-subtraction formula $\cos(x - y) = \cos x \cos y + \sin x \sin y$.
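Written out, that computation is: with $u = \|u\|(\cos\alpha, \sin\alpha)$ and $v = \|v\|(\cos\beta, \sin\beta)$,
$$u \cdot v = \|u\|\,\|v\|(\cos\alpha\cos\beta + \sin\alpha\sin\beta) = \|u\|\,\|v\|\cos(\beta - \alpha) = \|u\|\,\|v\|\cos\theta,$$
since $\theta = \beta - \alpha$ up to sign and a multiple of $2\pi$, and cosine is even and $2\pi$-periodic.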

Is it possible to adapt these proofs to higher dimensions? More strongly, if I have any proof of this identity for $n = 2$, is that sufficient to prove it for $n > 2$ due to some $2$-dimensional subspace argument?

I'm starting to believe that the only elegant way to prove this formula for all $n$ is with the law of cosines. It is somewhat unfortunate that Wikipedia uses this formula to prove the law of cosines… (Don't worry, the page has other proofs.) Bonus points if you have a good name for this formula; "dot product cosine identity" is the best I could do.
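(For reference, here is how that law-of-cosines proof goes, in sketch form: bilinearity of the dot product gives
$$\|u - v\|^2 = (u - v)\cdot(u - v) = \|u\|^2 + \|v\|^2 - 2\,u \cdot v,$$
while the law of cosines applied to the triangle with side lengths $\|u\|$, $\|v\|$, $\|u - v\|$ gives $\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\,\|v\|\cos\theta$; comparing the two proves the identity in any dimension, since the triangle lies in the plane spanned by $u$ and $v$.)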

Best Answer

Here is another elegant proof in $\mathbb{R}^2$:

Let $\mathbf{u}$ and $\mathbf{v}$ be given. Choose $a$ such that $\hat{\mathbf{v}}=a\mathbf{u}$ forms a right triangle with $\mathbf{v}$ and $\mathbf{0}$, as shown in Figure 1. For simplicity, assume $a\gt0$, so that $\hat{\mathbf{v}}$ makes the same angle $\theta$ with $\mathbf{v}$ that $\mathbf{u}$ does (a nice exercise is to show the proof still holds when $\theta$ is obtuse).

[Figure 1: the right triangle formed by $\hat{\mathbf{v}} = a\mathbf{u}$, $\mathbf{v}$, and $\mathbf{0}$, with $\mathbf{z} = \hat{\mathbf{v}} - \mathbf{v}$ perpendicular to $\mathbf{u}$.]

By definition $$\tag{1}\cos\theta = \frac{|\hat{\mathbf{v}} |}{|\mathbf{v}|}.$$ Since $a\mathbf{u}=(au_1,au_2)$ and $\mathbf{z}=(z_1,z_2)$ are perpendicular, their slopes multiply to $-1$ (when both slopes are defined; the vertical/horizontal case is immediate): $$\frac{au_2}{au_1}\frac{z_2}{z_1}=-1\implies au_1z_1+au_2z_2=0$$ and so $$\hat{\mathbf{v}}\cdot \mathbf{z}=\hat{\mathbf{v}} \cdot (\hat{\mathbf{v}}-\mathbf{v})=0$$ $$\implies \hat{\mathbf{v}}\cdot \hat{\mathbf{v}}- \hat{\mathbf{v}}\cdot \mathbf{v}=0$$ $$\tag{2}\implies \hat{\mathbf{v}}\cdot \mathbf{v} =|\hat{\mathbf{v}}|^2. $$ Putting everything together, $$a(\mathbf{u}\cdot \mathbf{v})=\hat{\mathbf{v}}\cdot \mathbf{v}=|\hat{\mathbf{v}}||\mathbf{v}|\cos\theta=a(|\mathbf{u}||\mathbf{v}|\cos\theta)$$ $$\tag{3}\implies \mathbf{u}\cdot \mathbf{v}=|\mathbf{u}||\mathbf{v}|\cos\theta. \qquad \qquad \square$$
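As a consistency check, $(2)$ also determines $a$ explicitly: substituting $\hat{\mathbf{v}} = a\mathbf{u}$ into $(2)$ gives
$$a(\mathbf{u}\cdot\mathbf{v}) = |\hat{\mathbf{v}}|^2 = a^2(\mathbf{u}\cdot\mathbf{u}) \implies a = \frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{u}\cdot\mathbf{u}},$$
which is exactly the coefficient that reappears in the projection formula $(5)$ below.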

Discussion: Going back over the proof, in particular Figure 1, we can see that the decisive step was the decomposition of the vector $\mathbf{v}$ into the sum of two vectors $$\tag{4} \mathbf{v}= \hat{\mathbf{v}} +(-\mathbf{z})=\hat{\mathbf{v}} +\mathbf{z}',$$ one a multiple of $\mathbf{u}$ and the other orthogonal to it. Alternatively, we could say that the cosine law for dot products fell out of the orthogonal projection of $\mathbf{v}$ onto the line $L$ (the $1$-dimensional subspace of $\mathbb{R}^2$ spanned by $\mathbf{u}$). In fact, the vector $\hat{\mathbf{v}}$ is called the orthogonal projection of $\mathbf{v}$ onto $\mathbf{u}$ and, as our proof shows, $$\tag{5} \hat{\mathbf{v}}=\operatorname{proj}_L\mathbf{v}=\frac{\mathbf{v}\cdot\mathbf{u}}{\mathbf{u} \cdot\mathbf{u}}\mathbf{u}.$$ Orthogonal projections naturally generalise to higher dimensions and lead to the orthogonal decomposition theorem, which states that, if $\{\mathbf{u_1},\mathbf{u_2},\dots,\mathbf{u_p}\}$ is any orthogonal basis of a subspace $W \subset\mathbb{R}^n$, then $$\tag{6} \hat{\mathbf{v}}=\frac{\mathbf{v}\cdot\mathbf{u_1}}{\mathbf{u_1}\cdot\mathbf{u_1}}\mathbf{u_1}+\dots+\frac{\mathbf{v}\cdot\mathbf{u_p}}{\mathbf{u_p}\cdot\mathbf{u_p}}\mathbf{u_p}.$$ Geometrically, this theorem states that the projection $\operatorname{proj}_W\mathbf{v}$ is the sum of $1$-dimensional projections onto the basis vectors (which coordinatise $W$). Figure 2 demonstrates the case for $\mathbb{R}^3$.
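As a quick illustration of $(6)$ (with vectors chosen purely for illustration): take $W \subset \mathbb{R}^3$ to be the $xy$-plane with orthogonal basis $\mathbf{u_1} = (1,1,0)$, $\mathbf{u_2} = (1,-1,0)$, and let $\mathbf{v} = (2,0,3)$; then
$$\hat{\mathbf{v}} = \frac{2}{2}(1,1,0) + \frac{2}{2}(1,-1,0) = (2,0,0),$$
the foot of the perpendicular from $\mathbf{v}$ to the plane, as expected.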

[Figure 2: the projection of $\mathbf{v}$ onto a $2$-dimensional subspace $W \subset \mathbb{R}^3$, written as the sum of its projections onto the orthogonal basis vectors.]

So, with the above pictures in mind, we can outline the general proof in $\mathbb{R}^n$. Let $\mathbf{u}$ and $\mathbf{v}$ be any two linearly independent vectors in $\mathbb{R}^n$; then they span a plane $\pi$. Using the properties of orthogonal projections outlined above, the Gram-Schmidt process allows us to construct an orthonormal basis $\{\mathbf{b_1}, \mathbf{b_2}\}$ of $\pi$, as sketched below. Now it is easy to define (exercise) a linear transformation $T$ mapping $\{\mathbf{b_1}, \mathbf{b_2}\}$ to the standard basis $\{\mathbf{e_1}, \mathbf{e_2}\}$ of the plane $\pi'$ consisting of all $n$-tuples $\mathbf{x}=(x_1,x_2,0,\dots,0)$. The map $T$ is an isometry (exercise); that is, $T$ is a distance-preserving map: $$\tag{7} | \mathbf{u}-\mathbf{v}|= | T(\mathbf{u})-T(\mathbf{v})|.$$ Since $T$ is a linear isometry, it also preserves dot products, by the polarization identity $\mathbf{u}\cdot\mathbf{v}=\tfrac{1}{2}\left(|\mathbf{u}|^2+|\mathbf{v}|^2-|\mathbf{u}-\mathbf{v}|^2\right)$, and hence angles. Clearly, in the plane $\pi'$, $$\tag{8} T(\mathbf{u})\cdot T(\mathbf{v})=|T(\mathbf{u})||T(\mathbf{v})|\cos\theta,$$ which we can prove by another orthogonal projection! Since $T$ preserves dot products, norms, and angles, the result must also hold in the plane $\pi$. $\qquad \qquad \square$
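Concretely, the Gram-Schmidt step for two vectors is short (a sketch, under the linear-independence assumption above):
$$\mathbf{b_1} = \frac{\mathbf{u}}{|\mathbf{u}|}, \qquad \mathbf{w} = \mathbf{v} - (\mathbf{v}\cdot\mathbf{b_1})\,\mathbf{b_1}, \qquad \mathbf{b_2} = \frac{\mathbf{w}}{|\mathbf{w}|},$$
where $\mathbf{w}$ is orthogonal to $\mathbf{b_1}$ by the same computation as in $(2)$, and $\mathbf{w} \ne \mathbf{0}$ precisely because $\mathbf{u}$ and $\mathbf{v}$ are linearly independent.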

A Note on Isometries: A function $h:\mathbb{R}^n\rightarrow\mathbb{R}^n$ is an isometry iff it equals an orthogonal transformation followed by a translation: $$\tag{9} h(\mathbf{x})=A\mathbf{x} +\mathbf{p}.$$ There is, however, another, more geometrically appealing way to view isometries: as compositions of reflections. A famous result states that every isometry of $\mathbb{R}^n$ is a composition of at most $n+1$ reflections in hyperplanes. This can be visualised most evocatively in $\mathbb{R}^2$, where it is known as the three reflections theorem. A nice exercise is to show, via a drawing, that $$\tag{10} \operatorname{refl}_L \mathbf{v}=2\operatorname{proj}_L \mathbf{v}-\mathbf{v}.$$
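To make $(10)$ concrete, here is a one-line check in $\mathbb{R}^2$ with $L$ taken to be the $x$-axis (a choice made only for illustration): for $\mathbf{v} = (x,y)$ we have $\operatorname{proj}_L\mathbf{v} = (x,0)$, so
$$2\operatorname{proj}_L\mathbf{v} - \mathbf{v} = (2x,0) - (x,y) = (x,-y) = \operatorname{refl}_L\mathbf{v},$$
which is indeed the mirror image of $\mathbf{v}$ in $L$.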

Edit: For the case where $\theta$ is obtuse, we get the following picture:

[Figure 3: the obtuse case, in which $a < 0$ and $\hat{\mathbf{v}}$ points opposite to $\mathbf{u}$.]
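A sketch of how the algebra goes in this case (this is the exercise mentioned above): now $a < 0$, so $|\hat{\mathbf{v}}| = -a|\mathbf{u}|$, and the right triangle gives $\cos(\pi - \theta) = |\hat{\mathbf{v}}|/|\mathbf{v}|$, i.e. $|\hat{\mathbf{v}}| = -|\mathbf{v}|\cos\theta$. Equation $(2)$ is unchanged, so
$$a(\mathbf{u}\cdot\mathbf{v}) = |\hat{\mathbf{v}}|^2 = (-a|\mathbf{u}|)(-|\mathbf{v}|\cos\theta) = a\,|\mathbf{u}||\mathbf{v}|\cos\theta,$$
and dividing by $a \ne 0$ recovers $(3)$.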