Prove that $A^T \cdot A$ equals $I$ for an orthonormal matrix $A$ directly using matrix multiplication

linear algebra, matrices, orthogonal matrices, orthogonality, orthonormal

Apparently it is always true that $A \cdot A^T=A^T \cdot A=I$ for an orthonormal matrix $A$. I know this can be proven using theorems regarding the rank and invertibility of matrices, but I would like to show it directly with matrix multiplication.

Suppose that
$A =
\begin{bmatrix}
u_1 &v _1 \\
u_2 &v_2 \\
\end{bmatrix}
$
is an orthonormal matrix.
Then we know that $\begin{bmatrix} u_1\\u_2 \end{bmatrix}^T \begin{bmatrix} u_1\\u_2 \end{bmatrix}=1$
and $\begin{bmatrix} v_1\\v_2 \end{bmatrix}^T \begin{bmatrix} v_1\\v_2 \end{bmatrix}=1$, and also that $\begin{bmatrix} u_1\\u_2 \end{bmatrix}^T \begin{bmatrix} v_1\\v_2 \end{bmatrix}=0$

However, we must have that $A \cdot A^T = \begin{bmatrix}
u_1^2+v_1^2 & u_1u_2+v_1v_2 \\
u_1u_2+v_1v_2 & u_2^2+v_2^2 \\
\end{bmatrix}
$
, and I do not see why $u_1^2+v_1^2 =1$ or why $u_1u_2+v_1v_2=0.$ It seems like $A \cdot A^T = I$ is not necessarily true.

Where is my mistake?

Best Answer

I think that the answers/responses given do not exactly answer the question asked. The question seems to have been asked specifically to try to build a concrete understanding of why "orthonormal columns implies orthonormal rows". While this desire probably comes from noble intentions, it is also likely built upon the assumption that the result is obvious, even without using the powerful tools of linear algebra.

However, even in the $2 \times 2$ case (as originally asked), if we tie our hands behind our backs and restrict ourselves to tools that predate linear algebra, it turns out to be rather complicated to show. Still, I'll include an uglier and weaker answer than those already provided, to emphasize the value of learning linear algebra and how non-trivial the results from linear algebra are, even "simple" ones.

To answer the question:

Fix orthonormal vectors $u,v \in \mathbb{R}^{2}$. In particular, $|u| = |v| = 1$. So, we know that there exist $\theta, \alpha$ so that $u = (\cos(\theta), \sin(\theta))$ and $v= (\cos(\alpha), \sin(\alpha))$. Moreover, since $u \cdot v = \cos(\theta)\cos(\alpha) + \sin(\theta)\sin(\alpha) = \cos(\theta - \alpha) = 0$, the difference $\theta - \alpha$ is an odd multiple of $\pi/2$; so, without loss of generality (by replacing $u$ with $-u$ if necessary) we may assume that $\theta = \alpha \pm \pi/2$.

Using the identity $\cos(x \pm \pi/2) = \mp \sin(x)$ we deduce that $u_{1}^{2} + v_{1}^{2} = (\mp \sin(\alpha))^{2} + \cos^{2}(\alpha) = 1$ as desired. The companion identity $\sin(x \pm \pi/2) = \pm \cos(x)$ gives $u_{2}^{2} + v_{2}^{2} = (\pm\cos(\alpha))^{2} + \sin^{2}(\alpha) = 1$ in exactly the same way.

Since $\theta = \alpha \pm \pi/2$ we have $\alpha = \theta \mp \pi/2$. So, $\cos(\alpha) = \pm \sin(\theta)$. Hence, \begin{align*} u_{1} u_{2} + v_{1} v_{2} &= \cos(\theta) \sin(\theta) + \cos(\alpha) \sin(\alpha) \\ & = (\mp \sin(\alpha)) \sin(\theta) + \cos(\alpha) \sin(\alpha) \\ & = \sin(\alpha) \left( \cos(\alpha) \mp \sin(\theta) \right) \\ & = 0, \end{align*} where the last step uses $\cos(\alpha) = \pm \sin(\theta)$, painfully verifying that $A A^{T} =I$.
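(Not part of the original answer, just a numeric sanity check of the trigonometric argument: taking the branch $\theta = \alpha + \pi/2$, build $u$ and $v$ from a few sample angles and confirm that both row norms are $1$ and the off-diagonal entry vanishes.)

```python
import math

def check(alpha: float) -> None:
    # Take the branch theta = alpha + pi/2 of the argument above.
    theta = alpha + math.pi / 2
    u1, u2 = math.cos(theta), math.sin(theta)  # u = (cos theta, sin theta)
    v1, v2 = math.cos(alpha), math.sin(alpha)  # v = (cos alpha, sin alpha)
    # Diagonal entries of A A^T: u1^2 + v1^2 and u2^2 + v2^2 should be 1...
    assert math.isclose(u1**2 + v1**2, 1.0, abs_tol=1e-12)
    assert math.isclose(u2**2 + v2**2, 1.0, abs_tol=1e-12)
    # ...and the off-diagonal entry u1*u2 + v1*v2 should vanish.
    assert math.isclose(u1*u2 + v1*v2, 0.0, abs_tol=1e-12)

for a in (0.0, 0.3, 1.0, 2.5, -1.2):
    check(a)
```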

Good luck generalizing this to higher dimensions.
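By contrast, the linear-algebra route makes any dimension easy to test: orthonormal columns mean $Q^{T}Q = I$, and for a square $Q$ this forces $Q^{-1} = Q^{T}$, hence $QQ^{T} = I$ as well. A small pure-Python sketch (helper names are mine, not from any library) that builds orthonormal columns by Gram-Schmidt and checks $QQ^{T} = I$ numerically:

```python
import math
import random

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors (lists of floats)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis  # a list of mutually orthogonal unit vectors

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

random.seed(0)
n = 4
# Random Gaussian vectors are (almost surely) linearly independent.
rows = gram_schmidt([[random.gauss(0, 1) for _ in range(n)] for _ in range(n)])
Q = transpose(rows)  # now the COLUMNS of Q are orthonormal, i.e. Q^T Q = I
QQt = matmul(Q, transpose(Q))
# The rows turn out orthonormal too: Q Q^T = I up to rounding error.
assert all(math.isclose(QQt[i][j], float(i == j), abs_tol=1e-9)
           for i in range(n) for j in range(n))
```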
