Proof review: Symmetric matrices have real eigenvalues

eigenvalues-eigenvectors · linear-algebra · proof-explanation

This document provides the following proof:

The Spectral Theorem states that if $A$ is an $n \times n$ symmetric matrix with real entries, then it has $n$ orthogonal eigenvectors. The first step of the proof is to show that all the roots of the characteristic polynomial of $A$ (i.e. the eigenvalues of $A$) are real numbers.

Recall that if $z = a + bi$ is a complex number, its complex conjugate is defined by $\bar{z} = a − bi$. We have $z \bar{z} = (a + bi)(a − bi) = a^2 + b^2$, so $z\bar{z}$ is always a nonnegative real number (and equals $0$ only when $z = 0$). It is also true that if $w$, $z$ are complex numbers, then $\overline{wz} = \bar{w}\bar{z}$.
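These two facts are easy to check numerically. The following Python sketch (my addition, not part of the original post) verifies them for a sample pair of complex numbers:

```python
# Sketch: check z*conj(z) = a^2 + b^2 and conj(w*z) = conj(w)*conj(z)
w = 2 + 3j
z = 4 - 1j

a, b = z.real, z.imag
assert abs(z * z.conjugate() - (a**2 + b**2)) < 1e-12      # z*conj(z) is real and nonnegative
assert (w * z).conjugate() == w.conjugate() * z.conjugate() # conjugation distributes over products
```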

Let $\mathbf{v}$ be a vector whose entries are allowed to be complex. It is no longer true that $\mathbf{v} \cdot \mathbf{v} \ge 0$ with equality only when $\mathbf{v} = \mathbf{0}$. For example,

$$\begin{bmatrix} 1 \\ i \end{bmatrix} \cdot \begin{bmatrix} 1 \\ i \end{bmatrix} = 1 + i^2 = 0$$
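A quick NumPy check of this example (again my addition): the plain, unconjugated dot product really does vanish for this nonzero vector.

```python
import numpy as np

v = np.array([1, 1j])
print(np.dot(v, v))  # (0+0j): v . v with no conjugation is zero even though v != 0
```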

However, if $\bar{\mathbf{v}}$ is the complex conjugate of $\mathbf{v}$, it is true that $\bar{\mathbf{v}} \cdot \mathbf{v} \ge 0$ with equality only when $\mathbf{v} = \mathbf{0}$. Indeed,

$$\begin{bmatrix} a_1 - b_1 i \\ a_2 - b_2 i \\ \vdots \\ a_n - b_n i \end{bmatrix} \cdot \begin{bmatrix} a_1 + b_1 i \\ a_2 + b_2 i \\ \vdots \\ a_n + b_n i \end{bmatrix} = (a_1^2 + b_1^2) + (a_2^2 + b_2^2) + \dots + (a_n^2 + b_n^2)$$

which is always nonnegative and equals zero only when all the entries $a_i$ and $b_i$ are zero.
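Numerically, $\bar{\mathbf{v}} \cdot \mathbf{v}$ is exactly the squared norm of $\mathbf{v}$. A small sketch (not in the source) using NumPy's `vdot`, which conjugates its first argument:

```python
import numpy as np

v = np.array([1 + 2j, 3 - 1j, 1j])
print(np.vdot(v, v))            # (16+0j): the sum of the a_i^2 + b_i^2
print(np.sum(np.abs(v) ** 2))   # 16.0: the same value, the squared norm of v
```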

With this in mind, suppose that $\lambda$ is a (possibly complex) eigenvalue of the real symmetric matrix $A$. Then there is a nonzero vector $\mathbf{v}$, possibly with complex entries, such that $A\mathbf{v} = \lambda \mathbf{v}$. Taking the complex conjugate of both sides, and noting that $\overline{A} = A$ since $A$ has real entries, we get $\overline{A\mathbf{v}} = \overline{\lambda \mathbf{v}} \Rightarrow A \overline{\mathbf{v}} = \overline{\lambda} \overline{\mathbf{v}}$. Then, using that $A^T = A$,

$$\overline{\mathbf{v}}^T A \mathbf{v} = \overline{\mathbf{v}}^T(A \mathbf{v}) = \overline{\mathbf{v}}^T(\lambda \mathbf{v}) = \lambda(\overline{\mathbf{v}} \cdot \mathbf{v}),$$

$$\overline{\mathbf{v}}^T A \mathbf{v} = (A \overline{\mathbf{v}})^T \mathbf{v} = (\overline{\lambda} \overline{\mathbf{v}})^T \mathbf{v} = \overline{\lambda}(\overline{\mathbf{v}} \cdot \mathbf{v}).$$

Since $\mathbf{v} \not= \mathbf{0}$, we have $\overline{\mathbf{v}} \cdot \mathbf{v} \not= 0$. Thus $\lambda = \overline{\lambda}$, which means $\lambda \in \mathbb{R}$.
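As a sanity check (not part of the textbook's proof), here is a sketch that builds a random real symmetric matrix and confirms its eigenvalues have no imaginary part, even when computed by a general routine that does not assume symmetry:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                        # a real symmetric matrix

eigvals = np.linalg.eigvals(A)           # general eigenvalue routine, no symmetry assumed
print(np.max(np.abs(np.imag(eigvals))))  # ~0: every eigenvalue is (numerically) real
```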

How does the author get from $\overline{\mathbf{v}}^T(\lambda \mathbf{v})$ to $\lambda(\overline{\mathbf{v}} \cdot \mathbf{v})$ and from $(\overline{\lambda} \overline{\mathbf{v}})^T \mathbf{v}$ to $\overline{\lambda}(\overline{\mathbf{v}} \cdot \mathbf{v})$?

I would appreciate it if someone could please take the time to clarify this.

Best Answer

Apparently, the author defines $x\cdot y=x^Ty$, even when $x$ or $y$ has complex entries. This is a bit different from the definition of the usual inner product $\langle x,y\rangle=\overline{y}^Tx$ (or $\langle x,y\rangle=\overline{x}^Ty$, depending on convention).

Thus $\overline{v}^T(\lambda v)=\lambda(\overline{v}^Tv)=\lambda(\overline{v}\cdot v)$ and $(\overline{\lambda}\overline{v})^Tv=\overline{\lambda}(\overline{v}^Tv)=\overline{\lambda}(\overline{v}\cdot v)$; in both cases the scalar $\lambda$ or $\overline{\lambda}$ is simply pulled out of the product.
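The distinction between the two conventions is exactly the difference between NumPy's `dot` (no conjugation) and `vdot` (conjugates its first argument); a sketch illustrating this, under my reading of the answer's notation:

```python
import numpy as np

x = np.array([1, 1j])
y = np.array([1j, 2])

print(np.dot(x, y))   # x^T y       -- the author's x . y, no conjugation: 3j
print(np.vdot(y, x))  # conj(y)^T x -- the usual inner product <x, y>: 1j
```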
