Looking for intuition behind this proof of Cauchy-Schwarz inequality on $\mathbb{C}$

cauchy-schwarz-inequality, inner-products, linear-algebra

The Cauchy-Schwarz inequality states that for all $x,y\in V$ (where $V$ is an inner product space over $\mathbb{C}$), we have $|\langle x,y\rangle| \leq \|x\| \|y\|$. Moreover, equality holds iff $x$ and $y$ are linearly dependent (for $y \neq 0$, iff $x = \lambda y$ for some $\lambda \in \mathbb{C}$).

The Proof:

We don't have to prove anything if either $x=0$ or $y=0$ (or both). So let $x,y\in V\setminus\{0\}$. Now consider $z\in V$ such that $$z = x - \frac{\langle x,y \rangle}{\|y\|^2} y$$
where $\langle x,y \rangle$ denotes the inner product of $x$ and $y$. Observe that
$$\|z\|^2 = \|x\|^2 - \frac{|\langle x,y\rangle|^2}{\|y\|^2} \geq 0$$ since $\|z\|^2 \geq 0$ by definition. From this it follows that $|\langle x,y\rangle| \leq \|x\| \|y\|$ as desired.
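
The formula for $\|z\|^2$ can be checked by direct expansion. This is only a sketch, assuming the inner product is linear in its first argument and conjugate-linear in its second (the convention under which this $z$ is orthogonal to $y$); the shorthand $c = \langle x,y\rangle/\|y\|^2$ is introduced here for convenience and is not part of the original proof. First, $$\langle z,y\rangle = \langle x,y\rangle - c\,\|y\|^2 = 0,$$ and therefore $$\|z\|^2 = \langle z, x - cy\rangle = \langle z,x\rangle - \bar{c}\,\langle z,y\rangle = \langle z,x\rangle = \|x\|^2 - c\,\langle y,x\rangle = \|x\|^2 - \frac{\langle x,y\rangle\,\overline{\langle x,y\rangle}}{\|y\|^2} = \|x\|^2 - \frac{|\langle x,y\rangle|^2}{\|y\|^2}.$$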

Lack of Intuition:

Where does the $z$ come from? Why have we defined $z$ in this way? Was it really obvious that this is the way to go? I could not have foreseen this proof, even after seeing the definition of $z$; it came out of the blue. Could someone please help me with the intuition here?

Best Answer

Let $(V,\langle\cdot,\cdot\rangle)$ be an inner product space, complex if you like, and let $\{u_j\}_{j\in J}$, $J \subseteq\mathbb N \cup \{\infty\}$, be an orthonormal system (not necessarily a (Schauder) basis). Then for any vector $x \in V$, Bessel's inequality reads $$ \|x\|^2 \ge \sum_{j\in J} |\langle x, u_j\rangle|^2 \tag{*} $$

If the system $\{u_j\}_{j\in J}$ consists of a single unit vector $u$, then taking square roots in $(*)$ gives $$ \|x\| \ge |\langle x, u\rangle|. $$ Now take $\lambda > 0$ and set $y = \lambda u \ne 0$. Multiplying both sides of the latter inequality by $\lambda$, we get $$ \lambda\|x\| = \|y\|\|x\| \ge \lambda|\langle x, u\rangle| = |\langle x, \lambda u\rangle| = |\langle x, y\rangle| $$ since $\lambda$ is the norm of $\lambda u = y$.


The sum on the RHS of $(*)$ is the squared norm of $x_\parallel = \displaystyle \sum_{j\in J} \langle x, u_j\rangle u_j$, which is the component of $x$ lying in the subspace spanned by the system. The remainder is $x_\perp = x - x_\parallel$, which plays the same role as $z$ in your question: you can check that it is orthogonal to $x_\parallel$, i.e. $\langle x_\perp, x_\parallel \rangle = 0$.
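
To tie this back to the $z$ in the question, here is a sketch for the one-element system $u = y/\|y\|$, assuming the inner product is linear in its first argument (the convention under which the formula for $x_\parallel$ above gives the orthogonal projection). Since $1/\|y\|$ is real, it can be pulled out of either slot unchanged: $$ x_\parallel = \langle x, u\rangle u = \left\langle x, \frac{y}{\|y\|}\right\rangle \frac{y}{\|y\|} = \frac{\langle x, y\rangle}{\|y\|^2}\, y, \qquad x_\perp = x - x_\parallel = z. $$ Because $x_\perp$ and $x_\parallel$ are orthogonal, the Pythagorean identity gives $$ \|x\|^2 = \|x_\parallel\|^2 + \|x_\perp\|^2 \ge \|x_\parallel\|^2 = \frac{|\langle x, y\rangle|^2}{\|y\|^2}. $$ Read this way, $z$ is what is left of $x$ after removing its projection onto $y$, and the inequality says that the projection cannot be longer than $x$ itself; equality holds exactly when $z = 0$, i.e. when $x$ is a multiple of $y$.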
