I think the space $W$ should first be equipped with a partial order $\leq$ and a zero element $0$ that satisfy:
- If $a\leq b$ then $ca\leq cb$ for all $a,b\in W$ and all $c\in W$ with $0\leq c$.
- If $a\leq b$ then $a-b\leq 0$.
Secondly, a multiplication $\cdot$ should be defined on $W$ and satisfy $0\leq a\cdot a\stackrel{\triangle}{=}a^2$ for all $a\in W$. The inverse operation of $\cdot$ should also be defined on $W$ (equivalently, every nonzero element has an inverse in $W$). That is, if $ab=c$ and $b\neq 0$ then $a\stackrel{\triangle}{=}c/b$, where $/$ is the inverse operation of $\cdot$. Moreover, $W$ should be closed under these operations: for all $a,b\in W$ we have $a\cdot b\in W$, and $a/b\in W$ if $b\neq 0$. Finally, the operation $\cdot$ should be commutative.
Thirdly, there should be a multiplication between elements of $W$ and elements of $V$ (a scalar multiplication), because the inner product is defined using this operation.
Moreover, for the Cauchy-Schwarz inequality to hold, the properties of the inner product are important. I believe the Cauchy-Schwarz inequality is valid in any space with an inner product defined in the classical way. In other words, suppose the space $V$ carries an inner product $(\cdot,\cdot)$ (say, a bilinear form $V\times V\rightarrow W$) satisfying the following conditions:
- Symmetry: $(x,y)=(y,x)$, $\forall x,y\in V$. (If $V$ is a complex space, the right-hand side should be the complex conjugate, but for the sake of simplicity we ignore that here.)
- Linearity: $(\alpha x+\beta y, z)=\alpha(x,z)+\beta(y,z)$, $\forall x,y,z\in V$ and $\alpha,\beta\in W$.
- Positive definiteness: $(x,x)\geq0$, $\forall x\in V$. Equality holds iff $x=0$, where $0$ denotes the zero element of $V$.
Then, with this definition, the Cauchy-Schwarz inequality holds. The proof is as follows.
For all $\lambda\in W$ and all $x,y \in V$, we have:
\begin{equation}
0\leq (x+\lambda y,x+\lambda y)=(x,x)+2\lambda(x,y)+\lambda^2(y,y)
\end{equation}
If $y=0$, the case is trivial and the Cauchy-Schwarz inequality obviously holds. If $y\neq 0$, let $\lambda=-(x,y)/(y,y)$; then we have:
\begin{equation}
\begin{aligned}
0&\leq(x,x)-2\frac{(x,y)^2}{(y,y)}+\frac{(x,y)^2}{(y,y)^2}(y,y)=(x,x)-\frac{(x,y)^2}{(y,y)},\\
(x,y)^2&\leq (x,x)(y,y).
\end{aligned}
\end{equation}
This is the Cauchy-Schwarz inequality.
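As a quick sanity check of the proof above, here is a minimal numerical sketch. It assumes $V=\mathbb{R}^n$ with the standard dot product and $W=\mathbb{R}$, which is only one instance of the general setting; the helper names `inner`, `quadratic_at_optimal_lambda` and `cauchy_schwarz_gap` are made up for this illustration.

```python
import random

def inner(x, y):
    """Standard dot product on R^n, used here as the inner product (an assumption)."""
    return sum(a * b for a, b in zip(x, y))

def quadratic_at_optimal_lambda(x, y):
    """Value of (x + lam*y, x + lam*y) at lam = -(x,y)/(y,y); requires y != 0."""
    lam = -inner(x, y) / inner(y, y)
    z = [a + lam * b for a, b in zip(x, y)]
    return inner(z, z)

def cauchy_schwarz_gap(x, y):
    """(x,x)(y,y) - (x,y)^2, which the proof shows is nonnegative."""
    return inner(x, x) * inner(y, y) - inner(x, y) ** 2

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(5)]
y = [random.uniform(-1, 1) for _ in range(5)]

# The quadratic at the chosen lambda equals gap / (y,y), so both are >= 0.
print(quadratic_at_optimal_lambda(x, y), cauchy_schwarz_gap(x, y) / inner(y, y))
```

The two printed numbers should agree up to rounding, which mirrors the simplification $(x,x)-(x,y)^2/(y,y)\geq 0$ in the proof.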
In fact, the Cauchy-Schwarz inequality says that the inner product of two elements is at most the product of their lengths, because there is an angle between them. And $W$ is the space in which the inner product on $V$ takes its values. So I think the conditions I assumed at the start are reasonable.
It is not entirely clear what you mean by 'following Tao's argument'.
Suppose $|\langle v,w\rangle| = \|v\| \|w\|$.
Let $w=\alpha v +h$, with $h \bot v$. Then $\|w\|^2 = |\alpha|^2 \|v\|^2 + \|h\|^2$ and $|\langle v,w\rangle| = |\alpha| \|v\|^2$.
Squaring the assumed equality and substituting these expressions gives:
$|\alpha|^2 \|v\|^4 = \|v\|^2 (|\alpha|^2 \|v\|^2 + \|h\|^2)$,
that is, $\|v\|^2\|h\|^2 = 0$. Hence either $v=0$ and we have $v = 0 \cdot w$, or $h=0$ in which case $w = \alpha v$.
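The decomposition argument can be checked numerically. The sketch below assumes real vectors with the standard dot product; `inner` and `decompose` are hypothetical helper names, and the vectors are arbitrary sample data.

```python
def inner(x, y):
    """Standard dot product on R^n (an assumption made for this illustration)."""
    return sum(a * b for a, b in zip(x, y))

def decompose(v, w):
    """Split w = alpha*v + h with h orthogonal to v; v must be nonzero."""
    alpha = inner(v, w) / inner(v, v)
    h = [wi - alpha * vi for vi, wi in zip(v, w)]
    return alpha, h

v = [1.0, 2.0, 2.0]  # arbitrary sample vectors
w = [3.0, 0.0, 4.0]
alpha, h = decompose(v, w)

# The Cauchy-Schwarz gap equals ||v||^2 ||h||^2, so equality forces v = 0 or h = 0.
gap = inner(v, v) * inner(w, w) - inner(v, w) ** 2
print(alpha, gap, inner(v, v) * inner(h, h))
```

Here the gap $\|v\|^2\|w\|^2-\langle v,w\rangle^2$ and the product $\|v\|^2\|h\|^2$ agree up to rounding, which is exactly the identity used to conclude $h=0$ in the equality case.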
Alternative approach:
If $w=0$ or $v=0$ the result holds trivially, so suppose both are nonzero.
Suppose $|\langle v,w\rangle| = \|v\| \|w\|$. Replacing $v$ by $\theta v$ with $|\theta| = 1$ does not change this equality, so we can assume that
$\langle v,w\rangle = \|v\| \|w\|$.
Note that replacing $(v,w)$ by $(tv, {1 \over t} w)$, with $t>0$ does not change the formula.
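Now note that, since $\langle v,w\rangle = \|v\| \|w\|$ is real, $\|tv-{1 \over t}w\|^2 = t^2\|v\|^2 - 2\langle v,w\rangle + {1 \over t^2}\|w\|^2 = (t\|v\|-{1 \over t}\|w\|)^2$.
Now choose $t=\sqrt{\|w\| \over \|v\|}$, which makes $t\|v\|-{1 \over t}\|w\|$ vanish, so $tv = {1 \over t}w$, that is,
$w = {\|w\| \over \|v\|} v$.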
Best Answer
There is also an approach by "amplification" which is really cool. Also the exact same trick works to prove Hölder's inequality and is generally a very important principle for improving inequalities.
It goes like this: We start out with $$\langle a-b,a-b\rangle\ge 0$$ for $a,b$ in your inner product space, and $a\not=0$, $b\not=0$. This implies $$2\langle a,b\rangle\le \langle a, a\rangle + \langle b, b\rangle$$ Now notice that the left hand side is invariant under the scaling $a\mapsto \lambda a$, $b\mapsto \lambda^{-1}b$ for $\lambda>0$. This gives $$2\langle a,b\rangle \le \lambda^2 \langle a,a\rangle + \lambda^{-2}\langle b, b\rangle$$ Now look at the right hand side as a function of the real variable $\lambda$ and find the optimal value for $\lambda$ using calculus (set the derivative to $0$):
$$\lambda^2=\sqrt{\frac{\langle b,b\rangle}{\langle a,a\rangle}}$$
Plugging this value in, we obtain
$$2\langle a,b\rangle\le \sqrt{\langle a,a\rangle}\sqrt{\langle b,b\rangle}+\sqrt{\langle a,a\rangle}\sqrt{\langle b,b\rangle}$$
i.e.
$$\langle a,b\rangle\le\sqrt{\langle a,a\rangle}\sqrt{\langle b,b\rangle}$$
Notice how we took a trivial observation and "optimized" the expression by exploiting scaling invariance.
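To see the optimization step concretely, here is a small sketch that evaluates the right-hand side $\lambda^2\langle a,a\rangle+\lambda^{-2}\langle b,b\rangle$ for sample values. The numbers `aa` and `bb` stand in for $\langle a,a\rangle$ and $\langle b,b\rangle$ and are chosen arbitrarily; the function names are made up for this illustration.

```python
import math

def amplified_bound(aa, bb, lam):
    """Right-hand side lam^2 <a,a> + lam^-2 <b,b> of the scaled inequality."""
    return lam ** 2 * aa + bb / lam ** 2

def optimal_lambda(aa, bb):
    """The minimizer from the text, i.e. lam^2 = sqrt(<b,b>/<a,a>)."""
    return (bb / aa) ** 0.25

aa, bb = 3.0, 12.0  # hypothetical values of <a,a> and <b,b>
lam = optimal_lambda(aa, bb)
best = amplified_bound(aa, bb, lam)

# At the optimum the bound equals 2*sqrt(<a,a><b,b>); other scalings do no better.
print(best, 2 * math.sqrt(aa * bb))
print([amplified_bound(aa, bb, s) >= best - 1e-12 for s in (0.5, 1.0, 2.0, 3.0)])
```

The unscaled choice $\lambda=1$ gives the weaker bound $\langle a,a\rangle+\langle b,b\rangle$, while the optimal $\lambda$ recovers the Cauchy-Schwarz bound.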