I am going through the properties of the gradient, and in particular I am trying to prove why the gradient points in the direction of steepest ascent. Here is what I have done so far:
$$
\partial_vf(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{v}
$$
where $\partial_vf(\mathbf{a})$ is the directional derivative, $\nabla f(\mathbf{a})$ is the gradient, and $\mathbf{v}$ is a vector with $\|\mathbf{v}\| = 1$, assuming that all quantities exist and are of appropriate dimensions.
The steepest ascent (s.a.) would be the direction in which the directional derivative is highest, thus:
\begin{align}
\text{s.a.} &= \underset{\mathbf{v}}{\operatorname{argmax}}\ \partial_vf(\mathbf{a}) \\
&= \underset{\mathbf{v}}{\operatorname{argmax}}\ \nabla f(\mathbf{a}) \cdot \mathbf{v} \\
&= \underset{\mathbf{v}}{\operatorname{argmax}}\ \|\nabla f(\mathbf{a})\|\ \|\mathbf{v}\| \cos(\phi) \\
&= \underset{\phi}{\operatorname{argmax}} \|\nabla f(\mathbf{a})\|\ \cos(\phi)
\end{align}
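where $\phi \in [0, \pi]$ is the angle between $\nabla f(\mathbf{a})$ and $\mathbf{v}$. Since $\|\nabla f(\mathbf{a})\|$ does not depend on $\phi$, it remains to maximize the cosine:
$$
\underset{\phi}{\operatorname{argmax}}\ \cos(\phi) = 0, \quad \text{since } \cos(\phi) = 1 \iff \phi = 0 \text{ for } \phi \in [0, \pi].
$$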
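As a quick numerical sanity check of the argmax above (not a proof), here is a minimal sketch that samples unit vectors in the plane and picks the one with the largest directional derivative. The example function $f(x, y) = x^2 + 3y$, the point $\mathbf{a} = (1, 2)$, and the helper names `grad_f` and `steepest_direction` are assumptions made purely for illustration.

```python
import math


def grad_f(x, y):
    # Gradient of the hypothetical example f(x, y) = x**2 + 3*y.
    return (2.0 * x, 3.0)


def steepest_direction(g, samples=3600):
    # Brute-force search over unit vectors v = (cos(phi), sin(phi)),
    # maximizing the directional derivative g . v.
    best_v, best_val = None, -math.inf
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        v = (math.cos(phi), math.sin(phi))
        val = g[0] * v[0] + g[1] * v[1]
        if val > best_val:
            best_v, best_val = v, val
    return best_v, best_val


g = grad_f(1.0, 2.0)                      # gradient at the assumed point a = (1, 2)
norm_g = math.hypot(g[0], g[1])           # ||grad f(a)||
unit_g = (g[0] / norm_g, g[1] / norm_g)   # normalized gradient

v_best, d_best = steepest_direction(g)
print("best sampled direction:", v_best)
print("normalized gradient:   ", unit_g)
print("best directional derivative:", d_best, "vs. gradient norm:", norm_g)
```

Up to the sampling resolution, the best sampled direction should agree with the normalized gradient, and the largest directional derivative should agree with the gradient norm.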
Now if I am not mistaken, I need to prove that:
$$
\phi = 0 \iff \mathbf{v} = \frac{\nabla f(\mathbf{a})}{\|\nabla f(\mathbf{a})\|}.
$$
$$
(\Longleftarrow):\
\mathbf{v} = \frac{\nabla f(\mathbf{a})}{\|\nabla f(\mathbf{a})\|} \implies \cos(\phi) = \frac{\mathbf{v} \cdot \nabla f(\mathbf{a})}{\|\mathbf{v}\| \ \|\nabla f(\mathbf{a})\|} = \frac{\|\nabla f(\mathbf{a})\|^2}{\|\nabla f(\mathbf{a})\|^2} = 1 \implies \phi = 0.
$$
With the second direction I am stuck:
$$
\begin{aligned}
(\Longrightarrow):\ \phi = 0 &\implies \cos(\phi) = 1 = \frac{\mathbf{v}\cdot \nabla f(\mathbf{a})}{\|\mathbf{v}\| \ \|\nabla f(\mathbf{a})\|} \\
&\implies \|\mathbf{v}\| \ \|\nabla f(\mathbf{a})\| = \mathbf{v}\cdot \nabla f(\mathbf{a})
\end{aligned}
$$
EDIT:
Since $\|\mathbf{v}\| = 1$, the last equation simplifies:
\begin{align}
\|\mathbf{v}\| \ \|\nabla f(\mathbf{a})\| &= \mathbf{v}\cdot \nabla f(\mathbf{a}) \\
\implies \|\nabla f(\mathbf{a})\| &= \mathbf{v}\cdot \nabla f(\mathbf{a})
\end{align}
Clearly this is true if:
$$
\mathbf{v} = \frac{\nabla f(\mathbf{a})}{\|\nabla f(\mathbf{a})\|}
$$
because then:
$$
\|\nabla f(\mathbf{a})\| = \frac{\nabla f(\mathbf{a})\cdot \nabla f(\mathbf{a})}{\|\nabla f(\mathbf{a})\|} = \frac{\|\nabla f(\mathbf{a})\|^2}{\|\nabla f(\mathbf{a})\|} = \|\nabla f(\mathbf{a})\|
$$
However, this solution is basically obtained by making an ansatz (an educated guess) and I am interested in a more general approach.
Best Answer
Let $g = \nabla f(\mathbf{a})$. We can assume that $g \ne 0$. For $v$ with $\|v\| = 1$ we have $\partial_vf(\mathbf{a}) = g \cdot v$, so the Cauchy-Schwarz inequality gives:
$$|\partial_vf(\mathbf{a})| = |g \cdot v| \le \|g\| \cdot \|v\| = \|g\|.$$
Hence
$$-\|g\| \le \partial_vf(\mathbf{a}) \le \|g\|.$$
Now put $v_1 = \frac{g}{\|g\|}$ and $v_2 = -v_1.$ Then
$$\partial_{v_1}f(\mathbf{a}) = \|g\|$$
and
$$\partial_{v_2}f(\mathbf{a}) = -\|g\|.$$
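So the upper bound is attained at $v_1$ and the lower bound at $v_2$. To also cover the direction $(\Longrightarrow)$ that the question is stuck on, one possible way to see that $v_1$ is the only maximizer is the following sketch, which is an addition to the argument above and uses only $\|v\| = 1$ and $g \ne 0$. Suppose $v$ is a unit vector with $g \cdot v = \|g\|$. Expanding the squared distance between $v$ and $v_1$ gives
$$
\left\| v - \frac{g}{\|g\|} \right\|^2 = \|v\|^2 - 2\,\frac{g \cdot v}{\|g\|} + \frac{\|g\|^2}{\|g\|^2} = 1 - 2 + 1 = 0,
$$
so $v = \frac{g}{\|g\|}$. No guess for $v$ is needed here: any unit vector that attains the maximum must equal the normalized gradient.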