What you are doing wrong is assuming that you can apply the "product rule" and "chain rule" to matrix differentiation as you're thinking about it, as is stated in the article here.
There is a "product rule" and "chain rule" that work in this context. However, understanding them requires that you acknowledge that the derivative of $h(a)$ is not simply a scalar-valued function on matrices; rather, at each $a$, $h'(a)$ is a linear functional on matrices, which can be represented nicely as a matrix with the correct choice of dual basis.
In general when $f: \Bbb{R}^n \to \Bbb{R}$ is the euclidean norm, defined by
\begin{equation}
f(x) = |x| = \sqrt{\sum_{i=1}^n x_i^2}
\end{equation}
then away from the origin, $f$ is $C^{\infty}$ and its derivative is given by
\begin{equation}
f'(x)(\eta) = \left\langle \dfrac{x}{|x|}, \eta \right\rangle
\end{equation}
(for all $x \neq 0$, and for all $\eta \in \Bbb{R}^n$).
Now, for convenince, define $g: \Bbb{R} \times \Bbb{R}^n \to \Bbb{R}^n$,
\begin{equation}
g(\varepsilon,x) = \nabla u(x) + \varepsilon \nabla v(x)
\end{equation}
The energy functional's derivative you wish to calculate is given by
\begin{align}
J'(u)(v) &= \dfrac{d}{d \varepsilon} \bigg|_{\varepsilon = 0} J(u+ \varepsilon v) \\
&= \dfrac{d}{d \varepsilon} \bigg|_{\varepsilon = 0} \int_{ x \in\Omega} \dfrac{\left[ (f \circ g)(\varepsilon, x) \right]^p}{p} \\
&= \int_{ x \in\Omega} \dfrac{\partial}{\partial \varepsilon} \bigg|_{\varepsilon = 0} \dfrac{\left[ (f \circ g)(\varepsilon, x) \right]^p}{p} ,
\end{align}
This last equality is by Leibniz's integral rule for differentiation under the integral sign. Now, inside, we just have to make use of the chain rule inside. Doing so yields
\begin{align}
J'(u)(v) &= \int_{ x \in\Omega} \dfrac{p \cdot \left[ f(g(0,x)) \right]^{p-1}}{p} \cdot f'[g(0,x)] \left( \dfrac{\partial g}{\partial \varepsilon}(0,x) \right) \\
&= \int_{ x \in\Omega} |\nabla u(x)|^{p-1} \cdot \left \langle \dfrac{\nabla u(x)}{|\nabla u(x)|} , \nabla v(x) \right \rangle \\
&= \int_{x \in \Omega} |\nabla u(x)|^{p-2} \cdot \left \langle \nabla u(x), \nabla v(x) \right \rangle
\end{align}
Modulo some notation, this is precisely what you wanted to prove.
By the way, my entire answer assumes that all the functions are nice and differentiable, and in particular that if $ p< 2$, then $\nabla u(x)$ doesn't vanish anywhere, so that the norm is differentiable. If this isn't the case, then the argument will necessarily have to be more involved... I'm not too sure how one might proceed in this case though.
If $p\geq 2$, then we can write $|\nabla u|^p = \langle \nabla u, \nabla u\rangle^{p/2}$. In this case, the functions will be differentiable even regardless of whether or not $\nabla u$ vanishes; this is because the function $t \mapsto t^{p/2}$ is everywhere differentiable, and the inner product $\langle \cdot , \cdot \rangle$ is also everywhere differentiable, hence their composition will be as well. Of course, in this case, we will have to slightly modify the above argument, but the final answer is the same
Best Answer
The key step seems to be an initial application of the Leibniz integral rule. We have $$ \frac{d}{ds} \int_{\Omega}|\nabla (u(x) + sv(x))|^2 dx = \\ \int_{\Omega}\frac{d}{ds} |\nabla u(x) + s \nabla v(x)|^2 dx =\\ \int_{\Omega}\frac{d}{ds} (|\nabla u(x)|^2 + s^2|\nabla v(x)|^2 + 2s\nabla u(x)^T \nabla v(x)) dx =\\ 2 \int_\Omega [s|\nabla v(x)|^2 + \nabla u(x)^T \nabla v(x)]dx. $$ Evaluating this at $s = 0$ gives you the desired result.