(1) Note that $L_X Y = [X,Y]= \nabla_X Y - \nabla_Y X$. Moreover, on $\mathbb{R}^n$, the covariant derivative is given by $\nabla_X Y = (X(Y_1), \ldots, X(Y_n))$. If $\partial_i$ is the $i$-th coordinate vector field on $\mathbb{R}^n$, then
$$
\begin{align*}
L_{\nabla f}g(\partial_i,\partial_j) &= (\nabla f)(g(\partial_i,\partial_j))-g([\nabla f,\partial_i],\partial_j) - g(\partial_i, [\nabla f, \partial_j]) \\
&= (\nabla f)(\delta_{ij}) + g(\partial_i \nabla f,\partial_j) + g(\partial_i, \partial_j \nabla f) \\
&= 0 + \partial_i\partial_j f + \partial_j \partial_i f = 2\partial_i\partial_j f.
\end{align*}
$$
And this is equal to the $(i,j)$-th entry of the Hessian matrix of $f$ (apart from the factor 2 of course).
(2) For all smooth vector fields $X$, $Y$ we have
$$
\begin{align*}
L_{\nabla f}g(X,Y) &= (\nabla f)(g(X,Y)) - g([\nabla f,X], Y)- g(X,[\nabla f, Y])\\
&= g(\nabla_{\nabla f} X, Y) +g(X,\nabla_{\nabla f}Y) \\
&\qquad- g(\nabla_{\nabla f}X-\nabla_X \nabla f, Y)-g(X,\nabla_{\nabla f}Y- \nabla_Y\nabla f)\\
&= g(\nabla_X \nabla f, Y) + g(X,\nabla_Y\nabla f).
\end{align*}
$$
If we show that the two last terms are equal, then we are done. This follows from the definition of a gradient and the fact that the covariant derivative is torsion free (because we are working with the Levi-Civita connection).
$$
\begin{align*}
g(\nabla_X \nabla f,Y) -g(X,\nabla_Y\nabla f) &= X g(\nabla f,Y) - g(\nabla f, \nabla_X Y) \\
& \qquad- Y g(\nabla f,X) + g(\nabla f,\nabla_Y X) \\
&= X(Y(f)) - Y(X(f)) - [X,Y](f) = 0
\end{align*}
$$
Throughout, assume $M$ is connected and $n$-dimensional.
Note that a function $f$ of vanishing Hessian is uniquely determined by its value and derivative at a single point $f(p),d_pf$. This means that the space of "linear" functions has dimension at most $n+1$. However, it is often the case that there are no nonconstant functions with vanishing Hessian. In order for such functions to exist, $M$ must be locally isometric to a Riemannian product of an $n-1$ dimensional manifold and a $1$-dimensional one, putting significant restrictions on curvature (see here). Even if this splitting is possible, there are also global obstructions. One special case where you do get the full $n+1$-dimensonal set is when $M$ is flat and simply connected.
As for the integral $\mathcal{H}(f)$, it is not guranteed to converge at all, but when it does, we can note that $\mathcal{H}(cf+b)=c\mathcal{H}(f)$ for $c,b\in\mathbb{R}$, so you can obtain lots of functions with arbitrarily small $\mathcal{H}(f)$ simply by finding functions for which $\mathcal{H}(f)$ converges (e.g. compactly supported ones as noted in Ivo Terek's comment) and scaling.
Edit: A note on the flat case
If $M$ is flat, then the fundamental group $\pi_1(M,p)$ acts on $M$ by Euclidean transformations: given a loop $\gamma:[0,1]\to M$ based at $p$, we define $[\gamma]\cdot v=V(1)$, where $V$ is a vector field along $\gamma$ satisfying $V(0)=v$, $\dot{V}=\dot{\gamma}$. $d_pf$ must be invariant under this action, in that $d_pf(v)=d_pf([\gamma]\cdot v)$. In particular, the Hessian has the full $n+1$ dimensional kernel iff this action is trivial (it turns out this occurs iff $M$ can be isometrically immersed in $\mathbb{R}^n$). This is the only obstruction in the flat case, but this action tends to have few invariant 1-forms (e.g. none for flat compact space forms).
Best Answer
The relation between the Riemannian Hessian and the Hessian in Euclidean space is very simple - they are the same. More precisely, the Euclidean Hessian is a particular case of the more general Riemannian Hessian.
Let $(x_1,\ldots,x_n)$ be local coordinates on a neighborhood in a Riemannian manifold $M$, and let $\Gamma_{ij}^k$ denote the Christoffel symbols of the Levi-Civita connection with respect to these coordinates. Let us massage your definition for the Hessian; we consider the vector fields $$X=X^i\frac{\partial}{\partial x_i},\;Y=Y^i\frac{\partial}{\partial x_i}.$$ Then $$\begin{align}X(Y(f))-df(\nabla_XY)&=X^i\frac{\partial}{\partial x_i}\left(Y^j\frac{\partial f}{\partial x_j}\right)-df\left(\nabla_{X^i\frac{\partial}{\partial x_i}}Y^j\frac{\partial}{\partial x_j}\right)\\&=X^i\left(\frac{\partial Y^j}{\partial x_i}\frac{\partial f}{\partial x_j}+Y^j\frac{\partial^2f}{\partial x_i\partial x_j}\right)-df\left(X^i\left(\frac{\partial Y^j}{\partial x_i}\frac{\partial}{\partial x_j}+Y^j\Gamma_{ij}^k\frac{\partial}{\partial x_k}\right)\right)\\&=X^iY^j\frac{\partial^2f}{\partial x_i\partial x_j}-X^iY^j\Gamma_{ij}^k\frac{\partial f}{\partial x_k}.\end{align}$$Now, the Christoffel symbols of the Levi-Civita connection with respect to the usual coordinates on $\mathbb{R}^n$ are all zero. Hence, the Euclidean Hessian matrix of the function $f$ is just the matrix whose $(ij)$ entry is $$\mathrm{Hess}(f)_{ij}=\mathrm{Hess}(f)\left(\frac{\partial}{\partial x_i},\frac{\partial}{\partial x_j}\right),$$where the right hand side is the Riemannian Hessian.