There is one important piece of evidence that has not been mentioned in the "comments" above, namely, de Rham's theorem for compact manifolds, and the idea that harmonic forms uniquely represent (real) cohomology classes. That certainly testifies to the significance of the Laplacian (the kernel of which consists of harmonic forms).
(1) Note that $L_X Y = [X,Y]= \nabla_X Y - \nabla_Y X$. Moreover, on $\mathbb{R}^n$, the covariant derivative is given by $\nabla_X Y = (X(Y_1), \ldots, X(Y_n))$. If $\partial_i$ is the $i$-th coordinate vector field on $\mathbb{R}^n$, then
$$
\begin{align*}
L_{\nabla f}g(\partial_i,\partial_j) &= (\nabla f)(g(\partial_i,\partial_j))-g([\nabla f,\partial_i],\partial_j) - g(\partial_i, [\nabla f, \partial_j]) \\
&= (\nabla f)(\delta_{ij}) + g(\partial_i \nabla f,\partial_j) + g(\partial_i, \partial_j \nabla f) \\
&= 0 + \partial_i\partial_j f + \partial_j \partial_i f = 2\partial_i\partial_j f.
\end{align*}
$$
And this is equal to the $(i,j)$-th entry of the Hessian matrix of $f$ (apart from the factor 2 of course).
(2) For all smooth vector fields $X$, $Y$ we have
$$
\begin{align*}
L_{\nabla f}g(X,Y) &= (\nabla f)(g(X,Y)) - g([\nabla f,X], Y)- g(X,[\nabla f, Y])\\
&= g(\nabla_{\nabla f} X, Y) +g(X,\nabla_{\nabla f}Y) \\
&\qquad- g(\nabla_{\nabla f}X-\nabla_X \nabla f, Y)-g(X,\nabla_{\nabla f}Y- \nabla_Y\nabla f)\\
&= g(\nabla_X \nabla f, Y) + g(X,\nabla_Y\nabla f).
\end{align*}
$$
If we show that the two last terms are equal, then we are done. This follows from the definition of a gradient and the fact that the covariant derivative is torsion free (because we are working with the Levi-Civita connection).
$$
\begin{align*}
g(\nabla_X \nabla f,Y) -g(X,\nabla_Y\nabla f) &= X g(\nabla f,Y) - g(\nabla f, \nabla_X Y) \\
& \qquad- Y g(\nabla f,X) + g(\nabla f,\nabla_Y X) \\
&= X(Y(f)) - Y(X(f)) - [X,Y](f) = 0
\end{align*}
$$
Best Answer
$\def\tr{\mathrm{tr}}\def\dv{\mathrm{div}}\def\gr{\mathrm{grad}}\def\o{\circ}\def\n{\nabla}$You have a small error in your question: the trace you are describing is the trace of a $(0,2)$-tensor $h$ with respect to the metric $g$, which I write $\tr_gh$, reserving $\tr$ for the invariant trace $\tr\, T = T^i_i$ of a $(1,1)$-tensor $T$.
The equality of these two definitions of the Laplacian essentially just falls out of the definitions. The divergence operator is defined on vector fields as the contraction of the covariant derivative, so $\dv = \tr \o \n$. Thus $$\Delta := \dv \o \gr = \tr \o \n \o \gr.$$ Since the gradient is just what we get by raising an index of the covariant derivative using the metric, this becomes $$\Delta = \tr_g \o (\n \o \n) = \tr_g \o \mathrm{Hess},$$ where we used the metric-compatibility of $\nabla$ to commute the metric past the derivative. In abstract index notation this whole calculation is just $$\Delta f = \nabla_i (g^{ij} \nabla_j f)=g^{ij}\nabla_i\nabla_j f.$$