If you know how to commute covariant derivatives of an arbitrary tensor, then this is not so hard: since the metric is parallel ($\nabla g = 0$), so that $g^{ij}$ commutes with covariant derivatives, we have $$\begin{align}
\nabla_k \Delta T - \Delta \nabla_k T &= g^{ij}(\nabla_k \nabla_i \nabla_j T - \nabla_i \nabla_j \nabla_k T) \\
&= g^{ij}([\nabla_k, \nabla_i] \nabla_j T +\nabla_i\nabla_k\nabla_j T- \nabla_i \nabla_j \nabla_k T) \\
&= g^{ij}([\nabla_k, \nabla_i] \nabla_j T + \nabla_i([\nabla_k,\nabla_j] T))\tag{1}.\end{align}$$
Here I'm using the abstract index notation, so e.g. $\nabla_i \nabla_j \nabla_k T = (\nabla^3 T)(\partial_i, \partial_j, \partial_k, \ldots)$ is the covariant third derivative.
Since commutators of covariant derivatives always end up just being some contractions with the curvature, we can immediately see from $(1)$ that the formula will have the form $$[\nabla,\Delta]T = R* \nabla T + \nabla( R * T) = R * \nabla T + \nabla R * T \tag{2},$$ where the notation $S * T$ denotes an arbitrary linear combination of contractions of $S \otimes T$. Thus you should expect not only curvature terms, but also derivatives of the curvature. (That is, unless $n = 0$ - meaning $T$ is just a scalar - in which case the second term vanishes because the Hessian of a scalar is symmetric.) For many purposes
$(2)$ is all you need.
To get an explicit expansion of $(1)$ in terms of the curvature tensor, we first need to write down an expression for $[\nabla_k, \nabla_j] T$.$\def\t{\otimes}$ When we take the second covariant derivative of $T = T_{i_1 \ldots i_n}dx^{i_1}\t\cdots\t dx^{i_n}$ and fully expand using the product rule, all the cross terms will vanish thanks to the antisymmetrization; so the derivative commutator also satisfies a Leibniz rule: we find $$[\nabla_k, \nabla_j] T = [\nabla_k,\nabla_j](T_{i_1 \ldots i_n})dx^{i_1}\t\cdots\t dx^{i_n}\\ + T_{i_1 \ldots i_n} ([\nabla_k,\nabla_j]dx^{i_1})\t\cdots \t dx^{i_n} \\ \vdots \\+ T_{i_1 \ldots i_n} dx^{i_1}\t\cdots \t([\nabla_k,\nabla_j]dx^{i_n}).$$
Since the curvature tensor is characterized by $R_{kj}{}^i{}_l dx^l = [\nabla_k,\nabla_j]dx^i$ and the first two derivatives of a scalar commute, this becomes $$[\nabla_k, \nabla_j] T=T_{i_1 \ldots i_n}(R_{kj}{}^{i_1}{}_ldx^l \t \cdots \t dx^{i_n} + \cdots + R_{kj}{}^{i_n}{}_l dx^{i_1} \t \cdots \t dx^l),$$ which (doing some index gymnastics) we can write as
$$[\nabla_k,\nabla_j]T_{i_1\ldots i_n} = R_{kj}{}^l{}_{i_1} T_{l i_2 \ldots i_n}+\cdots + R_{kj}{}^l{}_{i_n}T_{i_1 \ldots i_{n-1} l}.$$
Since you can now commute derivatives of an arbitrary $n$-tensor, you can do it for an $(n+1)$-tensor too; so if you really want to write out the full mess, you can plug this into $(1)$ for the two commutator terms and then expand everything with the product rule. I don't think there's much insight to be gained by typing it all out - it's just going to be a mess of contractions, some of $R$ with $\nabla T$ and some of $\nabla R$ with $T$. Perhaps try it yourself for a very small value of $n$.
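For example, take $n = 1$ and write $T = T_m\,dx^m$ (the index labels here are mine). Applying the commutator formula above to the $2$-tensor $\nabla_j T_m$ and to $T_m$, and substituting into $(1)$, one gets $$\nabla_k \Delta T_m - \Delta \nabla_k T_m = g^{ij}\Big(R_{ki}{}^l{}_j \nabla_l T_m + R_{ki}{}^l{}_m \nabla_j T_l + R_{kj}{}^l{}_m \nabla_i T_l + (\nabla_i R_{kj}{}^l{}_m) T_l\Big),$$ where the first two terms come from $[\nabla_k,\nabla_i]\nabla_j T$ acting on the $j$ and $m$ slots respectively, and the last two from expanding $\nabla_i([\nabla_k,\nabla_j]T)$ with the product rule - exactly the $R * \nabla T + \nabla R * T$ shape promised by $(2)$.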
You're almost there. Using the Einstein summation convention, so that we sum over repeated indices, we have
$$\Delta \nabla_i f = \nabla^j \nabla_j \nabla_i f = \nabla^j \nabla_i \nabla_j f = - R^j{}_i{}^k{}_j \nabla_k f +\nabla_i \nabla^j \nabla_j f = g^{jk} \operatorname{Ric}_{ij} \nabla_k f + \nabla_i \Delta f.$$
The third equality follows from applying the Ricci identity for $1$-forms that you mention in your edit to $\nabla_j f$, and the fourth from the symmetries of the curvature tensor $R$ and the definitions of $\operatorname{Ric}$ and $\Delta$.
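If you want to see this identity hold concretely, here is a symbolic sanity check on the unit round $2$-sphere, where $\operatorname{Ric} = g$ and the identity reduces to $\Delta \nabla_i f - \nabla_i \Delta f = \nabla_i f$. This is my own sketch using sympy (the coordinate names and helper arrays are mine, not from the answer):

```python
import sympy as sp

# Unit round 2-sphere in coordinates (theta, phi); here Ric = g, so the
# commutator identity reduces to: Delta(df)_i - d(Delta f)_i = (df)_i.
th, ph = sp.symbols('theta phi', positive=True)
x = [th, ph]
n = 2
f = sp.Function('f')(th, ph)

g = sp.Matrix([[1, 0], [0, sp.sin(th)**2]])
ginv = g.inv()

# Christoffel symbols: Gamma[k][i][j] = Gamma^k_{ij}
Gamma = [[[sum(ginv[k, l] * (sp.diff(g[l, i], x[j]) + sp.diff(g[l, j], x[i])
                             - sp.diff(g[i, j], x[l])) for l in range(n)) / 2
           for j in range(n)] for i in range(n)] for k in range(n)]

# Successive covariant derivatives of f:
# V[i] = nabla_i f,  A[j][i] = nabla_j nabla_i f,  B[l][j][i] = nabla_l nabla_j nabla_i f
V = [sp.diff(f, x[i]) for i in range(n)]
A = [[sp.diff(V[i], x[j]) - sum(Gamma[m][j][i] * V[m] for m in range(n))
      for i in range(n)] for j in range(n)]
B = [[[sp.diff(A[j][i], x[l])
       - sum(Gamma[m][l][j] * A[m][i] for m in range(n))
       - sum(Gamma[m][l][i] * A[j][m] for m in range(n))
       for i in range(n)] for j in range(n)] for l in range(n)]

# Delta f = g^{jl} nabla_l nabla_j f  (a scalar, so nabla_i of it is just d_i)
lap_f = sum(ginv[j, l] * A[l][j] for j in range(n) for l in range(n))

for i in range(n):
    lap_df = sum(ginv[j, l] * B[l][j][i] for j in range(n) for l in range(n))
    # Delta nabla_i f - nabla_i Delta f should equal Ric_i^k nabla_k f = nabla_i f
    assert sp.simplify(lap_df - sp.diff(lap_f, x[i]) - V[i]) == 0
print("commutator identity verified on the unit sphere")
```

The check is coordinate-based on purpose: it exercises exactly the index formulas above, with no geometry package doing the work behind the scenes.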
Personally I think it's a tragedy that no book explains how to use this kind of notation - we mostly have to learn it by guessing. But to spare you that suffering, here's an explanation I hope will help: you're right that $Z^i$ is not globally defined, but that doesn't matter. Here the author is only interested in working in local coordinates, and by the nature of a connection, that is harmless. What Aubin actually writes is:
$$R_{kij}^{\ell}Z^k = \nabla_i \nabla_j Z^{\ell} - \nabla_{j}\nabla_{i}Z^{\ell}$$
Let's see what both sides of this equality mean. On the left side, we have a real number expressed in Einstein notation, so on the right side we must have a real number as well. Actually, it's a bit more subtle than that, and to say only this would be skipping a few steps. On both sides of the above equality we have a real-valued function. What this equality actually means is that once you fix a chart $x: U \to \mathbb{R}^n$ around a point $q \in M$ (with $U \subseteq M$ open), it's true that:
$$(R_{kij}^{\ell}Z^k)(p) = (\nabla_i \nabla_j Z^{\ell} - \nabla_{j}\nabla_{i}Z^{\ell})(p)$$ for every $p \in U$, where here $\partial_i \doteq \frac{\partial}{\partial x^i}$ for convenience. Now, $Z^k$ is the function $Z^k: U \to \mathbb{R}$ defined by $Z^k(p) = \text{$k$-th component of $Z(p)$ in the basis $\{\partial_1, \cdots, \partial_n\}$}$. Similarly, $R_{kij}^{\ell}$ is the function $R_{kij}^{\ell}: U \to \mathbb{R}$ defined by $(R_{kij}^{\ell})(p) = \text{$\ell$-th component of $(R(\partial_i, \partial_j)\partial_k)(p)$ in the same basis}$. And $\nabla_i \nabla_j Z^{\ell}$ is the function defined by $$(\nabla_i \nabla_j Z^{\ell})(p) = \text{$\ell$-th component of $(\nabla_i \nabla_j Z)(p)$ in the same basis }$$
Of course we still have to check this equality is indeed valid. But this is clear: writing in this local chart $Z = Z^k \partial_{k}$, we get $R(\partial_i, \partial_j)Z = Z^k R_{kij}^{\ell} \partial_{\ell}$ almost by definition. Therefore:
$$Z^k R_{kij}^{\ell} = (R(\partial_i, \partial_j)Z)^{\ell} = (\nabla_i \nabla_j Z - \nabla_{j}\nabla_{i}Z)^{\ell} = \nabla_i \nabla_j Z^{\ell} - \nabla_{j}\nabla_{i}Z^{\ell}$$
as previously claimed.
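To make this concrete, here is a small sympy sketch (my own test case, not Aubin's) that verifies $R_{kij}^{\ell}Z^k = \nabla_i \nabla_j Z^{\ell} - \nabla_{j}\nabla_{i}Z^{\ell}$ componentwise on the unit $2$-sphere, for a symbolic vector field $Z$:

```python
import sympy as sp

# Verify R^l_{kij} Z^k = nabla_i nabla_j Z^l - nabla_j nabla_i Z^l on the
# unit 2-sphere, for a symbolic vector field Z. Names are mine.
th, ph = sp.symbols('theta phi', positive=True)
x = [th, ph]
n = 2
Z = [sp.Function('Z1')(th, ph), sp.Function('Z2')(th, ph)]

g = sp.Matrix([[1, 0], [0, sp.sin(th)**2]])
ginv = g.inv()

# Christoffel symbols: Gamma[k][i][j] = Gamma^k_{ij}
Gamma = [[[sum(ginv[k, l] * (sp.diff(g[l, i], x[j]) + sp.diff(g[l, j], x[i])
                             - sp.diff(g[i, j], x[l])) for l in range(n)) / 2
           for j in range(n)] for i in range(n)] for k in range(n)]

# Riemann tensor: Riem[l][k][i][j] = R^l_{kij}, the l-th component of
# R(d_i, d_j) d_k = (nabla_i nabla_j - nabla_j nabla_i) d_k.
Riem = [[[[sp.diff(Gamma[l][j][k], x[i]) - sp.diff(Gamma[l][i][k], x[j])
           + sum(Gamma[l][i][m] * Gamma[m][j][k]
                 - Gamma[l][j][m] * Gamma[m][i][k] for m in range(n))
           for j in range(n)] for i in range(n)] for k in range(n)]
        for l in range(n)]

# W[l][i] = nabla_i Z^l ;  DW[l][i][j] = nabla_i nabla_j Z^l, treating
# nabla Z as a (1,1)-tensor when taking the second derivative.
W = [[sp.diff(Z[l], x[i]) + sum(Gamma[l][i][k] * Z[k] for k in range(n))
      for i in range(n)] for l in range(n)]
DW = [[[sp.diff(W[l][j], x[i])
        + sum(Gamma[l][i][m] * W[m][j] for m in range(n))
        - sum(Gamma[m][i][j] * W[l][m] for m in range(n))
        for j in range(n)] for i in range(n)] for l in range(n)]

for l in range(n):
    for i in range(n):
        for j in range(n):
            lhs = sum(Riem[l][k][i][j] * Z[k] for k in range(n))
            assert sp.simplify(lhs - (DW[l][i][j] - DW[l][j][i])) == 0
print("Ricci identity verified componentwise")
```

Note that the equality is between the component *functions*, exactly as discussed above: both sides are expressions in the chart, and they agree identically.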
As for the latter definition, here's what's happening in all this mess: in Ricci calculus, working in local coordinates as before, if $X = X^{i} \partial_i$, we sometimes denote this just by writing $X = X^i$.

Now, for a fixed vector field $Y$, we can think of $\nabla Y$ as a $(1, 1)$ tensor field, given by $\Gamma(TM) \ni X \mapsto \nabla_{X} Y \in \Gamma(TM)$. We define the trace of a $(1,1)$ tensor $T$ as $\text{tr}(T): M \to \mathbb{R}$, $p \mapsto (\text{tr}(T))(p) = \displaystyle{\sum_{i} g(T(E_i), E_i)(p)}$, where $\{E_1, \cdots, E_n\}$ is an orthonormal basis of $T_p M$. I'll leave to you the job of checking that this definition doesn't depend on the basis, and of making sense of $\nabla Y$ as a $(1, 1)$ tensor.

Here are some more lazy notations needed to make sense of this: for a vector field $\eta = (\eta^1, \cdots, \eta^n)$ in local coordinates, we write $$\nabla_i \eta^k = \frac{\partial \eta^k}{\partial x^i} + \Gamma_{ij}^{k}\eta^j,$$ where as before, this means the $k$-th component of $\nabla_i \eta$ in local coordinates. We might even get lazy and write $\text{div}(X^i) = \nabla_i X^i$ or $\text{div}(\eta^j) = \nabla_{i} \eta^i$. Letting $\text{grad}(f)$ be the vector field defined by $g(\text{grad}f, \cdot) = \text{d}f(\cdot)$, sometimes we also write $f^i = \text{grad}(f)^i = \nabla^{i}f$. Defining $\Delta f = \text{div}(\text{grad}(f))$ and being incredibly lazy, we then get the notation $\Delta f = \nabla_i \nabla^i f$.