EDIT: I'm leaving this up as background reading to @drake's answer. (The point of the following is that the path integral does indeed give the correct time ordering, so it is producing the correct $\theta$-function weighted, time-ordered sums, which must be accounted for when differentiating its output.)
The two formalisms are equivalent; if they don't give the same result, something is wrong in the calculation. To see this you have to understand a subtlety which is not usually well-explained in textbooks, namely that the path integral is not defined merely by taking the limit of a bunch of integrals of the form $\int_{\mbox{lattice fields}} e^{iS(\phi)} d\phi$.
The problem is that these finite-dimensional integrals are not absolutely convergent, because $|e^{iS(\phi)}| = 1$. To define even the lattice path integral in Minkowski signature, you have to specify some additional information, to say exactly what is meant by the integral.
In QFT, the additional information you want is that the path integral should be calculating the kernel of the time evolution operator $e^{iH\delta t}$, which is an analytic function of $\delta t$. This fact is usually expressed by saying that the Minkowski signature path integral is the analytic continuation of a Euclidean signature path integral: The Euclidean $n$-point functions $E(y_1,...,y_n)$ defined by
$E(y_1,...,y_n) = \int \phi(y_1)...\phi(y_n) e^{-S_E(\phi)} d\phi$
are analytic functions of the Euclidean points $y_i \in \mathbb{R}^d$. This function $E$ can be continued to a function $A(z_1,...,z_n)$ of $n$ complex variables $z_i \in \mathbb{C}^d$. This analytic function $A$ does not extend to the entire plane; it has singularities, and several different branches. Each branch corresponds to a different choice of time-ordering. One branch is the correct choice, another choice is the 'wrong sign' time-ordering. Other choices have wrong signs on only some subsets of the points. If you restrict $A$ to the set $B$ of boundary points of the correct branch, you'll get the Minkowski-signature $n$-point functions $A|_B = M$, where $M(x_1,...,x_n) = \langle \hat{\phi}(x_1)...\hat{\phi}(x_n)\rangle_{op}$ and the $x_i$ are points in Minkowski space.
In perturbation theory, most of this detail is hidden, and the only thing you need to remember is that the $+i\epsilon$ prescription selects out the correct time-ordering.
For SU(N), the adjoint representation can be obtained from a tensor product of the fundamental representation with its dual and projecting out the scalar. Thus we can replace the adjoint representation indices
$a, ...$ by double indices $i \bar{j}$ and write this relation as:
$$W^{i \bar{j}}_{k \bar{l}} = U^i_k U^{\dagger j}_l - \frac{1}{N} \delta^i_k \delta^j_l$$
(The factor $\frac{1}{N}$ can be obtained by taking the traces in the case $U=\mathbb{1}$).
On the other hand, the generators of the Lie algebra are the traceless
Hermitian matrices, which have the following form in the double index
notation:
$$ T^{i\bar{j}} = \frac{1}{\sqrt{2}}(E^{ij} - \frac{\mathbf{1}}{N} \delta^{ij})$$
The prefactor consists of a standard normalization and $\mathbb{1}$ is the unit matrix in the fundamental representation.
Substituting the second equation into the required identity gives you the first equation.
Best Answer
The time-evolution operator in the interaction picture can be written as: $$U(t,t_{0})=e^{iH_{0}t}e^{-iH(t-t_{0})}e^{-iH_{0}t_{0}}$$ Using this we can write: $$U(t_{1},t_{2})U(t_{2},t_{0})=e^{iH_{0}t_{1}}e^{-iH(t_{1}-t_{2})}e^{-iH_{0}t_{2}}e^{iH_{0}t_{2}}e^{-iH(t_{2}-t_{0})}e^{-iH_{0}t_{0}}=e^{iH_{0}t_{1}}e^{-iH(t_{1}-t_{0})}e^{-iH_{0}t_{0}}$$ So: $$U(t_{1},t_{2})U(t_{2},t_{0})=U(t_{1},t_{0})$$
From Tomonaga-Schwinger equation: $$i\partial_{t}U(t,t_{0})=H_{I}(t)U(t,t_{0})$$ We can write the time-evolution operator using the initial condition $U(t_{0},t_{0})=1$: $$U(t,t_{0})=1-i\int_{t_{0}}^{t}dt_{1}H_{I}(t_{1})U(t_{1},t_{0})$$ By iteration, we obtain that: $$U(t,t_{0})=1+(-i)\int_{t_{0}}^{t}dt_{1}H_{I}(t_{1})+(-i)^2\int_{t_{0}}^{t}dt_{1}\int_{t_{0}}^{t_{1}}dt_{2}H_{I}(t_{1})H_{I}(t_{2})+(-i)^3\int_{t_{0}}^{t}dt_{1}\int_{t_{0}}^{t_{1}}dt_{2}\int_{t_{0}}^{t_{2}}dt_{3}H_{I}(t_{1})H_{I}(t_{2})H_{I}(t_{3})+\dots$$ i.e., $$U(t,t_{0})=\sum_{i=0}^{\infty}(-i)^n\int_{t_{0}}^{t}dt_{1}\int_{t_{0}}^{t_{1}}dt_{2}\dots \int_{t_{0}}^{t_{n-1}}dt_{n}H_{I}(t_{1})H_{I}(t_{2})\dots H_{I}(t_{n})$$
$$U(t,t_{0})=\sum_{i=0}^{\infty}\frac{(-i)^n}{n!}\int_{t_{0}}^{t}dt_{1}\int_{t_{0}}^{t_{1}}dt_{2}\dots \int_{t_{0}}^{t_{n-1}}dt_{n}\mathcal{T}\left(H_{I}(t_{1})H_{I}(t_{2})\dots H_{I}(t_{n})\right)$$ $$U(t,t_{0})=\mathcal{T}\exp\left(-i\int_{t_{0}}^{t}dt'H_{I}(t')\right)$$