EDIT: I'm leaving this up as background reading to @drake's answer. (The point of the following is that the path integral does indeed give the correct time ordering, so it is producing the correct $\theta$-function weighted, time-ordered sums, which must be accounted for when differentiating its output.)
The two formalisms are equivalent; if they don't give the same result, something is wrong in the calculation. To see this you have to understand a subtlety which is not usually well-explained in textbooks, namely that the path integral is not defined merely by taking the limit of a bunch of integrals of the form $\int_{\mbox{lattice fields}} e^{iS(\phi)} d\phi$.
The problem is that these finite-dimensional integrals are not absolutely convergent, because $|e^{iS(\phi)}| = 1$. To define even the lattice path integral in Minkowski signature, you have to specify some additional information, to say exactly what is meant by the integral.
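A one-dimensional toy version makes the point concrete. The sketch below is only an illustration under stated assumptions: it replaces the lattice integral by the single Fresnel-type integral $\int e^{ix^2} dx$, and it supplies the "additional information" by a damping factor $e^{-\epsilon x^2}$ that is removed at the end. The function names and the cutoff choices are hypothetical, not taken from any library.

```python
import cmath
import math


def regulated_fresnel(eps, step=1e-3):
    """Trapezoid estimate of the integral of exp(i x^2 - eps x^2) over the real line.

    Without the damping term (eps = 0) the integrand has modulus 1 everywhere,
    so the integral is not absolutely convergent.
    """
    # Assumed cutoff: beyond this point exp(-eps x^2) is below about 1e-20.
    half_width = math.sqrt(46.0 / eps)
    n = int(round(2 * half_width / step))
    total = 0j
    for k in range(n + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp((1j - eps) * x * x)
    return total * step


def closed_form(eps):
    """Gaussian formula sqrt(pi / (eps - i)), valid for eps > 0."""
    return cmath.sqrt(math.pi / (eps - 1j))


if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.01):
        print(eps, regulated_fresnel(eps), closed_form(eps))
    # Value selected by sending eps -> 0+ after integrating: sqrt(pi) * exp(i pi / 4)
    print(math.sqrt(math.pi) * cmath.exp(1j * math.pi / 4))
```

For every $\epsilon > 0$ the integral is an ordinary convergent Gaussian, and the $\epsilon \to 0^+$ limit of the closed form picks out one definite value. The undamped integrand alone does not determine that value through absolute convergence, which is the sense in which extra input is needed.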
In QFT, the additional information you want is that the path integral should be calculating the kernel of the time evolution operator $e^{-iH\delta t}$, which is an analytic function of $\delta t$. This fact is usually expressed by saying that the Minkowski signature path integral is the analytic continuation of a Euclidean signature path integral: The Euclidean $n$-point functions $E(y_1,...,y_n)$ defined by
$E(y_1,...,y_n) = \int \phi(y_1)...\phi(y_n) e^{-S_E(\phi)} d\phi$
are analytic functions of the Euclidean points $y_i \in \mathbb{R}^d$. This function $E$ can be continued to a function $A(z_1,...,z_n)$ of $n$ complex variables $z_i \in \mathbb{C}^d$. The analytic function $A$ does not extend to all of $\mathbb{C}^{dn}$; it has singularities and several different branches. Each branch corresponds to a different choice of time-ordering: one branch is the correct choice, another is the 'wrong sign' time-ordering, and the others have wrong signs on only some subsets of the points. If you restrict $A$ to the set $B$ of boundary points of the correct branch, you get the Minkowski-signature $n$-point functions $A|_B = M$, where $M(x_1,...,x_n) = \langle \hat{\phi}(x_1)...\hat{\phi}(x_n)\rangle_{op}$ and the $x_i$ are points in Minkowski space.
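The simplest case where this can be written out in full is $d = 1$, $n = 2$. As a sketch, assume a harmonic oscillator of unit mass and frequency $\omega$, with $\hbar = 1$ and infinite Euclidean time extent (the oscillator, the symbols $A_\pm$, and the operator $\hat{x}$ are choices made for this example):

```latex
% Euclidean two-point function, depending only on \tau = \tau_1 - \tau_2
E(\tau) = \frac{1}{2\omega}\, e^{-\omega|\tau|}

% |\tau| is not analytic, so continuing to complex z = \tau + i t gives two branches
A_+(z) = \frac{1}{2\omega}\, e^{-\omega z} \quad (\mathrm{Re}\, z > 0), \qquad
A_-(z) = \frac{1}{2\omega}\, e^{+\omega z} \quad (\mathrm{Re}\, z < 0)

% Boundary values on the imaginary axis z = i t are the two operator orderings
A_+(it) = \frac{1}{2\omega}\, e^{-i\omega t} = \langle 0|\hat{x}(t)\,\hat{x}(0)|0\rangle, \qquad
A_-(it) = \frac{1}{2\omega}\, e^{+i\omega t} = \langle 0|\hat{x}(0)\,\hat{x}(t)|0\rangle

% The time-ordered function uses A_+ for t > 0 and A_- for t < 0
M(t) = \theta(t)\, A_+(it) + \theta(-t)\, A_-(it) = \frac{1}{2\omega}\, e^{-i\omega|t|}
```

In this toy case each branch is individually well behaved, and the only remnant of the branch structure is the choice between $A_+$ and $A_-$ according to the sign of $t$. Making the opposite choice on both sides gives the 'wrong sign' ordering $\frac{1}{2\omega} e^{+i\omega|t|}$.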
In perturbation theory, most of this detail is hidden, and the only thing you need to remember is that the $+i\epsilon$ prescription selects out the correct time-ordering.
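For a single degree of freedom this can be checked by hand. The following is a worked sketch, assuming a harmonic oscillator of unit mass and frequency $\omega$ with $\hbar = 1$, and writing $M(t)$ for its time-ordered two-point function (the integration variable $E$ is a label chosen for this example):

```latex
% +i\epsilon moves the poles to E = \pm(\omega - i\epsilon'), so the contour closes
% in the lower half-plane for t > 0 and in the upper half-plane for t < 0
M(t) = \lim_{\epsilon \to 0^+} \int_{-\infty}^{\infty} \frac{dE}{2\pi}\,
       \frac{i\, e^{-iEt}}{E^2 - \omega^2 + i\epsilon}
     = \frac{1}{2\omega}\left[\theta(t)\, e^{-i\omega t} + \theta(-t)\, e^{+i\omega t}\right]
     = \frac{1}{2\omega}\, e^{-i\omega|t|}

% The opposite prescription picks the other pair of residues
\lim_{\epsilon \to 0^+} \int_{-\infty}^{\infty} \frac{dE}{2\pi}\,
       \frac{i\, e^{-iEt}}{E^2 - \omega^2 - i\epsilon}
     = -\frac{1}{2\omega}\, e^{+i\omega|t|}

% Differentiating the \theta-functions in M(t) produces a contact term
\left(\partial_t^2 + \omega^2\right) M(t) = -i\,\delta(t)
```

The last line is where the $\theta$-function weighting shows up in practice: $M(t)$ solves the classical equation of motion only for $t \neq 0$, and the jump in $\partial_t M$ at $t = 0$ supplies the delta function.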
Best Answer
Up to a universal normalization factor, $\exp(iS_{\rm history}/\hbar)$ is the probability amplitude for the physical system to evolve along the particular history. All the complex numbers in quantum mechanics are "probability amplitudes of a sort".
This is particularly clear if we consider the sum over histories in a short time interval $(t,t+dt)$. In that case, every intermediate history may be approximated by a linear function of time (for example, a uniform motion $x(t)$), so only the straight line contributes, and its amplitude is nothing other than a matrix element of the evolution operator.
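Written out for the textbook case, as a sketch assuming a nonrelativistic particle of mass $m$ with Hamiltonian $\hat{H} = \hat{p}^2/2m + V(\hat{x})$ (this choice of system is an assumption made for the example):

```latex
% Short-time kernel: the exponent is (i/\hbar) times the action of the
% straight-line history from x at time t to x' at time t + dt
\langle x' |\, e^{-i\hat{H}\,dt/\hbar} \,| x \rangle
  \approx \sqrt{\frac{m}{2\pi i \hbar\, dt}}\;
  \exp\!\left[\frac{i}{\hbar}\left(\frac{m\,(x'-x)^2}{2\,dt} - V(x)\,dt\right)\right]
```

The square-root prefactor does not depend on $x$ or $x'$, so it plays the role of the universal normalization factor, and everything that distinguishes one short history from another sits in the phase $e^{iS/\hbar}$.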
It may be puzzling that all the histories have probability amplitudes with the same absolute value. But it's true, in the sense of Feynman's path integral, and it's how Nature operates. In the end, some histories (e.g. coarse-grained histories in the sense of the consistent histories interpretation of quantum mechanics) are more likely, or much more likely, than others. But within quantum mechanics, all these differences between the likelihoods of different coarse-grained histories are explained by constructive or destructive interference of the amplitudes (and/or by different "sizes of the ensemble of histories")! That is simply quantum mechanics' universal explanation for why some histories are more likely than others. Histories that are not coarse-grained are equally likely!
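The interference mechanism can be seen in a small numerical toy. The sketch below is an illustration only, under several assumptions: a free particle going from $x=0$ at $t=0$ to $x=1$ at $t=1$, a hypothetical one-parameter family of histories $x(t) = t + a\sin(\pi t)$, and arbitrarily chosen values of $\hbar$ and the mass. A "coarse-grained history" is modelled as a bundle of nearby values of $a$.

```python
import cmath
import math

HBAR = 0.01  # assumed value, small enough to make the interference visible
MASS = 1.0   # assumed particle mass


def action(a):
    """Free-particle action of the hypothetical history x(t) = t + a*sin(pi*t), 0 <= t <= 1.

    S = (m/2) * integral of (1 + a*pi*cos(pi*t))^2 dt = (m/2) * (1 + (a*pi)^2 / 2).
    """
    return 0.5 * MASS * (1.0 + 0.5 * (a * math.pi) ** 2)


def amplitude(a):
    """Amplitude exp(i S / hbar) of a single history; its modulus is always 1."""
    return cmath.exp(1j * action(a) / HBAR)


def bundle_weight(a_center, half_width=0.05, samples=1001):
    """Modulus of the mean amplitude over a bundle of histories around a_center."""
    total = 0j
    for k in range(samples):
        a = a_center - half_width + 2.0 * half_width * k / (samples - 1)
        total += amplitude(a)
    return abs(total) / samples


if __name__ == "__main__":
    print("single history, |amplitude|:", abs(amplitude(0.7)))
    print("bundle around the classical path (a = 0):", bundle_weight(0.0))
    print("bundle far from the classical path (a = 1):", bundle_weight(1.0))
```

Both bundles contain the same number of histories and every history contributes with modulus 1. The bundle around the classical straight line adds up almost coherently, while the phases in the other bundle largely cancel, so the difference in likelihood comes entirely from interference.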