Field theories are nonlinear because the quantum fields satisfy nonlinear dynamical equations.
But renormalization does not make quantum fields into a nonlinear functional of test functions. The Wightman distributions are, by definition, linear functionals of the test functions, and Wightman distributions always encode renormalized fields.)
Instead it changes the space of test functions to one where the interacting quantum fields are perturbatively well-defined. This gives a family of representations of the field algebra depending on an energy scale. All these representations are equivalent, due to the renormalization group, and the corresponding Wightman functions are independent of the renormalization energy. (In simpler, exactly solvable toy examples that need infinite renormalization, this can actually be checked.)
The dependence on the energy scale would not be present if contributions to all ordered were summed up (though nobody has the slightest idea how to do this nonperturbative step). The energy scale is simply a redundant parameter the influences the approximations calculated by perturbation theory.
The renormalization group is an exact but unobservable symmetry (just like gauge symmetry) that removes this extra freedom, but as computations in a fixed gauge may spoil gauge-independence numerically, so computations at a fixed energy scale spoil renormalization group invariance numerically.
Note that Wightman functions are in principle observable. Indeed, the Kadanoff-Baym equations, the equations modeling high energy heavy ion collision experiments. are dynamical equations for the 2-particle Wightman functions and their ordered analoga.
[added 22.01.2018] In the above, the renormalization group refers to the group defined by Stúckelberg an Bogoliubov, not to that by Kadanoff and Wilson, which is only a semigroup. See here.
EDIT: I'm leaving this up as background reading to @drake's answer. (The point of the following is that the path integral does indeed give the correct time ordering, so it is producing the correct $\theta$-function weighted, time-ordered sums, which must be accounted for when differentiating its output.)
The two formalisms are equivalent; if they don't give the same result, something is wrong in the calculation. To see this you have to understand a subtlety which is not usually well-explained in textbooks, namely that the path integral is not defined merely by taking the limit of a bunch of integrals of the form $\int_{\mbox{lattice fields}} e^{iS(\phi)} d\phi$.
The problem is that these finite-dimensional integrals are not absolutely convergent, because $|e^{iS(\phi)}| = 1$. To define even the lattice path integral in Minkowski signature, you have to specify some additional information, to say exactly what is meant by the integral.
In QFT, the additional information you want is that the path integral should be calculating the kernel of the time evolution operator $e^{iH\delta t}$, which is an analytic function of $\delta t$. This fact is usually expressed by saying that the Minkowski signature path integral is the analytic continuation of a Euclidean signature path integral: The Euclidean $n$-point functions $E(y_1,...,y_n)$ defined by
$E(y_1,...,y_n) = \int \phi(y_1)...\phi(y_n) e^{-S_E(\phi)} d\phi$
are analytic functions of the Euclidean points $y_i \in \mathbb{R}^d$. This function $E$ can be continued to a function $A(z_1,...,z_n)$ of $n$ complex variables $z_i \in \mathbb{C}^d$. This analytic function $A$ does not extend to the entire plane; it has singularities, and several different branches. Each branch corresponds to a different choice of time-ordering. One branch is the correct choice, another choice is the 'wrong sign' time-ordering. Other choices have wrong signs on only some subsets of the points. If you restrict $A$ to the set $B$ of boundary points of the correct branch, you'll get the Minkowski-signature $n$-point functions $A|_B = M$, where $M(x_1,...,x_n) = \langle \hat{\phi}(x_1)...\hat{\phi}(x_n)\rangle_{op}$ and the $x_i$ are points in Minkowski space.
In perturbation theory, most of this detail is hidden, and the only thing you need to remember is that the $+i\epsilon$ prescription selects out the correct time-ordering.
Best Answer
@ArnoldNeumaier and @dushya have both pointed out correct solutions, but I want to elaborate a bit.
The easy approach is the one dushya suggested. (You can also do what Arnold Neumaier suggests: First define the time-ordered product of operator-valued distributions, and then take expectation values.)
Begin by recalling how Wightman functions are defined. The correlation function of $n$ smeared fields is a multilinear functional $(f_1,...,f_n) \mapsto \langle vac | \hat{\phi}(f_1) ... \hat{\phi}(f_n)|vac\rangle$. You can use the Schwarz nuclear theorem to prove this VEV has a kernel $W$ function such that
$\langle vac |\hat{\phi}(f_1) ... \hat{\phi}(f_n)|vac\rangle = \int f(x_1)...f(x_n) W(x_1,...,x_n) dx_1...dx_n$
This $W$ is a Wightman function. Morally, $W$ is $\langle vac | \hat{\phi}(x_1)...\hat{\phi}(x_n)|vac\rangle$.
Now, if you want to define a time-ordered correlation function, all you have to do is permute the arguments in the Wightman function.
There's a somewhat more conceptual approach available, via analytic continuation from Minkowski signature to Euclidean and then back.
You can prove that the Wightman functions $W(x_1,...x_n)$ are boundary values of an analytic function defined on (a domain in) the product of $n$-copies of the complexification of Minkowski space. If you restrict this analytic function to the Euclidean subspace, the function you get -- known as a Schwinger function -- is permutation-invariant. (It has to be, because it's the output of a Euclidean functional integral constructed with only commuting variables.)
So where did the ordering of operators go? It's in the analytic continuation. The Schwinger functions are analytic, but not entire. The analytic continuation of a Schwinger function has one branch for each of the possible orderings of the arguments. So you can get the time-ordered correlation functions by continuing the Wightman functions from Minkowski space to Euclidean space and then looping back along a different branch.
The basic picture above -- Wightman functions and their analyicity -- is explained beautifully in Streater & Wightman's book PCT, Spin, Statistics, and All That. It's also in the 2nd of Kazhdan's IAS lectures. (Although the heavy lifting needed to show that the time-ordered Wightman functions are well-defined seems to be in Epstein & Eckmann's Time-ordered products and Schwinger functions.) It's also used regularly in particle physics texts, when computing propagators in momentum space. (Have a look at the contour integrals in Peskin & Schroeder's discussion of the Klein-Gordon propagator.)