I think some of your confusion stems from the fact that there are two different kinds of vacua in QFT. First there is the vacuum of the free theory, usually denoted $|0\rangle $; second there is the full (interacting) vacuum, usually denoted $|\Omega \rangle$.
What we want to calculate are quantities in the full theory, such as:
\begin{equation}
\langle\Omega\vert T\phi(x_1)\phi(x_2)\vert\Omega\rangle.
\end{equation}
What you have written in equation (3) is actually the propagator in the free theory (note that you use $\mathcal{L}_0$ and not $\mathcal{L}$, as it should be). This may seem like a minor point, but the difference is essential to understand. (You must have copied it wrong from eqn (9.18) in P & S, since it looks correct there.)
The problem is that we cannot calculate the full two-point function directly. What we do is basically say: what if the full theory is almost like the free theory, but with a small interaction term? That is, we do perturbation theory. This works because we know the propagator of the free theory:
$$ \Delta_F(x_1-x_2) =\langle0\vert T\phi(x_1)\phi(x_2)\vert0\rangle = \frac{\int \mathcal{D}\phi\; \phi(x_1)\phi(x_2) e^{iS_0}}{\int \mathcal{D}\phi\; e^{iS_0}}.$$
The trick is then to rewrite everything in terms of the propagator of the free theory.
The actual derivation of the full 2-point function is fairly involved, but you can find it in P & S pages 82-99.
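To give an idea of what "rewriting in terms of the free theory" means, here is a schematic sketch in the path-integral language used above. It assumes the action splits as $S = S_0 + S_{\text{int}}$ (the symbol $S_{\text{int}}$ is my notation, not from your question), and it glosses over the $i\epsilon$ prescription that projects onto $|\Omega\rangle$:

$$ \langle\Omega\vert T\phi(x_1)\phi(x_2)\vert\Omega\rangle = \frac{\int \mathcal{D}\phi\; \phi(x_1)\phi(x_2)\, e^{iS_0} e^{iS_{\text{int}}}}{\int \mathcal{D}\phi\; e^{iS_0} e^{iS_{\text{int}}}} = \frac{\sum_{n=0}^{\infty} \frac{i^n}{n!} \langle0\vert T\phi(x_1)\phi(x_2)\,(S_{\text{int}})^n\vert0\rangle}{\sum_{n=0}^{\infty} \frac{i^n}{n!} \langle0\vert T (S_{\text{int}})^n\vert0\rangle}. $$

The second equality comes from expanding $e^{iS_{\text{int}}}$ and dividing numerator and denominator by $\int \mathcal{D}\phi\; e^{iS_0}$. Every term is now a free-theory expectation value, which Wick's theorem reduces to products of $\Delta_F$.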
I assume that we are working with a fixed order in perturbation theory, so that the set of all possible Feynman graphs is finite. I also assume that the theory has been regularized such that all graphs evaluate to finite amplitudes.
We call a bubble graph a Feynman diagram with no external legs.
We call a connected graph a Feynman diagram such that among its connected components there are no bubbles. Important: note that this definition, despite being used heavily in many QFT textbooks, is somewhat counterintuitive. Usually a connected graph is a graph with a single connected component. Not so here!
Consider an arbitrary Feynman graph $G$. We can naturally split it into the connected subgraph $G_0$ (in the sense defined above) and the bubble graph $G_b$, which in the general case is a union of bubbles. It can be obtained directly from the Feynman rules that
$$ \text{Amp}(G) = \text{Amp}(G_0) \cdot \text{Amp}(G_b). $$
Now consider a sum of all possible Feynman graphs with some fixed set of external legs. This sum is over the Cartesian product of connected graphs (in the sense defined above) and bubble graphs. Thus the sum factors out as
$$ \sum \text{graphs} = \sum \text{connected graphs} \cdot \sum \text{bubble graphs}. $$
This result is referred to as the "bubble factorization formula".
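The factorization step is just the statement that a sum of products over a Cartesian product equals the product of the sums. A minimal sketch in Python, where the amplitude values are made up purely for illustration and do not come from any actual theory:

```python
from fractions import Fraction
from itertools import product

# Hypothetical amplitudes, chosen only for illustration.
connected_amps = [Fraction(1), Fraction(-1, 2), Fraction(1, 3)]  # Amp(G_0) values
bubble_amps = [Fraction(1), Fraction(-1, 8), Fraction(1, 5)]     # Amp(G_b) values

# Each full graph G is a pair (G_0, G_b) with Amp(G) = Amp(G_0) * Amp(G_b).
sum_all_graphs = sum(a0 * ab for a0, ab in product(connected_amps, bubble_amps))

# The same sum, written as (sum of connected graphs) * (sum of bubble graphs).
factored = sum(connected_amps) * sum(bubble_amps)

print(sum_all_graphs == factored)
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.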
Now note that the denominator in your formula is just a special case of the numerator where there are no field operators before $U(t,-t) = T \exp \{-i \int_{-t}^t dt'\, V(t') \}$, or in our diagrammatic terminology, where there are no external legs. There is only one connected graph with no external legs, namely the empty graph $\emptyset$ with no vertices and no edges, for which (again, directly from the Feynman rules)
$$ \text{Amp} (\emptyset) = 1. $$
Thus we conclude that the denominator is just a sum of bubble graphs, and from the bubble factorization formula we conclude that the fraction is equal to the sum of connected graphs.
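One way to see the cancellation at work is a zero-dimensional toy model, where the "path integral" is an ordinary integral. The sketch below is my own illustration, not part of the question: it assumes a Euclidean weight $e^{-\phi^2/2 - \lambda\phi^4/4!}$ instead of $e^{iS}$, so every propagator equals 1 and each vertex contributes $-\lambda$. The helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def double_factorial(k):
    """k!! for odd k >= -1, with (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def series(extra_legs, order):
    """Coefficients in lambda of <phi^extra_legs * exp(-lambda*phi^4/4!)>_0
    for a unit-variance Gaussian, using <phi^(2m)>_0 = (2m-1)!!."""
    coeffs = []
    for n in range(order + 1):
        moment = double_factorial(4 * n + extra_legs - 1)
        coeffs.append(Fraction((-1) ** n * moment, factorial(n) * 24 ** n))
    return coeffs

def divide_series(num, den):
    """Power-series division num/den, truncated to len(num) terms."""
    out = []
    for k in range(len(num)):
        acc = num[k] - sum(out[j] * den[k - j] for j in range(k))
        out.append(acc / den[0])
    return out

ORDER = 2
numerator = series(2, ORDER)  # all graphs with two external legs
bubbles = series(0, ORDER)    # all graphs with no external legs
connected = divide_series(numerator, bubbles)
print(connected)
# → [Fraction(1, 1), Fraction(-1, 2), Fraction(2, 3)]
```

The ratio is $1 - \lambda/2 + 2\lambda^2/3 + \dots$, which is what you get by summing only the bubble-free graphs with their symmetry factors: $1/2$ for the tadpole at first order, and $1/4 + 1/4 + 1/6$ for the three second-order graphs.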
All this reasoning can be made precise if we introduce an upper order of perturbation theory and a regularization scheme, as was assumed at the start of this answer.
Maybe I am missing something but
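\begin{equation}
\sum_{n \neq 0} e^{-i E_n (T-i\epsilon)} |n \rangle \langle n|0\rangle =\sum_{n \neq 0} e^{-i E_n T} e^{-\epsilon E_n } |n \rangle \langle n|0\rangle ,
\end{equation}
that is, each term picks up a product of two exponentials, not a sum of two series. The real damping factor is what allows you to kill the terms with nonzero $n$ as $T$ goes to infinity (in the P & S prescription $T \to \infty(1-i\epsilon)$ the damping is $e^{-\epsilon E_n T}$, so it grows with $T$).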
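The suppression of the excited states can be checked numerically. Here is a minimal sketch, assuming a made-up three-level spectrum and made-up overlaps $\langle n|0\rangle$, and assuming the damping grows with $T$ as in the P & S prescription $T\to\infty(1-i\epsilon)$:

```python
import cmath

# Hypothetical spectrum and overlaps <n|0>, chosen only for illustration.
energies = [0.5, 1.5, 2.5]
overlaps = [0.3, 0.6, 0.74]

def ground_state_fraction(T, eps):
    """Fraction of the norm of exp(-i H T (1 - i eps))|0> carried by the lowest state."""
    amps = [c * cmath.exp(-1j * E * T * (1 - 1j * eps))
            for E, c in zip(energies, overlaps)]
    weights = [abs(a) ** 2 for a in amps]
    return weights[0] / sum(weights)

# The lowest state starts with a small share and takes over as T grows.
for T in (0, 10, 100, 1000):
    print(T, ground_state_fraction(T, eps=0.01))
```

The phases $e^{-iE_nT}$ drop out of the weights, so only the real factor $e^{-2\epsilon E_n T}$ matters, and the state with the lowest energy decays the slowest.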
As for the physical meaning of this trick, you can imagine it (at least, that is how I imagine it) as accounting for vanishing mixing instead of strictly zero mixing. In other words, you add some kind of temperature to your system, which you know tends to select out the ground state, and then you take the limit in which the temperature goes to zero.
Essentially what happens is that the initial state $|0\rangle$ is not an eigenstate of the Hamiltonian $H$. Because of that, the state evolution is best expressed by expanding it in the eigenstate basis $|n\rangle$. Now, you can imagine all these states being visited every now and then during the evolution (that's what the above equation states).
A system is always coupled in some way to a thermostat, even if its temperature is vanishing. Over a long period, the frequency with which the quantum oscillations will allow each state $|n\rangle$ to be visited will be roughly given by a Boltzmann weight. That is what I meant by the term "mixing": your initial state is "shared" among all the eigenstates, and eventually one can estimate the fraction of it that goes to each $|n\rangle$. In the case of a vanishing temperature, the ground state is greatly favored, which explains the result.
Rather than a trick, it is more of a reminder of what the true calculation would be if it were done with the density matrix instead of pure states.
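To make the analogy slightly more concrete, here is a sketch of the thermal version, assuming a non-degenerate ground state $|\Omega\rangle$ separated by a gap from the rest of the spectrum (the inverse temperature $\beta$ is my notation and plays, loosely, the role of $\epsilon T$):

$$ \rho_\beta = \frac{e^{-\beta H}}{\operatorname{Tr}\, e^{-\beta H}} = \frac{\sum_n e^{-\beta E_n}\, |n\rangle\langle n|}{\sum_n e^{-\beta E_n}} \;\xrightarrow{\;\beta\to\infty\;}\; |\Omega\rangle\langle\Omega| . $$

Each excited state is weighted by $e^{-\beta (E_n - E_0)}$ relative to the ground state, so only the ground state survives in the zero-temperature limit.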