General calculation in the Schrödinger picture
We can do a straightforward calculation of your setup in the Schrödinger picture. Let $x,y$ label points in space. Before $t=0$, the state is the vacuum $|0\rangle$. Just after $t=0$, the state is $e^{-i \epsilon_s \phi(x)} |0\rangle$, where $x$ is the spatial point at which you place the source at $t=0$. (Really you might want to consider turning on a source smeared over a small neighborhood of $x$, to avoid singularities in the following discussion, but I'll ignore that.) At a later time $t>0$, we have the state
$$|\psi(t)\rangle = e^{-iHt}e^{-i \epsilon_s \phi(x)} |0\rangle
=e^{-iHt}e^{-i \epsilon_s \phi(x)} e^{iHt} |0\rangle
= e^{-i \epsilon_s \phi(x,t)}|0\rangle$$
Here the second equality uses $e^{iHt}|0\rangle = |0\rangle$ (taking the vacuum to have zero energy), and $\phi(x,t)$ denotes the operator $e^{-iHt}\phi(x)e^{iHt}$, i.e. $\phi(x)$ evolved through a time interval $t$. Then at time $t$, we make a measurement at spatial point $y$. For a moment I'll ignore your desired detector model and speak generally. Say we make a measurement in a spatial region $Y$. The observables measurable by an observer local to $Y$ are precisely the observables generated by (sums and products of) operators $\phi(y), \pi(y)$ for any $y \in Y$. Choose an observable $A_Y$ of this form, e.g. $A_Y = \phi(y)$ for some $y \in Y$, or $A_Y = \int_{y \in Y}\phi(y)\, dy$. Assume the points $(Y,t)$ are spacelike separated from the point $(x,t=0)$. Then the expectation of $A_Y$ in $|\psi(t)\rangle$ is
$$\langle \psi(t) | A_Y |\psi(t)\rangle = \langle 0 |e^{i \epsilon_s \phi(x,t) } A_Y e^{-i \epsilon_s \phi(x,t)} | 0 \rangle = \langle 0 | A_Y |0 \rangle.$$
So the expectation of $A_Y$ is the same as if you hadn't turned on the source at $x$ at $t=0$. The second equality uses $[\phi(x,t),A_Y]=0$, by assumed spacelike separation.
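To spell out that last step, here is a sketch using the standard Baker-Campbell-Hausdorff expansion, with $B \equiv \epsilon_s \phi(x,t)$ as shorthand introduced only for this line:
$$e^{iB} A_Y e^{-iB} = A_Y + i[B,A_Y] + \frac{i^2}{2!}[B,[B,A_Y]] + \dots = A_Y.$$
Every term beyond the first contains the commutator $[\phi(x,t),A_Y]$, which vanishes, so the result holds to all orders in $\epsilon_s$ and not just perturbatively.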
Being careful with first-order expansion in $\epsilon_s$
Here's a possible point of confusion. Just after $t=0$, and to first order in $\epsilon_s$, we have
$$ |\psi \rangle = |0\rangle - i \epsilon_s \phi(x) |0\rangle + O(\epsilon_s^2).$$
The second term $\phi(x) |0\rangle$ dominates over the higher-order terms in $\epsilon_s$, and this may seem to suggest causality violation: an observable $A_Y$ at spacelike $Y$ still has nonzero expectation value in this state, i.e. $\langle 0 | \phi(x) A_Y \phi(x) |0\rangle \neq 0$. (Incidentally, if you choose $A_Y=\phi(y)$ the expectation value is zero by symmetry, but you could choose e.g. $A_Y=\pi(y)$.) However, this observation is entirely unproblematic; the expectation value of $A_Y$ with respect to just the first-order term is not directly related to any measurement. If we include both the zeroth-order and first-order terms in $|\psi\rangle$, then we find $\langle \psi | A_Y |\psi\rangle = \langle 0 | A_Y |0 \rangle + O(\epsilon_s^2)$, because the first-order contributions cancel.
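Written out explicitly (a short worked version of the cancellation, using only that $\phi(x)$ is Hermitian, so that $\langle\psi| = \langle 0| + i\epsilon_s \langle 0|\phi(x) + O(\epsilon_s^2)$):
$$\langle \psi | A_Y | \psi \rangle = \langle 0 | A_Y | 0 \rangle + i\epsilon_s \langle 0 | \phi(x) A_Y | 0 \rangle - i\epsilon_s \langle 0 | A_Y \phi(x) | 0 \rangle + O(\epsilon_s^2) = \langle 0 | A_Y | 0 \rangle + i\epsilon_s \langle 0 | [\phi(x), A_Y] | 0 \rangle + O(\epsilon_s^2).$$
The first-order piece is a commutator, which vanishes for $x$ outside $Y$. The quantity $\langle 0 | \phi(x) A_Y \phi(x) |0\rangle$ only enters at order $\epsilon_s^2$, where it combines with the cross terms between $|0\rangle$ and the second-order part of $|\psi\rangle$.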
Locality of your detector model
What about your detector model? There were a few problems. First, I wouldn't actually call your detector localized to the point $y$. The observables measurable at $y$ are just algebraic combinations of $\phi(y), \pi(y)$. Or again more generally, you could take a small spatial region $Y$ and consider observables $A_Y$ local to $Y$, given by algebraic combinations of $\phi(y), \pi(y)$ for $y \in Y$. If you want to imagine an external system $S$ like your atom coupled locally to $Y$, the coupling Hamiltonian for the detector should be like
$$H_{det} = \sum_i O^i_S A^i_Y$$
where $O^i_S$ are some operators on the coupled system $S$, and $A^i_Y$ are operators in the QFT local to $Y$.
Your desired Hamiltonian $H_{det}$ may look like it takes this form, but your operators $\phi_{\pm}(x)$, by which I assume you mean something like
$$\phi_{-}(x) \equiv \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{E_p}} a_p e^{ipx}$$
$$\phi_{+}(x) \equiv \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{E_p}} a_p^\dagger e^{-ipx},$$
are not strictly local to $y$. One way to see this is that $[\phi_{\pm}(y), \phi(x)] \neq 0$ for every $x$, not just for $x=y$. (Incidentally there's some discussion about this in Section 6 and Eq. 81 here.) The operators $\phi_+(y)$ and $\phi_-(y)$ may look like they are local to $y$, by the way they are written, but if you actually re-write $\phi_{\pm}(y)$ in terms of the genuinely local $\phi(x)$ and $\pi(x)$ operators, you will find the $\phi_{\pm}(y)$ are not local.
Moreover, regardless of your detector model, I think you're a bit too quick when you say "the amplitude for the detector to be in the excited state$\dots$." You should actually think about what a measurement of the detector subsystem would yield. The analysis will then go similarly to the general discussion at the beginning of the answer.
Finally, what if we insist on using your particular detector $H_{det}$, using $\phi_{\pm}(y)$ couplings? First we must admit it's not truly local to $y$, and that it's really only approximately "localized" to a region of radius $\approx \frac{1}{m}$ around $y$ (for a massive theory). We must further admit that it's not even strictly localized to that neighborhood, or any finite region: it's really only "local" to a neighborhood of radius $r$ around $y$ with error of order $e^{-mr}$, due to the nonzero commutators $[\phi_{\pm}(y), \phi(x)]$, or the expression of $\phi_{\pm}(y)$ in terms of genuinely local field operators $\phi(x), \pi(x)$. So you shouldn't be surprised if the detector has an $e^{-mr}$ probability of registering a causality-violating signal in this model.
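To make the size of that tail concrete, here is a rough numerical sketch. It assumes $\phi = \phi_+ + \phi_-$ with the normalization written above and $[a_p, a_q^\dagger] = (2\pi)^3 \delta^3(p-q)$, in which case the equal-time commutator is $[\phi_-(y),\phi(x)] = \int \frac{d^3p}{(2\pi)^3}\frac{1}{E_p}e^{ip(y-x)} = \frac{m}{2\pi^2 r}K_1(mr)$ with $r=|x-y|$. The function names `bessel_k1` and `commutator` are made up for this illustration, and the overall constant depends on conventions; only the falloff matters.

```python
import math

def bessel_k1(z, t_max=12.0, steps=6000):
    """Modified Bessel function K_1(z), z > 0, via the integral
    representation K_1(z) = int_0^inf exp(-z cosh t) cosh t dt,
    evaluated with the trapezoid rule on [0, t_max]."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-z * math.cosh(t)) * math.cosh(t)
    return total * h

def commutator(r, m=1.0):
    """[phi_-(y), phi(x)] at separation r = |x - y| (assumed normalization,
    see the lead-in): m / (2 pi^2 r) * K_1(m r)."""
    return m / (2.0 * math.pi ** 2 * r) * bessel_k1(m * r)

if __name__ == "__main__":
    m = 1.0
    for r in (1.0, 2.0, 5.0, 10.0):
        # Compare the commutator with the bare exponential exp(-m r).
        print(r, commutator(r, m), math.exp(-m * r))
```

The commutator is nonzero at every separation but falls off roughly like $e^{-mr}$ (times a power of $r$) once $r \gg 1/m$, which is the sense in which $\phi_\pm(y)$ is only approximately localized.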
Best Answer
I'm digging this thread out just to clarify some things for those who might have a similar question.
Summary
We cannot use $\mathcal T$. Space-like four-vectors are essentially like $(0,x,y,z)$, so we can ignore the time and do three-dimensional rotations to get $(0,-x,-y,-z)=-(0,x,y,z)$.
A la Valter Moretti
As Valter Moretti already pointed out, you cannot just apply $\mathcal P\mathcal T$ to get $(x-y)\to-(x-y)$, because $D(x-y)$ is not invariant under $\mathcal T$.
So the challenge is really to do $(x-y)\to-(x-y)$ using only proper orthochronous Lorentz transformations $SO(1,3)_+$ and $\mathcal P$. This is only possible for space-like four-vectors.
The point about space-like four-vectors is that there is a Lorentz frame where $t=0$ (boost along $\vec x$ with velocity $\beta=\frac{t}{|\vec x|}$, which satisfies $|\beta|<1$ precisely because the vector is space-like), and in such a frame the parity transformation $$\mathcal P:(0,x',y',z')\to(0,-x',-y',-z')=-(0,x',y',z')$$ looks just like an inversion. So what you can do for space-like four-vectors is $$ (t,x,y,z) \overset{\Lambda}{\to}(0,x',y',z') \overset{\mathcal P}{\to}-(0,x',y',z') \overset{\Lambda^{-1}}{\to}-(t,x,y,z) $$
The difference between this transformation and $\mathcal P\mathcal T$ is that the latter maps every four-vector to its negative, whereas the former negates only the three-dimensional subspace of Minkowski space that is purely spatial in the boosted frame (which contains the given vector).
A la Peskin and Schroeder
You can actually achieve the same without using $\mathcal P$, that is, only with $SO(1,3)_+$ transformations. This means we can continuously bring a fixed space-like vector $p$ to its negative $-p$. Just do the following steps:
\begin{align*}
(t,x,y,z) &\overset{R_1}{\to}\left(t,\sqrt{x^2+y^2},0,z\right)\\
&\overset{R_2}{\to}\left(t,\sqrt{x^2+y^2+z^2},0,0\right)\\
&\overset{B\left(\beta=\frac{t}{|\vec x|}\right)}{\to}\left(0,\sqrt{x^2+y^2+z^2-t^2},0,0\right)\\
&\overset{R_\pi}{\to}-\left(0,\sqrt{x^2+y^2+z^2-t^2},0,0\right)\\
&\overset{\left(BR_2R_1\right)^{-1}}{\to}-\left(t,x,y,z\right)
\end{align*}
Here $R_1$ and $R_2$ are spatial rotations, $B$ is a boost along the first spatial axis, and $R_\pi$ is a rotation by $\pi$ about an axis perpendicular to it. In view of this one should really say that space-like vectors are like $(0,x,0,0)$.
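The chain of steps above can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`rot_xy`, `rot_xz`, `boost_x`, `inversion_matrix`) and the sample vector are illustrative choices, and the metric signature $(+,-,-,-)$ is assumed for the matrix inverse.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def lorentz_inverse(m):
    # For a Lorentz matrix, the inverse is eta * m^T * eta, eta = diag(1,-1,-1,-1).
    eta = [1.0, -1.0, -1.0, -1.0]
    return [[eta[i] * m[j][i] * eta[j] for j in range(4)] for i in range(4)]

def rot_xy(angle):
    # Rotation in the x-y plane (about the z axis).
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def rot_xz(angle):
    # Rotation in the x-z plane (about the y axis).
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, 0, -s], [0, 0, 1, 0], [0, s, 0, c]]

def boost_x(beta):
    # Boost along x with velocity beta (c = 1).
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]

def inversion_matrix(v):
    """Build (B R2 R1)^-1 R_pi (B R2 R1) for a space-like v = (t, x, y, z)."""
    t, x, y, z = v
    rho = math.hypot(x, y)
    r = math.sqrt(x * x + y * y + z * z)
    assert r > abs(t), "vector must be space-like"
    r1 = rot_xy(-math.atan2(y, x))      # (x, y, z) -> (rho, 0, z)
    r2 = rot_xz(-math.atan2(z, rho))    # (rho, 0, z) -> (r, 0, 0)
    b = boost_x(t / r)                  # kills the time component
    l = matmul(b, matmul(r2, r1))
    r_pi = rot_xy(math.pi)              # (x, y) -> (-x, -y)
    return matmul(lorentz_inverse(l), matmul(r_pi, l))

if __name__ == "__main__":
    v = [1.0, 2.0, 3.0, 4.0]            # space-like: 4 + 9 + 16 > 1
    T = inversion_matrix(v)
    print(apply(T, v))                  # numerically close to -v
    print(T[0][0])                      # stays >= 1, i.e. orthochronous
```

Every factor is a rotation or a boost, so the product lies in $SO(1,3)_+$; no $\mathcal P$ or $\mathcal T$ is used anywhere.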
Conclusion
Space-like four-vectors should be thought of as $(0,x,0,0)$, and since there are three spatial dimensions, there is enough room to rotate this vector in any direction. This allows us to invert space-like vectors just by using proper orthochronous transformations $SO(1,3)_+$.
Time-like four-vectors are like $(t,0,0,0)$. There is only one time direction, and hence no rotations are possible. Hence the only way of getting $-t$ is to use time inversion $\mathcal T$.
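A one-line check of this, using the standard boost formula: for a time-like vector $(t,\vec x)$ with $|t|>|\vec x|$ and any boost velocity $|\vec\beta|<1$,
$$t' = \gamma\,(t - \vec\beta\cdot\vec x), \qquad |\vec\beta\cdot\vec x| \le |\vec\beta|\,|\vec x| < |t|,$$
so $t'$ always has the same sign as $t$. Rotations do not touch $t$ at all, and every element of $SO(1,3)_+$ is a product of rotations and boosts, so no such transformation can map a time-like vector to its negative.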
In short, because there is only one time dimension but more than one space dimension, we can invert space-like four-vectors by continuous Lorentz transformations, but not time-like ones.