First, observe that $S = 1$ if and only if $X_1 = 1$, $X_2 = 1$, and $X_3 = 1$. Given that we have drawn $n$ samples and $T$ of them are equal to one, the probability that $S = 1$ is:
$$P(S=1) = \max\left(0, {T \over n}{T-1\over n-1}{T-2\over n-2}\right)$$
where the $(T-1)/(n-1)$ and $(T-2)/(n-2)$ terms arise because we are sampling without replacement from our $n$ observations; for example, once we know that $X_1 = 1$, there are only $n-1$ values left that $X_2$ might take on, and only $T-1$ of them are equal to one. If $T < 3$, the probability that $S=1$ is of course 0, and the formula delivers this automatically: for any integer $T < 3$ one of the three factors in the product is zero, so the $\max$ is merely a safeguard.
Consequently, since $\mathbb{E}_{\theta}(X_1X_2X_3\mid T) = 1\cdot P(S=1) + 0\cdot P(S=0) = P(S=1)$, we have
$$\mathbb{E}_{\theta}(X_1X_2X_3\mid T) = \max\left(0, {T \over n}{T-1\over n-1}{T-2\over n-2}\right).$$
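Incidentally, we can verify unbiasedness exactly, without any simulation, by averaging this estimator over the $\text{Binomial}(n, \theta)$ distribution of $T$. A minimal sketch, using the same illustrative values $n = 10$ and $\theta = 0.6$ as the experiment below:

n <- 10
theta <- 0.6
t_vals <- 0:n
rb <- pmax(0, t_vals * (t_vals - 1) * (t_vals - 2) / (n * (n - 1) * (n - 2)))
# Equals theta^3 = 0.216 exactly, since E[T(T-1)(T-2)] = n(n-1)(n-2) * theta^3
sum(dbinom(t_vals, n, theta) * rb)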
Let's try an experiment to see how much Rao-Blackwell has improved our estimator. We're going to use a smallish sample size, $n=10$, and set $\theta = 0.6$. The code below is horribly inefficient but, for people who may not be so familiar with R, should be pretty easy to follow:
n <- 10
theta <- 0.6
dt <- data.frame(Unbiased = rep(0, 100000),
                 Rao_Blackwell = rep(0, 100000))
for (i in 1:nrow(dt)) {
  x_sample <- rbinom(n, 1, theta)                            # n Bernoulli(theta) draws
  dt$Unbiased[i] <- x_sample[1] * x_sample[2] * x_sample[3]  # initial unbiased estimator
  nT <- sum(x_sample)                                        # sufficient statistic T
  dt$Rao_Blackwell[i] <- max(0, nT * (nT - 1) * (nT - 2) / (n * (n - 1) * (n - 2)))
}
Now for the output:
> dt <- dt - theta^3
>
> # Check the bias; it should be tiny, due only to sampling
> colMeans(dt)
Unbiased Rao_Blackwell
0.0011700000 0.0007798333
>
> # Compare the variances
> apply(dt, 2, var)
Unbiased Rao_Blackwell
0.17000889 0.03240376
>
Rao-Blackwellization has reduced the variance of the estimator by more than a factor of five ($0.170/0.0324 \approx 5.25$), far more than the factor of 3.3 that is the ratio of the complete sample size (10) to the number of observations used to construct the initial unbiased estimator (3).
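In fact the variance ratio can be computed exactly rather than by simulation; here is a short sketch using the same $n$ and $\theta$ as above:

n <- 10; theta <- 0.6
t_vals <- 0:n
p_t <- dbinom(t_vals, n, theta)   # exact distribution of T
rb <- pmax(0, t_vals * (t_vals - 1) * (t_vals - 2) / (n * (n - 1) * (n - 2)))
var_rb <- sum(p_t * rb^2) - theta^6   # Var of the Rao-Blackwellized estimator
var_ub <- theta^3 * (1 - theta^3)     # Var of X1*X2*X3, a Bernoulli(theta^3) variable
var_ub / var_rb                       # about 5.25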
The Poisson distribution is a one-parameter exponential family distribution, with natural sufficient statistic given by the sample total $T(\mathbf{x}) = \sum_{i=1}^n x_i$. The canonical form is:
$$p(\mathbf{x}|\theta) = \exp \Big( \ln (\theta) T(\mathbf{x}) - n\theta \Big) \cdot h(\mathbf{x}) \quad \quad \quad h(\mathbf{x}) = \prod_{i=1}^n \frac{1}{x_i!} $$
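As a quick numerical check of this factorization (a small R sketch with arbitrary illustrative values, not taken from the question):

theta <- 2; x <- c(1, 0, 3)
prod(dpois(x, theta))                                                  # joint pmf directly
exp(log(theta) * sum(x) - length(x) * theta) * prod(1 / factorial(x))  # canonical form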
From this form it is easy to establish that $T$ is a complete sufficient statistic for the parameter $\theta$. The Lehmann–Scheffé theorem then tells us that for any $g(\theta)$ there is at most one unbiased estimator of this quantity that is a function of $T$, and that this estimator is the UMVUE of $g(\theta)$. One way to find this estimator (the method you are using) is via the Rao-Blackwell theorem --- start with an arbitrary unbiased estimator of $g(\theta)$ and then condition on the complete sufficient statistic to get the unique unbiased estimator that is a function of $T$.
Using Rao-Blackwell to find the UMVUE: In your case you want to find the UMVUE of:
$$g(\theta) \equiv \theta \exp (-\theta).$$
Using the initial estimator $\hat{g}_*(\mathbf{X}) \equiv \mathbb{I}(X_1=1)$ you can confirm that,
$$\mathbb{E}(\hat{g}_*(\mathbf{X})) = \mathbb{E}(\mathbb{I}(X_1=1)) = \mathbb{P}(X_1=1) = \theta \exp(-\theta) = g(\theta),$$
so this is indeed an unbiased estimator. Hence, the unique UMVUE obtained from the Rao-Blackwell technique is:
$$\begin{aligned}
\hat{g}(\mathbf{X})
&\equiv \mathbb{E}(\mathbb{I}(X_1=1) | T(\mathbf{X}) = t) \\[6pt]
&= \mathbb{P}(X_1=1 | T(\mathbf{X}) = t) \\[6pt]
&= \mathbb{P} \Big( X_1=1 \Big| \sum_{i=1}^n X_i = t \Big) \\[6pt]
&= \frac{\mathbb{P} \Big( X_1=1, \ \sum_{i=2}^n X_i = t-1 \Big)}{\mathbb{P} \Big( \sum_{i=1}^n X_i = t \Big)} \\[6pt]
&= \frac{\mathbb{P} \Big( X_1=1 \Big) \mathbb{P} \Big( \sum_{i=2}^n X_i = t-1 \Big)}{\mathbb{P} \Big( \sum_{i=1}^n X_i = t \Big)} \\[6pt]
&= \frac{\text{Pois}(1| \theta) \cdot \text{Pois}(t-1| (n-1)\theta)}{\text{Pois}(t| n\theta)} \\[6pt]
&= \frac{t!}{(t-1)!} \cdot \frac{ \theta \exp(-\theta) \cdot ((n-1) \theta)^{t-1} \exp(-(n-1)\theta)}{(n \theta)^t \exp(-n\theta)} \\[6pt]
&= t \cdot \frac{ (n-1)^{t-1}}{n^t} \\[6pt]
&= \frac{t}{n} \Big( 1- \frac{1}{n} \Big)^{t-1},
\end{aligned}$$
where we have used the independence of the $X_i$ to factor the joint probability, and the fact that a sum of independent Poisson random variables is again Poisson.
Your answer has a slight error where you have conflated the sample mean and the sample total, but most of your working is correct. As $n \rightarrow \infty$ we have $(1-\tfrac{1}{n})^n \rightarrow \exp(-1)$ and $t/n \rightarrow \theta$, so taking these asymptotic results together we can also confirm consistency of the estimator:
$$\hat{g}(\mathbf{X}) = \frac{t}{n} \Big[ \Big( 1- \frac{1}{n} \Big)^n \Big] ^{\frac{t}{n} - \frac{1}{n}} \rightarrow \theta [ \exp (-1) ]^\theta = \theta \exp (-\theta) = g(\theta).$$
This latter demonstration is heuristic, but it gives a nice check on the working. It is interesting here that you get an estimator that is a finite approximation to the exponential function of interest.
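As a further sanity check, here is a small simulation sketch (the values $n = 10$ and $\theta = 1.5$ are arbitrary illustrations, not taken from the question):

n <- 10; theta <- 1.5
g_hat <- replicate(100000, {
  t_total <- sum(rpois(n, theta))            # sufficient statistic T
  (t_total / n) * (1 - 1/n)^(t_total - 1)    # the UMVUE derived above
})
mean(g_hat)             # should be very close to ...
theta * exp(-theta)     # ... g(theta), about 0.3347 here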
Best Answer
You could proceed from $(**)$ as follows: for $x \leq x_0$,
\begin{align*}
&\mathbb P(X_1 - X_{(1)} > x_0 - x)\\
&= 1-\mathbb P(X_1 - X_{(1)} \leq x_0 - x)\\
&= 1-\mathbb P(X_{(1)} \geq X_1 - x_0 + x)\\
&= 1-\mathbb P(X_i \geq X_1 - x_0 + x\text{ for all $i=1,\dots,n$})\\
&= 1-\mathbb P(X_i \geq X_1 - x_0 + x\text{ for all $i=2,\dots,n$})\\
&= 1-\int_{\theta}^\infty \mathbb P(X_i \geq t - x_0 + x\text{ for all $i=2,\dots,n$})f_{X_1}(t)\ dt\\
&= 1- \int_\theta^\infty f_{X_1}(t)\prod_{i=2}^n \mathbb P(X_i \geq t-x_0+x)\ dt \\
&= \dots
\end{align*}

However, there appears to be a problem which needs to be addressed first. When referring to a "complete sufficient statistic for $\theta$", we need to carefully consider what space of parameters $\Theta$ is being implicitly referenced. For the space $\Theta = [0,\infty)$, it is true that $X_{(1)}$ is a complete sufficient statistic, and in this case the identification of $\frac{n-1}ne^{X_{(1)}-x_0}$ as the UMVUE for $e^{-x_0+\theta}$ is correct (for $n>1$). However, this does not provide a UMVUE for $f_X(x_0)$ on this space $\Theta$, as the equation $f_X(x_0) = e^{-x_0+\theta}$ does not hold when $\theta>x_0$. The statistic $T(\mathbf X)$ also fails to be unbiased when $\theta>x_0$.
You may instead be intending to consider the space $\Theta = [0, x_0]$, but in this case $X_{(1)}$ is no longer complete, so you will need a different complete sufficient statistic ($\min\{X_{(1)}, x_0\}$ should do the trick). After this is sorted out, you should be able to proceed in a similar way as above.
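To illustrate the first point numerically, here is a simulation sketch confirming that $\frac{n-1}{n}e^{X_{(1)}-x_0}$ is unbiased for $e^{-x_0+\theta}$; the values $n = 5$, $\theta = 0.3$, $x_0 = 1$ are arbitrary illustrations, chosen with $\theta \leq x_0$ so that $f_X(x_0) = e^{-x_0+\theta}$:

n <- 5; theta <- 0.3; x0 <- 1
est <- replicate(100000, {
  x <- theta + rexp(n)               # shifted exponential sample, rate 1
  (n - 1) / n * exp(min(x) - x0)     # estimator based on X_(1)
})
mean(est)           # should be close to ...
exp(theta - x0)     # ... exp(-x0 + theta), about 0.4966 here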