[Math] Showing $E(S^2\mid \bar X)=\bar X$ for i.i.d. Poisson random variables $X_i$

conditional-expectation, poisson-distribution, probability, probability-distributions, statistics

Let $X_1,X_2,\ldots,X_n$ be i.i.d. $\text{Poisson}(\lambda)$ random variables, where $\lambda>0$ is unknown. Define $$\bar X=\frac{1}{n}\sum_{i=1}^n X_i\qquad\text{and}\qquad S^2=\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar X)^2$$ as the sample mean and sample variance respectively.

Since $\sum_{i=1}^n X_i$, and hence $\bar X$, is a complete sufficient statistic for $\lambda$ with $E(\bar X)=\lambda$, $\bar X$ is the uniformly minimum variance unbiased estimator (UMVUE) of $\lambda$ by the Lehmann–Scheffé theorem.
Again, $E(S^2)=\lambda$, so by the tower property $E\left[E(S^2\mid \bar X)\right]=\lambda$; being an unbiased estimator that is a function of the complete sufficient statistic $\bar X$, $E(S^2\mid \bar X)$ is also the UMVUE of $\lambda$. As the UMVUE is unique whenever it exists, it must be that $$E(S^2\mid \bar X)=\bar X$$

The question is:

How can I directly show that $E(S^2\mid \bar X)=\bar X$ ?

I don't see how to proceed from

\begin{align}
E(S^2\mid \bar X=t)&=E\left[\frac{1}{n-1}\sum_{i=1}^n(X_i-t)^2\mid \bar X=t\right]
\\&=E\left[\frac{1}{n-1}\left(\sum_{i=1}^nX_i^2-nt^2\right)\mid \bar X=t\right]
\\&=E\left[\frac{1}{n-1}\sum_{i=1}^nX_i^2\mid \bar X=t\right]-E\left[\frac{nt^2}{n-1}\mid \bar X=t\right]
\\&=\frac{1}{n-1}\sum_{i=1}^nE\left(X_i^2\mid \bar X=t\right)-\frac{n}{n-1}E(\bar X^2\mid \bar X=t)\tag{1}
\end{align}

Any hint would be great.


As correctly pointed out by Mike Earnest, the conditional distribution of $(X_1,X_2,\ldots,X_n)$ given $\bar X$ is multinomial. That is, for a nonnegative integer $k$,

$$P\left(X_1=x_1,\ldots,X_n=x_n\mid \bar X=\frac{k}{n}\right)=\frac{k!}{x_1!\,x_2!\cdots x_n!}\left(\frac{1}{n}\right)^{x_1}\left(\frac{1}{n}\right)^{x_2}\cdots\left(\frac{1}{n}\right)^{x_n}\mathbf1_{(x_1,\ldots,x_n)\in A},$$

where $$A=\left\{(x_1,\ldots,x_n)\in\{0,1,\ldots,k\}^n: \sum_{i=1}^nx_i=k\right\}$$

From this, we have for each $i$, $$V\left(X_i\mid \bar X=t\right)=t\left(1-\frac{1}{n}\right)\qquad,\qquad E\left(X_i\mid \bar X=t\right)=t$$

And for all $i\ne j$, $$E\left(X_iX_j\mid \bar X=t\right)=t\left(t-\frac{1}{n}\right)$$
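These three conditional moments can be sanity-checked by direct enumeration: on the event $\bar X=k/n$, the conditional weights are proportional to $1/(x_1!\cdots x_n!)$, since the factor $\lambda^k e^{-n\lambda}$ is common to all states. A small Python sketch using exact rationals (the choice $n=3$, $k=4$ is arbitrary):

```python
import itertools
import math
from fractions import Fraction

n, k = 3, 4                    # arbitrary small case; t = k/n
t = Fraction(k, n)

# All (x_1, ..., x_n) with nonnegative entries summing to k.
states = [x for x in itertools.product(range(k + 1), repeat=n) if sum(x) == k]

# Joint Poisson pmf is proportional to 1 / (x_1! ... x_n!); lambda^k cancels.
weights = {x: Fraction(1, math.prod(math.factorial(xi) for xi in x)) for x in states}
Z = sum(weights.values())

def cond_E(f):
    """Conditional expectation of f(x) given the sum equals k."""
    return sum(w * f(x) for x, w in weights.items()) / Z

E_X1   = cond_E(lambda x: x[0])
E_X1sq = cond_E(lambda x: x[0] ** 2)
E_X1X2 = cond_E(lambda x: x[0] * x[1])

assert E_X1 == t                                  # E(X_i | Xbar = t) = t
assert E_X1sq - E_X1 ** 2 == t * (1 - Fraction(1, n))   # V(X_i | Xbar = t)
assert E_X1X2 == t * (t - Fraction(1, n))         # E(X_i X_j | Xbar = t), i != j
```

Exact `Fraction` arithmetic makes the equalities hold identically rather than up to floating-point error.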

So,

\begin{align}
E\left(X_i^2\mid \bar X=t\right)&=V\left(X_i\mid \bar X=t\right)+\left[E\left(X_i\mid \bar X=t\right)\right]^2
\\&=\frac{t}{n}(n+nt-1)
\end{align}

Also, as expected,

\begin{align}
E\left(\bar X^2\mid \bar X=t\right)&=E\left[\frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^nX_iX_j\mid \bar X=t\right]
\\&=\frac{1}{n}E\left(X_1^2\mid \bar X=t\right)+\frac{1}{n^2}\sum_{i\ne j}E\left[X_iX_j\mid \bar X=t\right]
\\&=\frac{1}{n}\cdot\frac{t}{n}(n-1+nt)+\frac{2}{n^2}\binom{n}{2}t\left(t-\frac{1}{n}\right)
\\&=t^2
\end{align}

So from $(1)$ I finally get,

\begin{align}
E(S^2\mid \bar X=t)&=\frac{n}{n-1}\cdot\frac{t}{n}(n+nt-1)-\frac{n}{n-1}\cdot t^2
\\&=t
\end{align}

Hence proved.
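The final identity $E(S^2\mid \bar X=t)=t$ can also be confirmed end-to-end by the same enumeration, computing $S^2$ directly state by state (a Python sketch with arbitrary small values $n=3$, $k=5$):

```python
import itertools
import math
from fractions import Fraction

n, k = 3, 5                    # arbitrary small case; condition on sum = k
t = Fraction(k, n)
xbar = t                       # Xbar is fixed at k/n on this event

states = [x for x in itertools.product(range(k + 1), repeat=n) if sum(x) == k]
# Conditional weights proportional to 1 / (x_1! ... x_n!); lambda^k cancels.
weights = [Fraction(1, math.prod(math.factorial(xi) for xi in x)) for x in states]
Z = sum(weights)

def s2(x):
    """Sample variance S^2 evaluated at a state x with mean xbar."""
    return sum((xi - xbar) ** 2 for xi in x) / (n - 1)

E_S2 = sum(w * s2(x) for x, w in zip(states, weights)) / Z
assert E_S2 == t               # E(S^2 | Xbar = t) = t, exactly
```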

(Thanks to Mike Earnest in particular.)

Best Answer

\begin{align}
& \Pr(X_1=x_1 \mid \overline X = x/n) = \Pr(X_1=x_1\mid X_1+\cdots+X_n = x) \\[10pt]
= {} & \frac{\Pr(X_1=x_1\ \&\ X_1+\cdots+X_n = x)}{\Pr(X_1+\cdots+X_n = x)} \\[10pt]
= {} & \frac{\Pr(X_1=x_1\ \&\ X_2+\cdots+X_n = x-x_1)}{\Pr(X_1+\cdots+X_n = x)} \\[10pt]
= {} & \frac{\dfrac{\lambda^{x_1} e^{-\lambda}}{x_1!} \cdot \dfrac{((n-1)\lambda)^{x-x_1} e^{-(n-1)\lambda}}{(x-x_1)!}}{\left( \dfrac{(n\lambda)^x e^{-n\lambda}}{x!} \right)} = \binom x {x_1} \left( \frac 1 n \right)^{x_1} \left( 1 - \frac 1 n \right)^{x-x_1}
\end{align}
In other words,
$$ X_1\mid \overline X \sim \operatorname{Binomial} \left(n\overline X, \frac 1 n \right). $$
Since the conditional mean is $n\overline X\cdot\frac 1 n=\overline X$, the left-hand side below is exactly the conditional variance of this binomial distribution:
$$ \operatorname E\left((X_1-\overline X)^2 \mid \overline X\right) = n\overline X\cdot\frac 1 n\left( 1 - \frac 1 n \right) = \overline X\left( 1 - \frac 1 n \right). $$
Consequently
$$ \operatorname E\left( (X_1-\overline X)^2 + \cdots + (X_n-\overline X)^2 \mid \overline X\right) = (n-1)\overline X. $$
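The binomial identification above can be checked numerically: given $n\overline X = x$, the pmf of $X_1\mid\overline X$ is $\operatorname{Binomial}(x, 1/n)$, and its second central moment about $\overline X$ should equal $\overline X(1-1/n)$. A Python sketch with exact rationals (the values $n=4$, $x=6$ are arbitrary):

```python
import math
from fractions import Fraction

n, x = 4, 6                    # condition on X_1 + ... + X_n = x, so Xbar = x/n
xbar = Fraction(x, n)
p = Fraction(1, n)

# pmf of X_1 | Xbar : Binomial(x, 1/n), built from the closed form above
pmf = [math.comb(x, j) * p**j * (1 - p) ** (x - j) for j in range(x + 1)]
assert sum(pmf) == 1           # a valid probability distribution

# Second moment about Xbar = the conditional variance, since the mean is x/n = Xbar
E_sq_dev = sum(pmf[j] * (j - xbar) ** 2 for j in range(x + 1))
assert E_sq_dev == xbar * (1 - p)          # = Xbar (1 - 1/n)
assert n * E_sq_dev == (n - 1) * xbar      # summing over i gives (n-1) Xbar
```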
