UMVUE for Bernoulli Distribution – How to Achieve UMVUE for Bernoulli Distributions

conditional-expectation, mathematical-statistics, self-study, umvue, unbiased-estimator

Let $X_1,\dots,X_n$ be independent and $Bin(1,\theta)$ distributed. I would like to find the UMVUE for $\phi(\theta)=\theta^3$. I have a complete and sufficient statistic in $T=\sum_iX_i$, and an unbiased estimator in $S=X_1X_2X_3$. I've tried to apply Rao–Blackwell:

$$\mathbb{E_{\theta}}(X_1X_2X_3\mid T)$$

I'm not sure how to proceed, as the individual terms are not independent of $T$.

Best Answer

First, observe that $S = 1$ if and only if $X_1 = 1$ and $X_2 = 1$ and $X_3 = 1$. Given that we have drawn $n$ samples and $T$ of them are equal to one, the conditional probability that $S = 1$ is:

$$P(S=1\mid T) = \max\left(0, {T \over n}{T-1\over n-1}{T-2\over n-2}\right)$$

where the $(T-1)/(n-1)$ and $(T-2)/(n-2)$ terms come about because we are, in effect, sampling without replacement from our $n$ observations; for example, once we know that $X_1 = 1$, there are only $n-1$ values left that $X_2$ might take on, and only $T-1$ of them are equal to one. If $T < 3$, the probability that $S=1$ is of course 0; the product already vanishes for $T \in \{0, 1, 2\}$, so the $\max$ is only a safeguard.
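The without-replacement argument can also be checked by direct counting. As a sketch (assuming $n \ge 3$, and writing $t$ for an observed value of $T$): given $T = t$, every arrangement of $t$ ones among the $n$ positions has the same probability $\theta^t(1-\theta)^{n-t}$, so the conditional probability is a ratio of counts that does not depend on $\theta$:

$$P(X_1=X_2=X_3=1\mid T=t) = \frac{\binom{n-3}{t-3}}{\binom{n}{t}} = \frac{t(t-1)(t-2)}{n(n-1)(n-2)}, \qquad 3 \le t \le n,$$

and the probability is $0$ for $t < 3$, since fewer than three ones cannot fill the first three positions.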

Consequently, since $S$ only takes the values 0 and 1, $\mathbb{E_{\theta}}(X_1X_2X_3\mid T) = 1\cdot P(S=1\mid T) + 0\cdot P(S=0\mid T)$, which just equals $P(S=1\mid T)$, and so

$$\mathbb{E_{\theta}}(X_1X_2X_3\mid T) = \max\left(0, {T \over n}{T-1\over n-1}{T-2\over n-2}\right).$$
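As a sanity check on unbiasedness, one can use the third factorial moment of a binomial variable (a standard fact that is not stated in the question): for $T \sim Bin(n,\theta)$, $\mathbb{E}_{\theta}[T(T-1)(T-2)] = n(n-1)(n-2)\,\theta^3$. Hence, for $n \ge 3$,

$$\mathbb{E}_{\theta}\left[\frac{T(T-1)(T-2)}{n(n-1)(n-2)}\right] = \frac{n(n-1)(n-2)\,\theta^3}{n(n-1)(n-2)} = \theta^3 .$$

Since this estimator is unbiased and is a function of the complete sufficient statistic $T$ alone, the Lehmann–Scheffé theorem implies that it is the UMVUE of $\theta^3$.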

Let's try an experiment to see how much Rao-Blackwell has improved our estimator. We're going to use a smallish sample size, $n=10$, and set $\theta = 0.6$. The code below is horribly inefficient but, for people who may not be so familiar with R, should be pretty easy to follow:

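n <- 10        # sample size
theta <- 0.6   # true success probability

# One row per simulated sample; both columns are filled in by the loop
dt <- data.frame(Unbiased = rep(0,100000),
                 Rao_Blackwell = rep(0,100000))

for (i in 1:nrow(dt)) {
   # Draw X_1, ..., X_n (note: T_sample holds the whole sample, not T itself)
   T_sample <- rbinom(n, 1, theta)
   # Initial unbiased estimator S = X_1 * X_2 * X_3
   dt$Unbiased[i] <- T_sample[1] * T_sample[2] * T_sample[3]
   # Sufficient statistic T = number of ones in the sample
   nT <- sum(T_sample)
   # Rao-Blackwellized estimator E(S | T)
   dt$Rao_Blackwell[i] <- max(0, nT * (nT-1) * (nT-2) / (n * (n-1) * (n-2)))
}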

Now for the output:

> dt <- dt - theta^3
> 
> # Check the bias; it should be tiny, due only to sampling
> colMeans(dt)
     Unbiased Rao_Blackwell 
 0.0011700000  0.0007798333 
> 
> # Compare the variances
> apply(dt, 2, var)
     Unbiased Rao_Blackwell 
   0.17000889    0.03240376 
>

Rao-Blackwellization has reduced the variance of the estimator by a factor of about 5.2 (0.170 / 0.032), noticeably more than the factor of 3.3 which is the ratio of the complete sample size (10) to the number of observations used to construct the initial unbiased estimator (3).
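The simulated variances can be cross-checked without any random sampling, because $T \sim Bin(n,\theta)$ has a known distribution and $S$ is itself a Bernoulli variable with success probability $\theta^3$. The following is a minimal sketch using the same $n$ and $\theta$ as above; the helper name `rb_estimate` is made up for this illustration and is not part of the original answer.

```r
# Exact mean and variance of both estimators for n = 10, theta = 0.6
n <- 10
theta <- 0.6

# Rao-Blackwell estimator as a function of t, the number of ones
# (hypothetical helper name, introduced only for this sketch)
rb_estimate <- function(t, n) t * (t - 1) * (t - 2) / (n * (n - 1) * (n - 2))

t_vals <- 0:n
pmf <- dbinom(t_vals, size = n, prob = theta)  # P(T = t) for t = 0, ..., n
g <- rb_estimate(t_vals, n)                    # estimator value at each t

mean_rb <- sum(pmf * g)                # expectation; should equal theta^3
var_rb  <- sum(pmf * g^2) - mean_rb^2  # exact variance of the Rao-Blackwell estimator

# S = X_1 * X_2 * X_3 is Bernoulli(theta^3), so Var(S) = p * (1 - p) with p = theta^3
var_s <- theta^3 * (1 - theta^3)

print(c(mean_rb = mean_rb, var_s = var_s, var_rb = var_rb, ratio = var_s / var_rb))
```

With these settings the exact variances come out at roughly 0.169 and 0.032, in line with the simulated values above.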