Rao-Blackwell and Cramer-Rao LB comparison

parameter estimation, probability distributions, statistics, variance

Let $X_1, X_2, \dots, X_n$ be a random sample from the Geometric distribution with success probability $p$, supported on $\{1, 2, \dots\}$, so that the joint pmf is

$$
\prod\limits_{i=1}^{n} f(x_i|p) = (1-p)^{\sum\limits_{i=1}^n x_i-n}p^n
$$

The factorization theorem shows that the statistic
$$
T = \sum\limits_{i=1}^n x_i
$$

is sufficient, and since the Geometric pmf belongs to a one-parameter exponential family, $T$ is also complete. Then,
$$
E[T] = E\left[ \sum\limits_{i=1}^n x_i \right] = \sum\limits_{i=1}^n E[x_i] = \frac{n}{p}
$$

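Therefore, since $T/n$ is an unbiased function of the complete sufficient statistic $T$, Rao-Blackwell combined with the Lehmann-Scheffé theorem gives that the Minimum Variance Unbiased Estimator of $\frac{1}{p}$ is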
$$
W = \frac{1}{n} \sum\limits_{i=1}^n x_i
$$

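Now, Cramer-Rao's Lower Bound for unbiased estimators of $\frac{1}{p}$, using the per-observation Fisher information $I(p) = \frac{1}{p^2(1-p)}$:

$$
LB = \frac{\left[\left(\frac{1}{p}\right)'\right]^2}{nI(p)} = \frac{1/p^4}{n/\left(p^2(1-p)\right)} = \frac{1}{n}\frac{1-p}{p^2}
$$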

Question: Is $V[W] = LB$ in this specific example? If so, is there a reason why $W$ has the lowest possible variance that has something to do with the geometric distribution?

Best Answer

The joint pmf of the sample $X_1,X_2,\ldots,X_n$ is

$$f_p(x_1,\ldots,x_n)=p^n (1-p)^{\sum_{i=1}^n x_i-n}\,\mathbf1_{x_1,\ldots,x_n\in\{1,2,\ldots\}}$$

Therefore, we have

\begin{align} \frac{\partial}{\partial p}\ln f_p(x_1,\ldots,x_n)&=\frac{n}{p}-\frac{1}{1-p}\left(\sum_{i=1}^n x_i-n\right) \\&=\frac{n}{p(1-p)}-\frac{1}{1-p}\sum_{i=1}^n x_i \\&=-\frac{n}{1-p}\left(\frac{1}{n}\sum_{i=1}^n x_i-\frac{1}{p}\right) \end{align}
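As a quick sanity check on the algebra above, the closed form of the score can be compared with a finite-difference derivative of the log-likelihood. This is only a minimal sketch: the function names, the sample `xs`, and the step size `h` are hypothetical choices made for illustration and do not come from the original post.

```python
import math

def log_likelihood(p, xs):
    """Log-likelihood of an i.i.d. Geometric(p) sample supported on {1, 2, ...}."""
    n = len(xs)
    return n * math.log(p) + (sum(xs) - n) * math.log(1 - p)

def score_closed_form(p, xs):
    """Last line of the derivation: -n/(1-p) * (sample mean - 1/p)."""
    n = len(xs)
    return -n / (1 - p) * (sum(xs) / n - 1 / p)

def score_numeric(p, xs, h=1e-6):
    """Central finite-difference approximation of d/dp of the log-likelihood."""
    return (log_likelihood(p + h, xs) - log_likelihood(p - h, xs)) / (2 * h)

xs = [3, 1, 4, 1, 5]  # hypothetical sample, chosen only for illustration
p = 0.3               # hypothetical parameter value
print(score_closed_form(p, xs), score_numeric(p, xs))
```

The two printed values should agree up to the finite-difference error, which is what the three equal expressions in the derivation predict.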

Thus the score function can be expressed in the form $$\frac{\partial}{\partial p}\ln f_p(x_1,\ldots,x_n)=k(p)\left(T(x_1,\ldots,x_n)-g(p)\right)\tag{*}$$

for some statistic $T$ and some parametric function $g(p)$, where $k(p)$ does not depend on the data; here $k(p)=-\frac{n}{1-p}$. Condition $(*)$ is the equality condition of the Cramer-Rao inequality, which shows directly that the variance of $T=\frac{1}{n}\sum\limits_{i=1}^n X_i$ attains the Cramer-Rao lower bound for $g(p)=1/p$. Since $T$ is unbiased for $1/p$, this also proves that $T$ is the uniformly minimum variance unbiased estimator of $1/p$.
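The attainment of the bound can also be checked numerically. The sketch below computes the variance of one observation and the per-observation Fisher information directly from the pmf by truncated summation, then compares the variance of the sample mean with the Cramer-Rao bound for $1/p$. The helper names and the truncation point `x_max` are assumptions made for this illustration, not part of the original answer.

```python
def geometric_pmf(x, p):
    """P(X = x) for the Geometric(p) distribution on {1, 2, ...}."""
    return p * (1 - p) ** (x - 1)

def variance_and_information(p, x_max=5000):
    """Return (Var[X], I(p)) for one observation via truncated summation.

    x_max is a hypothetical cutoff; the neglected tail mass is (1-p)**x_max.
    """
    var = 0.0
    info = 0.0
    for x in range(1, x_max + 1):
        w = geometric_pmf(x, p)
        var += w * (x - 1 / p) ** 2          # E[(X - 1/p)^2]
        score = 1 / p - (x - 1) / (1 - p)    # d/dp of ln pmf for one observation
        info += w * score ** 2               # E[score^2]
    return var, info

def compare(p, n):
    """Return (Var of the sample mean, Cramer-Rao bound for g(p) = 1/p)."""
    var_x, info = variance_and_information(p)
    var_mean = var_x / n
    g_prime = -1 / p ** 2                    # derivative of g(p) = 1/p
    bound = g_prime ** 2 / (n * info)
    return var_mean, bound

var_mean, bound = compare(p=0.3, n=10)
print(var_mean, bound)
```

Both numbers should coincide up to floating-point error, and both should match $\frac{1-p}{np^2}$ for the chosen $p$ and $n$.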

This fact has more to do with properties of the canonical exponential family than with the geometric distribution in particular. Here is another example.
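The general mechanism can be sketched as follows; the notation $h$, $\eta$, $t$ and $A$ is assumed here and does not appear in the original post. Suppose a single observation has pmf or pdf in canonical form

$$f(x\mid\eta)=h(x)\exp\{\eta\,t(x)-A(\eta)\}$$

Then, for a sample of size $n$, the score with respect to $\eta$ is

$$\frac{\partial}{\partial \eta}\ln f_\eta(x_1,\ldots,x_n)=\sum_{i=1}^n t(x_i)-nA'(\eta)=n\left(\frac{1}{n}\sum_{i=1}^n t(x_i)-A'(\eta)\right)$$

which is already of the form $(*)$ with $k(\eta)=n$ and $g(\eta)=A'(\eta)=E_\eta[t(X)]$, under the usual regularity conditions. For the geometric distribution one can take $\eta=\ln(1-p)$, $t(x)=x$ and $A(\eta)=\eta-\ln(1-e^{\eta})$, which gives $A'(\eta)=\frac{1}{1-e^{\eta}}=\frac{1}{p}$, in agreement with the computation above.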