[Math] Cramer-Rao and Efficient Estimators

order-statistics, probability, probability-theory, statistics

Let $X_1,X_2,\ldots,X_n$ be a random sample from the exponential distribution with PDF $f(x;\lambda)=\frac{1}{\lambda}e^{-x/\lambda}\,\chi\{x>0\}$.

A) Find the Cramer-Rao lower bound for the variance of unbiased estimators of $\theta=\lambda^2$.

B) Determine $k$ so that $W=k\sum\limits_{i=1}^n X_i^2$ is unbiased for $\theta$. Is $W$ an efficient estimator of $\theta$? (Recall: $E(X_i^2)=2\lambda^2$ and $E(X_i^4)=24\lambda^4$.)
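
As a quick sanity check of the recalled moments (my own addition, not part of the problem; a symbolic sketch assuming SymPy is available), one can verify $E(X^2)=2\lambda^2$ and $E(X^4)=24\lambda^4$ directly from the PDF:

```python
import sympy as sp

x, lam = sp.symbols('x lambda', positive=True)
f = sp.exp(-x / lam) / lam  # the PDF from the problem statement

# Verify the recalled moments by direct integration against the PDF
print(sp.integrate(x**2 * f, (x, 0, sp.oo)))  # 2*lambda**2
print(sp.integrate(x**4 * f, (x, 0, sp.oo)))  # 24*lambda**4
```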

This is a homework problem. The notes in class and the book only cover the C.R.L.B. for $f_Y(x;\lambda)$ and comparing it to the variance of a given estimator, which I understand. I don't see the connection, though, for using it on the variance of unbiased estimators. Is it just different terminology for the same thing, or what is it that I'm missing? Can someone help me get started in the right direction?

Every online source seems to reference Fisher information, which we haven't covered yet.

Best Answer

Define
$$ I(\lambda)\equiv E\left(\left[\frac{\partial\log l(\boldsymbol{X};\lambda)}{\partial\lambda}\right]^2\right), $$
where $l(\boldsymbol{X};\lambda)$ denotes the joint likelihood:
$$ l(\boldsymbol{X};\lambda)=\prod_i\frac{1}{\lambda}\exp(-X_i/\lambda)=\frac{1}{\lambda^n}\exp\left(-\sum_iX_i/\lambda\right)\implies\log l(\boldsymbol{X};\lambda)=-\frac{S_n}{\lambda}-n\log(\lambda). $$
Here $S_n\equiv\sum_iX_i$. With this, you can compute
$$ \left[\frac{\partial\log l(\boldsymbol{X};\lambda)}{\partial\lambda}\right]^2=\left(\frac{S_n}{\lambda^2}-\frac{n}{\lambda}\right)^2=\frac{1}{\lambda^4}\left(S_n^2-2\lambda nS_n+\lambda^2n^2\right). $$
Because of independent sampling,
$$ E\left[\left(\sum_iX_i\right)^2\right]=nE(X_i^2)+(n^2-n)E(X_i)^2=n(2\lambda^2)+(n^2-n)\lambda^2=(n^2+n)\lambda^2, $$
$$ E\left(\sum_iX_i\right)=nE(X_i)=n\lambda. $$
It follows that
$$ I(\lambda)=E\left(\left[\frac{\partial\log l(\boldsymbol{X};\lambda)}{\partial\lambda}\right]^2\right)=\frac{1}{\lambda^2}(n^2+n-2n^2+n^2)=\frac{n}{\lambda^2}. $$
This $I(\lambda)$ is called the Fisher information for $\lambda$ in the joint likelihood $l(\boldsymbol{X};\lambda)$. The Cramer-Rao lower bound (also known as the Frechet-Darmois-Cramer-Rao lower bound) for estimating $g(\lambda)=\lambda^2$ is
$$ \frac{[g'(\lambda)]^2}{I(\lambda)}=\boxed{\frac{4\lambda^4}{n}}. $$
This completes (a).

For (b), note that $E(W)=knE(X_i^2)=2kn\lambda^2$, so $k=\frac{1}{2n}$ makes $W=k\sum_iX_i^2$ unbiased for $\theta=\lambda^2$. We compute
$$ E(W^2)=k^2E\left[\left(\sum_iX_i^2\right)^2\right]=k^2\left(nE(X_i^4)+(n^2-n)E(X_i^2)^2\right)=\lambda^4\left(1+\frac{5}{n}\right). $$
This implies
$$ \text{Var}(W)=E(W^2)-E(W)^2=\frac{5\lambda^4}{n}>\frac{4\lambda^4}{n}. $$
The variance of $W$ strictly exceeds the Cramer-Rao bound, so $W$ is not an efficient estimator of $\lambda^2$.
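
A Monte Carlo sanity check of part (b) (my own sketch, not part of the original answer; $\lambda$, $n$, and the replication count are arbitrary choices, and NumPy's `scale` parameter equals the mean, matching the parameterization above):

```python
import numpy as np

# Simulate many realizations of W = (1/(2n)) * sum(X_i^2) and compare
# Var(W) with the theoretical 5*lam^4/n and the CRLB 4*lam^4/n.
rng = np.random.default_rng(0)
lam, n, reps = 2.0, 50, 200_000

samples = rng.exponential(scale=lam, size=(reps, n))  # scale = mean = lambda
k = 1.0 / (2 * n)
W = k * (samples**2).sum(axis=1)  # one realization of W per replication

print("E(W)   ~", W.mean(), "  target:", lam**2)        # unbiasedness check
print("Var(W) ~", W.var(), "  theory:", 5 * lam**4 / n)
print("CRLB    =", 4 * lam**4 / n)                      # strictly below Var(W)
```

With these settings the empirical variance lands near $5\lambda^4/n=1.6$, visibly above the bound $4\lambda^4/n=1.28$, illustrating the inefficiency.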


Simplifications:

  • (a) The Fisher information for $\lambda$ for the joint likelihood is $n$ times the Fisher information for $\lambda$ for the individual likelihood, and the latter is easier to compute (see the symbolic sketch after this list).
  • (b) In fact, a general result implies that only affine transformations of $\lambda$ can be estimated efficiently: the CRLB is attained only when the score factors as $a(\lambda)\,(T-g(\lambda))$ for some statistic $T$, and here the score is $\frac{n}{\lambda^2}(\bar X-\lambda)$. Because $\lambda\mapsto\lambda^2$ is not affine, you can conclude without any computation that $W$ with $k=1/(2n)$ is unbiased but inefficient for $\lambda^2$.
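
To illustrate simplification (a), here is a small symbolic sketch (my addition, assuming SymPy; the symbol names are mine): it computes the single-observation Fisher information by integrating the squared score against the density, multiplies by $n$, and recovers the bound from part (a).

```python
import sympy as sp

x, lam, n = sp.symbols('x lambda n', positive=True)

# Single-observation density and score for Exp(mean = lambda)
f = sp.exp(-x / lam) / lam
score = sp.diff(sp.log(f), lam)

# Per-observation Fisher information: E[score^2] = 1/lambda^2
I1 = sp.integrate(score**2 * f, (x, 0, sp.oo))
print(sp.simplify(I1))  # 1/lambda**2

# Joint information is n*I1; CRLB for g(lambda) = lambda^2
g = lam**2
crlb = sp.diff(g, lam)**2 / (n * I1)
print(sp.simplify(crlb))  # 4*lambda**4/n
```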