Confusion about Fisher information and the Cramér-Rao lower bound.

fisher information, log likelihood, maximum likelihood, probability, statistics

I recently learned about Fisher information and the Cramér-Rao lower bound, but there is something bothering me. Take, for example, a Poisson distribution: for $n$ i.i.d. observations $y_1,\dots,y_n$, the likelihood function is

$$\mathcal{L}(\theta;\pmb{Y}=y)=\exp\{-n\theta\}\dfrac{\theta^{\sum_{i=1}^{n}y_i}}{\prod_{i=1}^{n}y_i!}$$
The log-likelihood $\ell(\theta)$ is
$$\ell(\theta)=-n\theta+\left(\sum_{i=1}^{n}y_i\right)\log\theta-\log\left(\prod_{i=1}^{n}y_i!\right)$$
Now,
$$\dfrac{\partial\ell}{\partial\theta}=-n+\dfrac{\sum_{i=1}^{n}y_i}{\theta};\qquad\dfrac{\partial\ell}{\partial\theta}=0\Longrightarrow \hat{\theta}=\dfrac{\sum_{i=1}^{n}y_i}{n}\quad(\hat{\theta}\text{ is our MLE})$$
Furthermore,
$$\dfrac{\partial^2\ell}{\partial\theta^2}=\dfrac{-\sum_{i=1}^{n}y_i}{\theta^2}$$
On the Fisher information wikipedia page, Fisher information is defined as
$$I(\theta)=-\mathbb{E}\left[\dfrac{\partial^2}{\partial\theta^2}\ell(\theta)\right]$$
But on the Cramér-Rao lower bound page, the Fisher information is defined as
$$I(\theta)=-n\mathbb{E}\left[\dfrac{\partial^2}{\partial\theta^2}\ell(\theta)\right]$$

Now, with respect to the Poisson example above: in the first case, the Fisher information is
$$-\mathbb{E}\left[\frac{-\sum_{i=1}^{n}y_i}{\theta^2}\right]=-\dfrac{(-n\theta)}{\theta^2}=\dfrac{n}{\theta}$$

So what I believed to be the Cramér-Rao lower bound is $\mathrm{var}(\hat{\theta})\geq\dfrac{1}{I(\theta)}=\dfrac{1}{n/\theta}=\dfrac{\theta}{n}$, which seems to be the correct CRLB; but in the second case I would get $\dfrac{\theta}{n^2}$. So I am not sure which formula is correct, or whether I am misunderstanding the meaning of the $n$ in the Fisher information.

Best Answer

The difference between the formulas is that the page on Fisher information defines the information for a single outcome of $X$, while the page on the CRLB reports the information for a sample of $n$ observations. On the CRLB page, $\ell$ denotes the log-likelihood of a single observation, and the factor $n$ scales that up to the whole sample. In your derivation, $\ell$ is already the log-likelihood of all $n$ observations, so $-\mathbb{E}\left[\partial^2\ell/\partial\theta^2\right]$ is the full-sample information and no extra factor of $n$ is needed. Either way, the final formula for $I(\theta)$ turns out the same.
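
To make this concrete, here is the single-observation calculation for the Poisson case, writing $\ell_1$ for the log-likelihood of one observation (a notation introduced here just for this check):

$$\ell_1(\theta)=-\theta+y\log\theta-\log(y!),\qquad\dfrac{\partial^2\ell_1}{\partial\theta^2}=-\dfrac{y}{\theta^2},\qquad I_1(\theta)=-\mathbb{E}\left[-\dfrac{Y}{\theta^2}\right]=\dfrac{\theta}{\theta^2}=\dfrac{1}{\theta}$$

so that $n\,I_1(\theta)=n/\theta$, matching your full-sample computation exactly.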

In your context the Fisher information for a sample of $n$ Poisson observations will be $n/\theta$, not $n^2/\theta$.
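
If it helps, here is a minimal simulation sketch (assuming NumPy; the values of $\theta$, $n$, and the replicate count are illustrative, not from the question) comparing the empirical variance of $\hat{\theta}$ with the two candidate bounds:

```python
import numpy as np

# Simulate many Poisson samples and compare the empirical variance of
# the MLE theta_hat = mean(y) with theta/n and theta/n**2.
rng = np.random.default_rng(0)       # seed fixed for reproducibility
theta, n, reps = 3.0, 50, 100_000    # illustrative values

samples = rng.poisson(theta, size=(reps, n))
theta_hat = samples.mean(axis=1)     # MLE for each replicate

print("empirical var(theta_hat):", theta_hat.var())
print("theta / n               :", theta / n)      # CRLB with I(theta) = n/theta
print("theta / n**2            :", theta / n**2)   # the second (mistaken) reading
```

The empirical variance should sit very close to $\theta/n$ (the sample mean actually attains the bound here, since it is an efficient estimator of the Poisson mean) and nowhere near $\theta/n^2$.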
