I wish to find the variances of the ML estimators for a linear model fitted to data that are distributed according to a Poisson distribution:
$$p_i(O_i;E) = \frac{E^{O_i} \exp(-E)}{O_i!}$$
I've found how to construct the MLE and find its variance [1] when I only wish to estimate a constant expected value from $N$ observations $O_1, \dots, O_N$:
$$\hat{E} = \frac 1 N \sum_{i=1}^N O_i $$
$$\mathrm{var}(\hat{E}) = \left. \left[ -\frac{\partial^2 \ln L}{\partial E^2} \right]^{-1} \right |_{E=\hat{E}} = \frac{\hat{E}^2}{\sum_{i=1}^N O_i} $$
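To make the one-parameter case concrete, here is a minimal sketch (the counts are illustrative, not taken from any real data); note that $\hat{E}^2/\sum O_i$ simplifies to $\hat{E}/N$:

```python
# Illustrative counts (not real data): Poisson MLE and its estimated variance.
O = [4, 7, 5, 6, 3, 5, 4, 6]   # observed counts O_1, ..., O_N
N = len(O)

E_hat = sum(O) / N              # MLE: the sample mean
var_E_hat = E_hat**2 / sum(O)   # E_hat^2 / sum(O_i), which equals E_hat / N

print(E_hat, var_E_hat)         # 5.0 0.625
```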
I would like to generalise this to where the expected value is now a linear function.
$$P_i(O_i;a_1, a_2;x_i) = \frac{(a_1+a_2x_i)^{O_i} \exp(-(a_1+a_2x_i))}{O_i!}$$
where $x$ is some independent variable.
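For reference, the log-likelihood whose derivatives are taken below is
$$\ln L(a_1, a_2) = \sum_{i=1}^N \left[ O_i \ln(a_1 + a_2 x_i) - (a_1 + a_2 x_i) - \ln O_i! \right]$$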
Setting $\frac{\partial \ln L}{\partial a_1}=0$ and $\frac{\partial \ln L}{\partial a_2}=0$ shows that I must numerically solve
$$N=\sum_{i=1}^N\frac{O_i}{a_1+a_2x_i} $$
and
$$\sum_{i=1}^Nx_i = \sum_{i=1}^N\frac{O_ix_i}{a_1+a_2x_i} $$
to find $a_1$ and $a_2$.
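One possible way to solve this pair of equations (a sketch, not necessarily the method intended in [1]) is Newton's method on the score, since the Jacobian is available in closed form. The function name `fit_poisson_line`, the starting values, and the tolerances are my own choices:

```python
# Sketch: solve the two likelihood equations for (a1, a2) by Newton's method.
def fit_poisson_line(x, O, a1=1.0, a2=1.0, tol=1e-10, max_iter=100):
    N = len(O)
    for _ in range(max_iter):
        mu = [a1 + a2 * xi for xi in x]
        # score components: d lnL / d a1 and d lnL / d a2
        f1 = sum(Oi / mi for Oi, mi in zip(O, mu)) - N
        f2 = sum(Oi * xi / mi for Oi, xi, mi in zip(O, x, mu)) - sum(x)
        # Jacobian of the score (the negative of the observed information)
        J11 = -sum(Oi / mi**2 for Oi, mi in zip(O, mu))
        J12 = -sum(Oi * xi / mi**2 for Oi, xi, mi in zip(O, x, mu))
        J22 = -sum(Oi * xi**2 / mi**2 for Oi, xi, mi in zip(O, x, mu))
        det = J11 * J22 - J12 * J12
        # Newton step: solve the 2x2 system J * delta = -f by hand
        d1 = (-f1 * J22 + f2 * J12) / det
        d2 = (f1 * J12 - f2 * J11) / det
        a1, a2 = a1 + d1, a2 + d2
        if abs(d1) + abs(d2) < tol:
            break
    return a1, a2
```

Since $\ln L$ is concave in $(a_1, a_2)$ wherever all $a_1 + a_2 x_i > 0$, the iteration is well behaved provided the iterates keep the means positive.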
How can I now find the variance of $a_1$ and $a_2$?
In $\S$7.3 of [1], Martin states that the variance matrix, $\mathbf{V}$, for a multivariate distribution is the inverse of the $\mathbf{M}$ matrix where
$$M_{ij} = -N \mathrm{E}\left[ \frac{\partial^2 \ln P(O; \theta_1, \theta_2,\dots,\theta_p)}{\partial \theta_i \, \partial \theta_j} \right] $$
where $\mathrm{E}[\dots]$ is the expectation value and $N$ is the number of observations.
If I use this definition, the expectation leaves me with a whole lot of $x_i$'s which I don't know what to do with.
I've also just found [2; eqn 47], in which the author also says that the variance matrix, $\mathbf{V}$, for a multivariate distribution is the inverse of the $\mathbf{M}$ matrix, except this time, where
$$M_{ij} = -\frac{\partial^2 \ln L}{\partial \theta_i\,\partial\theta_j}$$
Which is right? Is there another method?
[1] Martin (2012), Statistics for Physical Scientists: An Introduction, $\S$7.2
Best Answer
The deadlock is broken by Barlow [1].
The matrix of the variances of the ML estimators is given as the inverse of the $\mathbf{M}$ matrix, where
$$M_{ij} = \left. -\frac{\partial^2 \ln L}{\partial a_i \partial a_j}\right|_{a=\hat{a}}$$
It all seems self-consistent, and is valid in the limit of large $N$.
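In code, this prescription amounts to evaluating the second derivatives of $\ln L$ for the Poisson linear model at the fitted values and inverting the resulting $2\times 2$ matrix by hand (a sketch; the function name is mine):

```python
# Sketch: variance matrix V = M^{-1}, with M_ij = -d^2 lnL / da_i da_j
# evaluated at the fitted (a1, a2), for the Poisson linear model.
def variance_matrix(x, O, a1, a2):
    mu = [a1 + a2 * xi for xi in x]
    # second derivatives of lnL: -d^2 lnL / da_i da_j
    M11 = sum(Oi / mi**2 for Oi, mi in zip(O, mu))
    M12 = sum(Oi * xi / mi**2 for Oi, xi, mi in zip(O, x, mu))
    M22 = sum(Oi * xi**2 / mi**2 for Oi, xi, mi in zip(O, x, mu))
    det = M11 * M22 - M12 * M12
    # invert the symmetric 2x2 matrix
    return [[M22 / det, -M12 / det],
            [-M12 / det, M11 / det]]
```

With $\hat{a}_1, \hat{a}_2$ from the fit, the diagonal entries $V_{11}$ and $V_{22}$ are the variances of $\hat{a}_1$ and $\hat{a}_2$, and the off-diagonal entry is their covariance.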
[1] R.J. Barlow (1989), Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences, $\S$5.3.4