Maximum Likelihood – Estimation for Mixed Poisson and Gaussian Data

maximum likelihood, normal distribution, poisson distribution

Background

I've been doing a little bit of work lately on maximum likelihood estimation (MLE), for cases where the data is normally-distributed and also for cases where the data is Poisson distributed. For these two cases, the likelihood $L$ is given by:

$$
L\left ( \mathbf{a} \right ) = \prod_{i} \mathrm{P}\left ( c_{i};m_{\mathbf{a}}\left ( x_{i} \right ) \right )
$$

Following the notation and steps of the first reference below, $\mathrm{P}\left ( c_{i};m_{\mathbf{a}}\left ( x_{i} \right ) \right )$ is the probability that a measurement gives $c_{i}$ if the true value is given by the model $m_{\mathbf{a}}\left ( x_{i} \right )$, where $\mathbf{a}$ is the set of parameters for the model.

For the two different distributions, the reference gives:

$$
L_{G}\left ( \mathbf{a} \right ) = \prod_{i} \frac{1}{\sqrt{2\pi \sigma_{i} ^{2}}} \mathrm{e}^{-\frac{\left ( c_{i}-m_{\mathbf{a}}\left ( x_{i} \right ) \right )^{2}}{2\sigma_{i} ^{2}}}
$$

for Gaussian distributed data, and

$$
L_{P}\left ( \mathbf{a} \right ) = \prod_{i} \frac{\left [m_{\mathbf{a}}\left ( x_{i} \right )\right ]^{c_{i}}}{c_{i}!}\mathrm{e}^{-m_{\mathbf{a}}\left ( x_{i} \right )}
$$

for Poisson-distributed data.
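As a rough illustration of the two likelihoods above, here is a minimal Python sketch of their negative logs. The function names `gaussian_nll` and `poisson_nll` and the toy numbers are made up for this example and do not come from the reference; the model values $m_{\mathbf{a}}(x_i)$ are simply passed in as a list `m`.

```python
import math


def gaussian_nll(c, m, sigma):
    """Negative log of L_G for data c, model values m and standard deviations sigma."""
    return sum(
        0.5 * math.log(2.0 * math.pi * s * s) + (ci - mi) ** 2 / (2.0 * s * s)
        for ci, mi, s in zip(c, m, sigma)
    )


def poisson_nll(c, m):
    """Negative log of L_P; requires every model value m_i > 0."""
    # math.lgamma(c + 1) equals log(c!) and avoids overflow for large counts.
    return sum(
        mi - ci * math.log(mi) + math.lgamma(ci + 1.0)
        for ci, mi in zip(c, m)
    )


# Toy usage: four counts compared against a constant model (illustrative values only).
counts = [3, 5, 4, 6]
model = [4.5] * len(counts)
print(poisson_nll(counts, model))
print(gaussian_nll(counts, model, [math.sqrt(v) for v in model]))
```

Working with the negative log turns the products into sums, which is what makes the later likelihood-ratio step tractable.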

I'm happy following the subsequent steps to get a likelihood ratio test out. The steps involve taking the negative log of the likelihood, then forming the ratio of the likelihood maximised over the parameters $\mathbf{a}$ of a given model to the global maximum of the likelihood $L$, i.e.

$$
\frac{\mathrm{max}_{\mathbf{a}}\;L(\mathbf{c}|m_{\mathbf{a}})}{\mathrm{max}\;L(\mathbf{c}|\mathbf{m})}
$$
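To make the ratio concrete, suppose the global maximum in the denominator is attained by the saturated model $m_i = c_i$, and that the $\sigma_i$ are known and fixed (both are assumptions made here for illustration). Writing $\lambda$ for the ratio above, the two cases then work out to:

$$
-2\ln\lambda_{G} = \sum_{i} \frac{\left ( c_{i}-m_{\mathbf{a}}\left ( x_{i} \right ) \right )^{2}}{\sigma_{i}^{2}}
$$

$$
-2\ln\lambda_{P} = 2\sum_{i} \left [ m_{\mathbf{a}}\left ( x_{i} \right ) - c_{i} + c_{i}\ln\frac{c_{i}}{m_{\mathbf{a}}\left ( x_{i} \right )} \right ]
$$

where the term $c_{i}\ln\left ( c_{i}/m_{\mathbf{a}}\left ( x_{i} \right ) \right )$ is taken to be zero when $c_{i}=0$. The Gaussian case is the familiar chi-square sum; in the Poisson case the $c_{i}!$ factors cancel in the ratio.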

Edit

This question is related: What is distribution of $Z = X + Y$ where $X$ is Poisson distributed and $Y$ is normally distributed?

Question

I'm comfortable with the MLE for Gaussian and Poisson data. The scenario I'm interested in is for mixed Poisson-Gaussian data. An example might be photons arriving at a detector, which will exhibit Poisson noise, and the signal from the detector then subsequently being corrupted by Gaussian noise (e.g. thermal noise in the electronics). The noise model is then a mix of Poisson and Gaussian noise.

What I want to do is test whether a Poisson, Gaussian or mixed model is most appropriate for various parameters (I don't necessarily know whether the Poisson data can be approximated by a Gaussian by the way, hence the question – it's what I'm trying to test!).

What would an appropriate likelihood function $L$ be? Is it just a sum of the Poisson and Gaussian likelihoods? If so, the log-likelihood will be a bit tricky to simplify, I believe?

I'm sure I'm missing something with this problem, I just haven't been able to make the leap from the two cases described above (I should add that it's been a bit of time since I did statistics in any serious depth). Any help much appreciated!

References

For this problem, I've been following the paper "Comparison of maximum likelihood estimation and chi-square statistics applied to counting experiments" by T. Hauschild and M. Jentschel (2001).

DOI: 10.1016/S0168-9002(00)00756-7

Best Answer

I think it is not a mixture distribution, because you actually have a sum of two noise terms, not a weighted sum of two probability densities.

For the sum of the two noise terms, you would have to assume that they are independent (although in practice they are probably more likely to be positively correlated).

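Then the distribution of the sum $Z=X+Y$ is the convolution of the two distributions. Because the Poisson part $X$ only takes integer values, the convolution is a sum over counts rather than an integral:

$f_Z(z)=\sum_{k=0}^{\infty}p(k)\,f(z-k)$

where $f$ and $p$ are the Normal density and the Poisson probability mass function respectively. I suspect this has no simple closed form, so you would evaluate it numerically (truncating the sum) and then numerically maximise the log-likelihood of this density:

$\max_\theta \sum_{i=1}^{n} \log f_Z(z_i|\theta)$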

(also see pg.7/20 http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter7.pdf)
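A minimal sketch of this numerical approach, under stated assumptions: $X \sim \mathrm{Poisson}(\lambda)$ and $Y \sim N(0, \sigma^2)$ are independent, the data are simulated, and a coarse grid search stands in for a proper optimiser. Since the Poisson part takes only integer values, the convolution is evaluated as a truncated sum over counts $k$. All names (`pg_logpdf`, `pg_nll`, `sample_poisson`) and the grid ranges are hypothetical choices for this example.

```python
import math
import random


def pg_logpdf(z, lam, sigma, k_max=None):
    """Log density of Z = X + Y with X ~ Poisson(lam), Y ~ N(0, sigma^2), independent."""
    if k_max is None:
        # Truncate the sum well beyond the bulk of the Poisson mass.
        k_max = int(lam + 10.0 * math.sqrt(lam) + 20)
    # Each term is log Poisson pmf at k plus log Normal density of (z - k).
    terms = [
        k * math.log(lam) - lam - math.lgamma(k + 1.0)
        - 0.5 * math.log(2.0 * math.pi * sigma * sigma)
        - (z - k) ** 2 / (2.0 * sigma * sigma)
        for k in range(k_max + 1)
    ]
    top = max(terms)
    # Log-sum-exp keeps the sum stable when individual terms are tiny.
    return top + math.log(sum(math.exp(t - top) for t in terms))


def pg_nll(data, lam, sigma):
    """Negative log-likelihood of the data under the summed-noise model."""
    return -sum(pg_logpdf(z, lam, sigma) for z in data)


def sample_poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (small lam only)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


# Simulated data with assumed true values lam = 4.0 and sigma = 0.5.
rng = random.Random(0)
data = [sample_poisson(rng, 4.0) + rng.gauss(0.0, 0.5) for _ in range(200)]

# Coarse grid search over (lam, sigma); the minimum NLL is the maximum likelihood.
lam_grid = [3.0 + 0.25 * i for i in range(9)]    # 3.0 to 5.0
sigma_grid = [0.3 + 0.1 * j for j in range(5)]   # 0.3 to 0.7
best_nll, lam_hat, sigma_hat = min(
    (pg_nll(data, lam, s), lam, s) for lam in lam_grid for s in sigma_grid
)
print(lam_hat, sigma_hat, best_nll)
```

In a real fit, $\lambda$ would be replaced by the model $m_{\mathbf{a}}(x_i)$ at each point, and the grid search by a gradient-based or simplex optimiser started from a sensible initial guess.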

In my opinion you may as well fit each noise source separately, since you assume them to be independent anyway.

You may as well assume a mixture distribution though and fit it to the data (you may assume any density, by the way, and fit it); then you would also need to optimise or choose the weights:

$g(z)=w_1f(z)+w_2p(z)$, with $w_1+w_2=1$.
