Solved – Linear regression with shot noise

errors-in-variables · heteroscedasticity · maximum-likelihood · regression

I'm looking for the right statistical terminology to describe the following problem.

I want to characterize an electronics device that has a linear response

$Y = \beta_0 + \beta_1 X + \epsilon$

where $\epsilon \sim N(0,\sigma^2_{ro})$ is a term due to the read-out noise of the device. In order to determine $\beta_0, \beta_1, \sigma^2_{ro}$ I would measure a series of responses $\{X_i,Y_i\}$ and apply the standard linear regression toolbox. But I don't know the $X_i$ exactly, because I use a source that is affected by shot noise. That is, I know that if I set the dial on the source to a certain value $J_i$, then $X_i \sim N(\mu, \mu)$ (a Gaussian whose mean and variance both equal $\mu$), where $\mu$ is determined by $J_i$.

This looks like an errors-in-variables model of linear regression (http://en.wikipedia.org/wiki/Errors-in-variables_models), were it not for the fact that, in order to characterize my device over its entire input range, I have to change the value of $J_i$ during the measurements. The variance of the $X_i$ is therefore not fixed but depends on $J_i$; moreover, because of the shot noise, observing $X_i = X_j$ does not imply that the variance of $X_i$ equals the variance of $X_j$.

What is this model called, and are there articles where I can find out how such a problem is approached? Or am I formulating it in the wrong way?

Best Answer

The probability model for such shot noise is

$$X \sim \text{Poisson}(\mu),\quad Y|X \sim \text{Normal}(\beta_0+\beta_1 X, \sigma^2).$$

A good estimate of $\mu$ is the mean of the $X$ values, and a good estimate of $(\beta_0, \beta_1)$ is afforded by ordinary least squares, because, conditional on the $X_i$, the values of $Y$ are independent and normal with common variance.

The estimate of $\sigma^2$ given by OLS is inappropriate here, though, due to the randomness of $X$. The maximum likelihood estimate is

$$s^2 = \frac{S_{xy}^2 - 2 S_x S_y S_{xy} + S_{xx}\left(S_y^2 - S_{yy}\right) + S_x^2 S_{yy}}{S_x^2 - S_{xx}}.$$

In this notation, $S_x$ is the mean of the $X$ values, $S_{xy}$ is the mean of the products $X_i Y_i$, $S_{xx}$ is the mean of the $X_i^2$, and so on.
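As a sketch, this estimator transcribes directly into Python; the function and variable names below are mine, chosen for illustration:

```python
def shot_noise_sigma2(x, y):
    """MLE of sigma^2 from the closed form above.

    S_x and S_y are the means of the x and y values; S_xx, S_yy and
    S_xy are the means of x^2, y^2 and x*y respectively.
    """
    n = len(x)
    Sx = sum(x) / n
    Sy = sum(y) / n
    Sxx = sum(v * v for v in x) / n
    Syy = sum(v * v for v in y) / n
    Sxy = sum(a * b for a, b in zip(x, y)) / n
    num = Sxy**2 - 2 * Sx * Sy * Sxy + Sxx * (Sy**2 - Syy) + Sx**2 * Syy
    return num / (Sx**2 - Sxx)
```

For data lying exactly on a line the estimate is zero (up to rounding), as it should be.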

We can expect the standard errors of estimation in the two approaches (OLS, which is not quite right, and MLE as described here) to differ. There are various ways to obtain ML standard errors: consult a reference. Because the log likelihood is relatively simple (especially when the Poisson$(\mu)$ distribution is approximated by a Normal$(\mu,\mu)$ distribution for large $\mu$), these standard errors can be computed in closed form if one desires.


As a worked example, I generated $12$ $X$ values from a Poisson$(100)$ distribution:

94,99,106,87,91,101,90,102,93,110,97,123

Then, setting $\beta_0=3$, $\beta_1=1/2$, and $\sigma=1$, I generated $12$ corresponding $Y$ values:

47.4662,53.5622,54.6656,45.3592,49.0347,53.8803,48.3437,54.2255,48.4506,58.6761,50.7423,63.9922
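For reference, data of this kind can be generated with a few lines of NumPy; the seed below is arbitrary, so these draws will not reproduce the values listed above:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

beta0, beta1, sigma, mu, n = 3.0, 0.5, 1.0, 100, 12
x = rng.poisson(mu, size=n)                             # shot-noise inputs, X ~ Poisson(mu)
y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)  # linear response + read-out noise
```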

The mean $X$ value equals $99.4167$, the estimate of $\mu$. The OLS results (which are identical to the MLEs of the coefficients) estimate $\beta_0$ as $1.24$ and $\beta_1$ as $0.514271$. It is no surprise that the estimate of the intercept, $\beta_0$, departs from its true value of $3$, because these $X$ values lie far from the origin. The estimate of the slope, $\beta_1$, is close to the true value of $0.5$.

The OLS estimate of $\sigma^2$, however, is $0.715$, less than the true value of $1$. The MLE of $\sigma^2$ works out to $0.999351$. (It is an accident that both estimates are low and that the MLE is greater than the OLS estimate.)
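As a check, the coefficient estimates can be reproduced in a few lines of Python from the moment formulas above (plain OLS written in terms of the $S$ quantities; no fitting library required):

```python
x = [94, 99, 106, 87, 91, 101, 90, 102, 93, 110, 97, 123]
y = [47.4662, 53.5622, 54.6656, 45.3592, 49.0347, 53.8803,
     48.3437, 54.2255, 48.4506, 58.6761, 50.7423, 63.9922]

n = len(x)
Sx  = sum(x) / n                       # mean X: estimate of mu
Sy  = sum(y) / n
Sxx = sum(v * v for v in x) / n
Syy = sum(v * v for v in y) / n
Sxy = sum(a * b for a, b in zip(x, y)) / n

b1 = (Sxy - Sx * Sy) / (Sxx - Sx**2)   # OLS slope (also the MLE)
b0 = Sy - b1 * Sx                      # OLS intercept (also the MLE)
s2 = (Sxy**2 - 2*Sx*Sy*Sxy + Sxx*(Sy**2 - Syy) + Sx**2*Syy) / (Sx**2 - Sxx)

print(f"mu_hat={Sx:.4f}  b1={b1:.6f}  b0={b0:.2f}  s2={s2:.6f}")
```

Running this gives $\hat\mu \approx 99.4167$, $\hat\beta_1 \approx 0.514271$, $\hat\beta_0 \approx 1.24$, and $s^2 \approx 0.9994$, matching the estimates quoted above.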

[Figure: scatterplot of the simulated $(X, Y)$ data with the fitted line]

The line is both the OLS fit and the maximum likelihood estimate for the joint Poisson-Normal probability model.