Solved – Consistency and Asymptotic Normality for MLE of Independent NON-identically distributed normals

asymptotics, consistency, likelihood-ratio, maximum-likelihood

I have the following setting:
$$x_k \sim N(\mu,\sigma^2 + \hat{\delta}^2_k),\qquad k=1,\dots,K,$$

where $\{x_k,\ k=1,\dots,K\}$ are the observed data, $\{\hat{\delta}^2_k,\ k=1,\dots,K \}$ are known parameters (just consider them fixed), while $(\mu,\sigma^2)$ are unknown. My primary goal is making inference about $\mu$. Does anyone know of any results for this particular case concerning consistency and asymptotic behavior of the MLE of $\mu$ as $K\rightarrow \infty$?

For the time being I am simply using a $\chi^2$ approximation for the likelihood ratio test, but I don't have a solid theoretical argument for that, since the data are not identically distributed. It also complicates things that I fail to get a closed-form solution for $\hat{\mu}_{mle}$.
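Even without a closed form, the MLE is easy to compute numerically. Below is a minimal sketch (the function names, the simulated $\hat{\delta}^2_k$ values, and the true parameters are all illustrative assumptions, not from the question) that maximizes the joint log-likelihood over $(\mu,\sigma^2)$:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, x, delta2):
    """Negative log-likelihood for x_k ~ N(mu, sigma2 + delta2_k)."""
    mu, sigma2 = params
    if sigma2 < 0:          # keep the optimizer inside the parameter space
        return np.inf
    var = sigma2 + delta2
    return 0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def fit_mle(x, delta2):
    """Numerically maximize the likelihood; no closed form exists here."""
    start = np.array([x.mean(), max(x.var() - delta2.mean(), 1e-6)])
    res = minimize(neg_log_lik, start, args=(x, delta2), method="Nelder-Mead")
    return res.x            # (mu_hat, sigma2_hat)

# Simulated example with bounded, "known" delta2_k
rng = np.random.default_rng(0)
K = 2000
delta2 = rng.uniform(0.1, 1.0, size=K)
x = rng.normal(2.0, np.sqrt(1.5 + delta2))   # true mu = 2.0, sigma2 = 1.5
mu_hat, sigma2_hat = fit_mle(x, delta2)
```

The same `neg_log_lik` can be reused for the likelihood-ratio test: evaluate it at the full MLE and at the profile MLE with $\mu$ fixed, and compare twice the difference to a $\chi^2_1$ quantile.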

I have tried Hoadley's paper "Asymptotic Properties of MLE for Independent Non-Identically distributed case" http://projecteuclid.org/download/pdf_1/euclid.aoms/1177693066, but verifying its general conditions has not come easily to me yet.

Please let me know if anyone has encountered this particular kind of problem and knows what regularity conditions are needed for the nice classical MLE properties. Obviously there should be some upper bound on the values $\{ \hat{\delta}^2_k \}$, but what kind of bound, and even with that bound, how does one prove consistency/asymptotic normality?

You can find much more accessible conditions for consistency and asymptotic normality of the MLE in Hayashi's Econometrics, ch. 7, in the general context of Extremum Estimators and their sub-class, the M-estimators. Hayashi also gives references for detailed proofs of the conditions.

The MLE with independent observations belongs to this subclass, because it maximizes a "sample average": an average of a real-valued function of the data and the unknown parameters (note that with independent observations the log-likelihood of the sample is a sum, and we can divide it by the sample size without affecting the maximizer).

So (in general notation)

$$\hat \theta_{MLE} = \text{argmax}_{\theta} \left\{\frac 1n \sum_{i=1}^n \ell_i(x_i;\theta)\right\}$$

where $\ell_i$ is the log-likelihood of observation $i$.
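For the model in the question, this per-observation log-likelihood is explicit (writing $k$ for the index, to match the question's notation):

$$\ell_k(x_k;\mu,\sigma^2) = -\frac{1}{2}\log\!\left(2\pi(\sigma^2+\hat{\delta}^2_k)\right) - \frac{(x_k-\mu)^2}{2(\sigma^2+\hat{\delta}^2_k)}.$$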

For consistency, there are two or three alternative sets of conditions. Common to all of them are:
1) The true parameter lies in the interior of the parameter space

2) $\ell_i(x_i;\theta)$ is measurable (if it is continuous, it is measurable)

3) The objective function $\frac 1n \sum_{i=1}^n \ell_i(x_i;\theta)$ converges in probability to some function, say $\ell_0(\theta)$

4) $\ell_0(\theta)$ is uniquely maximized at the true parameter vector (say $\theta_0$)
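Under correct specification, a standard way to see why condition 4 tends to hold is the Kullback–Leibler inequality applied observation by observation:

$$E_{\theta_0}\big[\ell_i(x_i;\theta)\big] \le E_{\theta_0}\big[\ell_i(x_i;\theta_0)\big] \quad \text{for every } i,$$

so the limit $\ell_0(\theta)$ of the averaged expectations is also maximized at $\theta_0$; uniqueness of the maximizer then requires a separate identification condition on the model.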

Then moreover:
1st Alternative: if the parameter space is compact and convergence is uniform, we obtain consistency.

2nd Alternative: if the parameter space is not compact but is convex, the log-likelihood is concave, and convergence is just pointwise, we again obtain consistency.
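As an informal check of consistency in the OP's model, one can watch the error of $\hat\mu_{MLE}$ shrink as $K$ grows. This is only a simulation sketch: the true parameters and the bounded $\hat{\delta}^2_k$ are made-up assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
mu0, s2_0 = 2.0, 1.5  # true parameters, chosen for illustration

def mu_mle(K):
    """MLE of mu from K observations x_k ~ N(mu0, s2_0 + delta2_k)."""
    delta2 = rng.uniform(0.1, 1.0, size=K)  # known variances, bounded above
    x = rng.normal(mu0, np.sqrt(s2_0 + delta2))

    def nll(p):  # negative *average* log-likelihood (dividing by K is harmless)
        mu, s2 = p
        if s2 <= 0:
            return np.inf
        v = s2 + delta2
        return 0.5 * np.mean(np.log(2 * np.pi * v) + (x - mu) ** 2 / v)

    return minimize(nll, [x.mean(), 1.0], method="Nelder-Mead").x[0]

errors = {K: abs(mu_mle(K) - mu0) for K in (100, 10_000)}
```

With the $\hat{\delta}^2_k$ bounded, the error at $K = 10{,}000$ is typically an order of magnitude smaller than at $K = 100$, in line with $\sqrt{K}$-consistency.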

I'll leave asymptotic normality for the OP to look up and explore.