Solved – Maximum likelihood estimation under heteroskedasticity (and relation to OLS)

dglm, gamlss, heteroscedasticity, least squares, maximum likelihood

I have a question about MLE and how it relates to OLS. I know how to relate OLS and MLE when the noise is normal and homoskedastic, and I can apply the same reasoning to heteroskedastic noise. My question: in the heteroskedastic case the noise terms are clearly no longer identically distributed (though still independent). So can we apply MLE even when the distributions are not identical?

To make it more concrete: if I take two samples, can I apply MLE when one sample's noise is normal and the other's is Laplace? If not, why does MLE work for heteroskedastic noise?

Is it because, when estimating the parameter vector, we can treat the different variances as known scaling factors of a common variance $\sigma^2$, which gives rise to the weights?
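To check this intuition numerically, here is a minimal sketch (NumPy only; the data, coefficients, and per-observation standard deviations are all invented for illustration). With known $\sigma_i$, maximizing the normal likelihood in $\beta$ reduces to weighted least squares with weights $1/\sigma_i^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: y = X beta + eps, eps_i ~ N(0, sigma_i^2), sigma_i known
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, n)])
beta_true = np.array([1.0, 2.0])
sigma = 0.5 + 0.3 * X[:, 1]               # known, observation-specific sd
y = X @ beta_true + rng.normal(0.0, sigma)

# With known sigma_i, maximizing the normal log-likelihood in beta is the
# same as minimizing sum_i (y_i - x_i' beta)^2 / sigma_i^2, i.e. weighted
# least squares with weights w_i = 1 / sigma_i^2.
w = 1.0 / sigma**2
beta_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
```

Equivalently, dividing each row of $X$ and each $y_i$ by $\sigma_i$ and running ordinary least squares on the transformed data gives the same estimate.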

Best Answer

There seem to be multiple questions here. To use maximum likelihood estimation (MLE) you need to be able to write down the likelihood function, that is, the joint density (or joint probability mass function) of the observations. There is no need for the distributions to be identical; you can see that from the use of likelihood methods in regression, where each observation has its own mean.
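A small sketch of that point (Python with scipy.stats; the data and coefficients are invented for illustration): in a normal-errors regression each $y_i$ has its own mean $\mu_i = a + b x_i$, so the observations are independent but not identically distributed, yet the log-likelihood is still just the sum of the individual log-densities:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Illustrative regression data: y_i ~ N(a + b * x_i, sigma^2)
x = rng.uniform(0.0, 10.0, 50)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 50)

def loglik(a, b, sigma):
    mu = a + b * x                 # a different mean for each observation
    # Independence: the joint log-density is the sum of the marginal
    # log-densities, even though the marginals are not identical.
    return np.sum(norm.logpdf(y, loc=mu, scale=sigma))
```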

But, in practice, likelihood methods are most useful when the distribution of the data can be written using a relatively small number of parameters that are common to all the observations. So a free, unique variance for each observation cannot be expected to work---that would give more parameters than observations. Still, there is no need to assume the same variance for all the observations. What you need is some way to describe how the variance varies, maybe as a function of the mean, maybe as a function of some known covariate.
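As a sketch of that idea (assuming, purely for illustration, a log-linear variance model $\log \sigma_i = \gamma_0 + \gamma_1 z_i$ with a known covariate $z$; all names and values here are invented), the mean and variance parameters can be estimated jointly by maximizing the normal log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Illustrative data: mean linear in z, log standard deviation also linear in z
n = 500
z = rng.uniform(0.0, 2.0, n)
X = np.column_stack([np.ones(n), z])
beta_true = np.array([1.0, 2.0])
gamma_true = np.array([-1.0, 0.8])
sigma_true = np.exp(gamma_true[0] + gamma_true[1] * z)
y = X @ beta_true + rng.normal(0.0, sigma_true)

def neg_loglik(theta):
    beta, gamma = theta[:2], theta[2:]
    mu = X @ beta
    log_sigma = gamma[0] + gamma[1] * z     # 2 variance parameters, not n
    # Negative normal log-likelihood up to an additive constant
    return np.sum(log_sigma + 0.5 * ((y - mu) / np.exp(log_sigma)) ** 2)

res = minimize(neg_loglik, x0=np.zeros(4), method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
beta_hat, gamma_hat = res.x[:2], res.x[2:]
```

Only four parameters describe all 500 observations, which is what makes the likelihood approach workable here.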

You could even, in principle, assume normal errors for some of the observations in a regression model and Laplace errors for others; there is no obstacle in principle. But it is difficult to think of a situation where that would be a natural way to model the data!
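For what it is worth, writing down such a likelihood is mechanical. Here is a sketch (all data invented; for simplicity both error families share a common scale parameter, which is an extra assumption) where the first half of the sample has normal errors and the second half Laplace errors:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

# Illustrative data: same regression line, two different error distributions
n = 400
x = rng.uniform(0.0, 5.0, n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 1.5])
is_normal = np.arange(n) < n // 2
eps = np.where(is_normal,
               rng.normal(0.0, 1.0, n),
               rng.laplace(0.0, 1.0, n))
y = X @ beta_true + eps

def neg_loglik(theta):
    beta, log_s = theta[:2], theta[2]
    r = y - X @ beta
    s = np.exp(log_s)                        # common scale (an assumption)
    ll_normal = -log_s - 0.5 * (r / s) ** 2  # normal log-density, const dropped
    ll_laplace = -log_s - np.abs(r) / s      # Laplace log-density, const dropped
    return -np.sum(np.where(is_normal, ll_normal, ll_laplace))

# Nelder-Mead, since the Laplace part makes the objective non-smooth
res = minimize(neg_loglik, x0=np.zeros(3), method="Nelder-Mead")
```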

If you are using R, there are packages that allow separate modeling of the expectation and the variance, among them dglm and gamlss. See for instance Simulate linear regression with heteroscedasticity and Is it possible to calculate variable confidence intervals, conditional on $\hat{Y}$, to address heteroscedasticity?