Solved – Linear regression: any non-normal distribution giving identity of OLS and MLE

least-squares, mathematical-statistics, maximum-likelihood, normal-distribution, regression

This question is inspired by the long discussion in the comments here: How does linear regression use the normal distribution?

In the usual linear regression model, for simplicity here written with only one predictor:
$$
Y_i = \beta_0 + \beta_1 x_i + \epsilon_i
$$
where the $x_i$ are known constants and the $\epsilon_i$ are zero-mean independent error terms. If, in addition, we assume normally distributed errors, then the usual least squares estimators and the maximum likelihood estimators of $\beta_0, \beta_1$ are identical.
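(For concreteness, here is a small numerical sketch of that last claim, fitting simulated data with numpy/scipy; the seed, sample size, and true parameter values are arbitrary illustration choices.)

```python
# Sketch: with normal errors, the Gaussian MLE and the OLS estimates of
# (beta0, beta1) coincide, up to optimizer tolerance.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=200)
y = 1.0 + 2.5 * x + rng.normal(scale=0.7, size=200)

# OLS via least squares on the design matrix [1, x]
X = np.column_stack([np.ones_like(x), x])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Gaussian MLE: minimize the negative log-likelihood over (beta0, beta1, log sigma)
def negloglik(theta):
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)
    resid = y - b0 - b1 * x
    return 0.5 * np.sum(resid**2) / sigma**2 + len(y) * log_sigma

beta_mle = minimize(negloglik, x0=[0.0, 0.0, 0.0]).x[:2]

print(beta_ols, beta_mle)  # the two slope/intercept estimates agree
```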

So my easy question: does there exist any other distribution for the error terms such that the MLE is identical to the ordinary least squares estimator? One implication is easy to show; the other, not so much.

Best Answer

In maximum likelihood estimation, we calculate

$$\hat \beta_{ML}: \sum \frac {\partial \ln f(\epsilon_i)}{\partial \beta} = \mathbf 0 \implies \sum \frac {f'(\epsilon_i)}{f(\epsilon_i)}\mathbf x_i = \mathbf 0$$

the last relation taking into account the linearity structure of the regression equation.
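In more detail: writing $\mathbf x_i = (1, x_i)^\top$ and $\epsilon_i = Y_i - \mathbf x_i^\top \boldsymbol\beta$, the chain rule gives

$$\frac {\partial \ln f(\epsilon_i)}{\partial \boldsymbol\beta} = \frac {f'(\epsilon_i)}{f(\epsilon_i)}\,\frac{\partial \epsilon_i}{\partial \boldsymbol\beta} = -\,\frac {f'(\epsilon_i)}{f(\epsilon_i)}\,\mathbf x_i,$$

and the overall minus sign is immaterial once the sum is set equal to $\mathbf 0$.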

In comparison, the OLS estimator satisfies

$$\sum \epsilon_i\mathbf x_i = \mathbf 0$$

In order to obtain identical algebraic expressions for the slope coefficients we need to have a density for the error term such that

$$\frac {f'(\epsilon_i)}{f(\epsilon_i)} = \pm \;c\epsilon_i \implies f'(\epsilon_i)= \pm \;c\epsilon_if(\epsilon_i)$$

These are separable differential equations of the form $y' = \pm\, c\,xy$, with solution

$$\int \frac 1 {y}\,dy = \pm\, c\int x\, dx\implies \ln y = \pm\,\frac 12 c x^2 + \text{const}$$

$$ \implies y = f(\epsilon) \propto \exp\left \{\pm\,\frac 12 c\epsilon^2\right\}$$
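As a check, the minus sign with $c = 1/\sigma^2$ recovers the familiar normal case:

$$f(\epsilon) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{\epsilon^2}{2\sigma^2}\right\} \implies \frac{f'(\epsilon)}{f(\epsilon)} = -\frac{\epsilon}{\sigma^2} = -c\,\epsilon.$$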

Any function that has this kernel and integrates to unity over an appropriate domain will make the MLE and the OLS estimator of the slope coefficients identical. Namely, we are looking for

$$g(x)= A\exp\left \{\pm\;\frac 12 cx^2\right\} : \int_a^b g(x)dx =1$$

Is there such a $g$ that is not the normal density (or the half-normal or the derivative of the error function)?

Certainly. But there is one more thing to consider: if one uses the plus sign in the exponent together with, say, a symmetric support around zero, one obtains a density with a unique minimum in the middle and two local maxima at the boundaries of the support.
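For illustration, here is a minimal numerical sketch (the values of $c$ and of the support $[-a, a]$ are arbitrary choices) confirming that the plus-sign kernel can indeed be normalized on a bounded symmetric support, and that the resulting density is U-shaped rather than bell-shaped:

```python
# Sketch: with the plus sign, g(x) = A * exp(c * x**2 / 2) on a bounded,
# symmetric support [-a, a] can be normalized to a proper density,
# but it has its minimum at 0 and its maxima at the endpoints.
import numpy as np
from scipy.integrate import quad

c, a = 1.0, 2.0
kernel = lambda x: np.exp(0.5 * c * x**2)

A = 1.0 / quad(kernel, -a, a)[0]   # normalizing constant
g = lambda x: A * kernel(x)

print(quad(g, -a, a)[0])           # ~1.0: a valid density
print(g(0.0), g(a))                # g(0) < g(a): antimode at 0, modes at +/- a
```

So such densities do exist as valid error distributions, but they place most of their mass near the edges of the support rather than around zero.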
