In maximum likelihood estimation, we calculate
$$\hat \beta_{ML}: \sum \frac {\partial \ln f(\epsilon_i)}{\partial \beta} = \mathbf 0 \implies \sum \frac {f'(\epsilon_i)}{f(\epsilon_i)}\mathbf x_i = \mathbf 0$$
the last relation taking into account the linearity structure of the regression equation.
In comparison, the OLS estimator satisfies
$$\sum \epsilon_i\mathbf x_i = \mathbf 0$$
In order to obtain identical algebraic expressions for the slope coefficients we need to have a density for the error term such that
$$\frac {f'(\epsilon_i)}{f(\epsilon_i)} = \pm \;c\epsilon_i \implies f'(\epsilon_i)= \pm \;c\epsilon_if(\epsilon_i)$$
These are differential equations of the form $y' = \pm\; cxy$ that have solutions
$$\int \frac 1 {y}dy = \pm\; c\int x dx\implies \ln y = \pm\;\frac 12 cx^2$$
$$ \implies y = f(\epsilon) = \exp\left \{\pm\;\frac 12 c\epsilon^2\right\}$$
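As a quick numerical check of the argument so far, here is a minimal sketch using only the Python standard library. The simulated data, the value `c = 0.7`, and the helper names `log_f` and `score_ratio` are illustrative assumptions, not part of the original derivation. It fits OLS in closed form and then evaluates the ML first-order condition for the minus-sign kernel at the OLS residuals.

```python
import random

random.seed(0)

# Hypothetical simulated data: y = 1 + 2*x + Gaussian noise.
n = 200
x = [random.uniform(-3, 3) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 1.5) for xi in x]

# Closed-form OLS for an intercept and a single slope.
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def log_f(e, c):
    """Log of the kernel exp(-c e^2 / 2); the constant A cancels in f'/f."""
    return -0.5 * c * e * e


def score_ratio(e, c, h=1e-5):
    """Central-difference approximation of f'(e)/f(e) = d/de ln f(e)."""
    return (log_f(e + h, c) - log_f(e - h, c)) / (2 * h)


c = 0.7  # assumed positive constant; any c > 0 gives the same conclusion

# OLS normal equations: sum(e_i) and sum(e_i * x_i), one per regressor.
ols_moments = (sum(resid), sum(e * xi for e, xi in zip(resid, x)))

# ML first-order conditions evaluated at the OLS residuals.
ml_moments = (
    sum(score_ratio(e, c) for e in resid),
    sum(score_ratio(e, c) * xi for e, xi in zip(resid, x)),
)

print("OLS moments:", ols_moments)
print("ML moments at OLS estimate:", ml_moments)
```

Both pairs of moments should be zero up to floating-point error, because for this kernel $f'(\epsilon)/f(\epsilon) = -c\epsilon$ and the ML condition is just the OLS condition multiplied by $-c$.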
Any function that has this kernel and integrates to unity over an appropriate domain will make the MLE and OLS for the slope coefficients identical. Namely, we are looking for
$$g(x)= A\exp\left \{\pm\;\frac 12 cx^2\right\} : \int_a^b g(x)dx =1$$
Is there such a $g$ that is not the normal density (or the half-normal or the derivative of the error function)?
Certainly. One more thing to consider, though: with the plus sign in the exponent and, for example, a support symmetric around zero, the density has a unique minimum in the middle and two local maxima at the boundaries of the support.
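To make the plus-sign case concrete, the following sketch normalizes $g(x) = A\exp\{+\frac 12 cx^2\}$ numerically on a symmetric support and inspects its shape. The support $[-1, 1]$, the value $c = 1$, and the hand-written `simpson` integrator are assumptions chosen for illustration only.

```python
import math


def kernel(x, c=1.0, sign=+1):
    """Unnormalized kernel exp(sign * c * x^2 / 2)."""
    return math.exp(sign * 0.5 * c * x * x)


def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


a, b, c = -1.0, 1.0, 1.0  # assumed support and constant

# Normalizing constant A so that g integrates to one over [a, b].
A = 1.0 / simpson(lambda t: kernel(t, c, +1), a, b)


def g(x):
    """Plus-sign density on [a, b]."""
    return A * kernel(x, c, +1)


print("A =", A)
print("integral of g =", simpson(g, a, b))
print("g(-1), g(0), g(1) =", g(-1.0), g(0.0), g(1.0))
```

The printed values should show that `g` is smallest at zero and largest at the two endpoints, which is the U-shaped density described above.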
In practice the difference is huge. The exogeneity assumption that you refer to requires that the errors are not correlated with the regressors. If they are correlated, you cannot rely on regression with stochastic regressors.
For instance, in observational studies, such as pretty much all of economics, you do not control the regressors. You cannot set US GDP to a desired level; you can only observe it. Hence, in a model where GDP is a regressor, you want the errors to be independent of GDP, because in this model you can only assume stochastic regressors.
When your errors are correlated with the regressors, you get an endogeneity issue. There are ways to handle it, such as using lagged regressors or instrumental variables.
In econometrics, a textbook example is the impact of price on demand in the typical demand-supply system of equations. The problem is that price also depends on supply, so it is not exogenous. Hence there is an endogeneity issue, which any econometrician will promptly point out. This answers your question about the feasibility of testing the assumption.
Once you have established that endogeneity is present, you may look for a so-called instrumental variable. These are variables that are correlated with the price but do not affect demand directly, i.e. something that may impact the supply, for instance. If the demand is for oranges, then maybe the temperature in Florida in spring would be a suitable instrument, because it is going to impact the supply of oranges, and hence the price, but not the demand. So you plug this instrument into the regression and tease out the impact of the price on demand.
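The orange example can be sketched as a small simulation. Everything numeric here is an assumption made for illustration: the linear demand and supply equations, their coefficients, the unit-variance shocks, and the use of the simple one-instrument IV estimator $\operatorname{cov}(z,q)/\operatorname{cov}(z,p)$. It is not a description of any real market.

```python
import random

random.seed(1)

n = 20000
true_slope = -1.0  # assumed effect of price on quantity demanded

z, p, q = [], [], []
for _ in range(n):
    zi = random.gauss(0.0, 1.0)  # instrument, e.g. spring temperature
    u = random.gauss(0.0, 1.0)   # demand shock (the regression error)
    v = random.gauss(0.0, 1.0)   # supply shock
    # Assumed demand:  q = 10 + true_slope * p + u
    # Assumed supply:  q = 2 + 1.0 * p + 1.5 * z + v
    # The market-clearing price solves demand = supply, so it depends on u.
    pi = (10.0 - 2.0 - 1.5 * zi + u - v) / (1.0 - true_slope)
    qi = 10.0 + true_slope * pi + u
    z.append(zi)
    p.append(pi)
    q.append(qi)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


ols_slope = cov(p, q) / cov(p, p)  # biased: price is correlated with u
iv_slope = cov(z, q) / cov(z, p)   # uses only price variation driven by z

print("true slope:", true_slope)
print("OLS slope: ", ols_slope)
print("IV slope:  ", iv_slope)
```

Under these assumptions the OLS slope should sit noticeably above the true value, because the equilibrium price is positively correlated with the demand shock, while the IV slope should land close to it.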
You can look at the assumption as if each $\epsilon_i$ comes from an independent Gaussian density with mean zero and variance $\sigma_i^2 > 0$. That is $\epsilon_i \sim N(0,\sigma_i^2)$, for $i = 1,\dots,n$.
But you can also look at it as if the random vector $\epsilon = [\epsilon_1 \dots \epsilon_n]^T$ follows a multivariate Gaussian distribution with the zero vector as its mean and a diagonal covariance matrix $D$ whose diagonal elements are $\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2$. That is $\epsilon \sim MN(0,D)$.
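The two views give the same joint density, which can be checked numerically. In this sketch the standard deviations in `sigmas` and the function names are hypothetical. It compares the sum of the independent univariate Gaussian log-densities with the multivariate Gaussian log-density for a diagonal covariance matrix.

```python
import math
import random

random.seed(2)

sigmas = [0.5, 1.0, 2.0, 3.5]  # assumed standard deviations sigma_i
eps = [random.gauss(0.0, s) for s in sigmas]  # one draw of the error vector


def univariate_logpdf(e, s):
    """Log-density of N(0, s^2) evaluated at e."""
    return -0.5 * math.log(2 * math.pi * s * s) - e * e / (2 * s * s)


def diag_mvn_logpdf(e_vec, variances):
    """Log-density of MN(0, D) with D = diag(variances), evaluated at e_vec."""
    n = len(e_vec)
    log_det = sum(math.log(v) for v in variances)            # ln det D
    quad = sum(e * e / v for e, v in zip(e_vec, variances))  # e' D^{-1} e
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


sum_of_univariate = sum(univariate_logpdf(e, s) for e, s in zip(eps, sigmas))
joint = diag_mvn_logpdf(eps, [s * s for s in sigmas])

print("sum of univariate log-densities:", sum_of_univariate)
print("multivariate log-density:       ", joint)
```

The two numbers should agree up to rounding, since for a diagonal $D$ the determinant is the product of the $\sigma_i^2$ and the quadratic form splits into a sum over $i$.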