Maximum Likelihood Estimation (MLE) in Layman Terms

definition, intuition, mathematical-statistics, maximum-likelihood, philosophical

Could anyone explain maximum likelihood estimation (MLE) to me in detail, in layman's terms? I would like to understand the underlying concept before going into the mathematical derivations and equations.

Best Answer

Say you have some data. Say you're willing to assume that the data comes from some distribution -- perhaps Gaussian. There are infinitely many different Gaussians that the data could have come from (one for each combination of mean and variance that a Gaussian distribution can have). MLE will pick the Gaussian (i.e., the mean and variance) that is "most consistent" with your data (the precise meaning of "consistent" is explained below).
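If a single formula helps pin that down (skip it if not): assuming the observations $y_1, \dots, y_n$ are independent draws from a Gaussian, MLE chooses the mean $\mu$ and variance $\sigma^2$ that maximize the likelihood

$$
L(\mu, \sigma^2) \;=\; \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right),
$$

i.e., the joint density of the data you actually saw, viewed as a function of the parameters.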

So, say you've got a data set of $y = \{-1, 3, 7\}$. The most consistent Gaussian from which that data could have come has a mean of 3 (the sample average) and a variance of $32/3 \approx 10.7$ (the average squared deviation from that mean). It could have been sampled from some other Gaussian. But one with a mean of 3 and a variance of $32/3$ is most consistent with the data in the following sense: the probability of getting the particular $y$ values you observed is greater with this choice of mean and variance than with any other choice.
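If you want to check that numerically, here is a minimal sketch (using NumPy/SciPy, which is just one convenient choice, not part of the example itself) that maximizes the Gaussian log-likelihood for these three points; the optimum lands at a mean of 3 and a variance of $32/3$.

```python
import numpy as np
from scipy.optimize import minimize

y = np.array([-1.0, 3.0, 7.0])

# Negative Gaussian log-likelihood in (mean, log-variance);
# optimizing the log of the variance keeps the variance positive.
def neg_log_lik(params):
    mu, log_var = params
    var = np.exp(log_var)
    return 0.5 * np.sum(np.log(2 * np.pi * var) + (y - mu) ** 2 / var)

fit = minimize(neg_log_lik, x0=[0.0, 0.0])
mu_hat, var_hat = fit.x[0], np.exp(fit.x[1])
print(mu_hat, var_hat)  # roughly 3.0 and 10.67 (= 32/3)
```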

Moving to regression: instead of the mean being a constant, the mean is a linear function of the predictors, as specified by the regression equation. So, say you've got data like $x = \{ 2,4,10 \}$ along with the $y$ from before. The mean of that Gaussian is now the fitted regression model $X'\hat\beta$, where $\hat\beta \approx [-1.92, 0.92]$.
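A quick sketch of that fit (again purely illustrative): under the Gaussian assumption the least-squares coefficients coincide with the maximum likelihood coefficients, so ordinary least squares recovers the $\hat\beta$ above.

```python
import numpy as np

x = np.array([2.0, 4.0, 10.0])
y = np.array([-1.0, 3.0, 7.0])

# Design matrix with an intercept column; with Gaussian errors,
# the least-squares solution is also the MLE of beta.
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)      # approximately [-1.92, 0.92]
print(X @ beta_hat)  # the fitted mean for each observation
```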

Moving to GLMs: replace the Gaussian with some other distribution (from the exponential family). The mean is still driven by a linear function of the predictors, as specified by the regression equation, but now transformed through the inverse of the link function. So it's $g(X'\beta)$, where $g(x) = e^x/(1+e^x)$ is the inverse of the logit link (used with binomial data).
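To make that concrete, here is a small sketch of a logistic (binomial) GLM fitted by maximizing the Bernoulli log-likelihood directly; the 0/1 outcomes below are made up purely for illustration and are not part of the example above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # expit(z) = e^z / (1 + e^z)

# Hypothetical binary outcomes, chosen only to show the mechanics.
x = np.array([1.0, 2.0, 4.0, 6.0, 10.0])
y = np.array([0, 1, 0, 1, 1])
X = np.column_stack([np.ones_like(x), x])

# Negative Bernoulli log-likelihood: the mean of y_i is expit(x_i' beta).
def neg_log_lik(beta):
    p = np.clip(expit(X @ beta), 1e-12, 1 - 1e-12)  # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

beta_hat = minimize(neg_log_lik, x0=np.zeros(2)).x
print(beta_hat)             # MLE of the coefficients
print(expit(X @ beta_hat))  # fitted means, each between 0 and 1
```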