Solved – How do we derive the OLS estimate of the variance


If we have a linear regression equation $y = X\beta + u$, then we can find the OLS estimate of $\beta$ by minimizing the sum of squared residuals with respect to $\hat\beta$: $\hat u^T \hat u = (y - X\hat\beta)^T (y - X\hat\beta)$.

However, my textbook suddenly says, out of nowhere, that the OLS estimate of the variance $\sigma^2$ of $u$ (each $u_i$ is iid) is $\hat \sigma ^2 = \frac {\hat u^T \hat u}{n-K}$, where $n$ is the sample size and $K$ is the number of independent variables.
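
For concreteness, here is a small numpy sketch (purely synthetic data of my own, not from the textbook) of the two quantities involved:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: n observations, K regressors (including the intercept)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
sigma = 1.5
y = X @ beta + rng.normal(scale=sigma, size=n)

# OLS estimate of beta: minimizes the residual sum of squares
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals and the textbook's degrees-of-freedom-corrected variance estimate
u_hat = y - X @ beta_hat
sigma2_hat = (u_hat @ u_hat) / (n - K)
print(sigma2_hat)  # should be close to sigma**2 = 2.25
```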

I understand that this estimator is unbiased, but I have absolutely no idea how it is derived from the OLS principle, or why it is called the OLS estimate of $\sigma^2$.

How do we derive this estimator?

Best Answer

The estimator for the variance commonly used in regression does not come from the least squares principle, which only produces an estimate for $\boldsymbol{\beta}$. It is just a bias-corrected version (by the factor $\frac{n}{n-K}$) of the empirical variance

$$\widehat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n} \sum_{i=1}^n \left(y_i - \mathbf{x}_i^{T} \widehat{\boldsymbol{\beta}} \right)^2,$$ which in turn is the maximum likelihood estimator for $\sigma^2$ under the assumption of normally distributed errors. Since $\sum_{i=1}^n (y_i - \mathbf{x}_i^T \widehat{\boldsymbol{\beta}})^2 = \widehat{u}^T \widehat{u}$, multiplying by $\frac{n}{n-K}$ gives exactly your textbook's $\widehat{\sigma}^2 = \frac{\widehat{u}^T \widehat{u}}{n-K}$. It is confusing that many people call this the OLS estimator of the variance.
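
For completeness, here is the standard argument for why the correction factor is exactly $\frac{n}{n-K}$. Write $M = I_n - X(X^T X)^{-1} X^T$ for the residual-maker matrix, so that $\widehat{u} = My = Mu$ because $MX = 0$. Since $M$ is symmetric and idempotent with $\operatorname{tr}(M) = n - K$,

$$E\left[\widehat{u}^T \widehat{u}\right] = E\left[u^T M u\right] = E\left[\operatorname{tr}\left(M u u^T\right)\right] = \sigma^2 \operatorname{tr}(M) = (n-K)\,\sigma^2,$$

so dividing $\widehat{u}^T \widehat{u}$ by $n-K$ rather than $n$ is precisely what makes the estimator unbiased.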