One example that comes to mind is a GLS estimator that weights observations unequally even though that is not necessary when the Gauss-Markov assumptions are met (which the statistician may not know to be the case, and hence may still apply GLS).
Consider the case of a regression of $y_i$, $i=1,\ldots,n$, on a constant for illustration (the argument readily generalizes to other GLS estimators). Here, $\{y_i\}$ is assumed to be a random sample from a population with mean $\mu$ and variance $\sigma^2$.
Then, we know that OLS is just $\hat\beta=\bar y$, the sample mean. To emphasize that each observation receives weight $1/n$, write this as
$$
\hat\beta=\sum_{i=1}^n\frac{1}{n}y_i.
$$
It is well-known that $Var(\hat\beta)=\sigma^2/n$.
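As a quick sanity check (just a sketch, with arbitrary illustrative values for $\mu$, $\sigma$ and $n$ that are not used elsewhere in this post), one can verify this variance formula by simulation:

# quick check that Var(ybar) is about sigma^2/n (illustrative values only)
set.seed(1)
n <- 50; mu <- 0; sigma <- 2
ybar <- replicate(1e4, mean(rnorm(n, mean=mu, sd=sigma)))
var(ybar)      # should be close to sigma^2/n = 0.08
sigma^2/n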
Now, consider another estimator which can be written as
$$
\tilde\beta=\sum_{i=1}^nw_iy_i,
$$
where the weights are such that $\sum_iw_i=1$. This ensures that the estimator is unbiased, as
$$
E\left(\sum_{i=1}^nw_iy_i\right)=\sum_{i=1}^nw_iE(y_i)=\sum_{i=1}^nw_i\mu=\mu.
$$
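To illustrate (again only a sketch, with arbitrary weights and an arbitrary $\mu$), a small simulation shows that any weights summing to one indeed give an unbiased estimator:

# any weights that sum to one yield an unbiased estimator of mu (illustrative values)
set.seed(2)
n <- 100; mu <- 3
w <- runif(n); w <- w/sum(w)                      # arbitrary weights normalized to sum to one
est <- replicate(1e4, sum(w * rnorm(n, mean=mu)))
mean(est)                                         # close to mu = 3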
Its variance, which by independence of the $y_i$ equals $\sum_iw_i^2\sigma^2$, will exceed that of OLS unless $w_i=1/n$ for all $i$ (in which case the estimator of course reduces to OLS). This can for instance be shown via a Lagrangian:
\begin{align*}
L&=V(\tilde\beta)-\lambda\left(\sum_iw_i-1\right)\\
&=\sum_iw_i^2\sigma^2-\lambda\left(\sum_iw_i-1\right),
\end{align*}
Setting the partial derivatives w.r.t. $w_i$ to zero gives the first-order conditions $2\sigma^2w_i-\lambda=0$ for all $i$, while $\partial L/\partial\lambda=0$ gives $\sum_iw_i-1=0$. Solving the first set of conditions for $\lambda$ and equating them across $i$ yields $w_i=w_j$ for all $i,j$, so that, by the requirement that the weights sum to one, $w_i=1/n$ minimizes the variance.
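The result is also easy to check numerically (again just a sketch): among weights summing to one, the equal weights $1/n$ give the smallest value of $\sum_iw_i^2$ and hence of the variance $\sigma^2\sum_iw_i^2$.

# equal weights minimize sum(w^2) among weights summing to one (illustrative n)
n <- 10
sum(rep(1/n, n)^2)             # equals 1/n = 0.1
w <- runif(n); w <- w/sum(w)   # some other weights summing to one
sum(w^2)                       # larger than 1/n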
Here is a graphical illustration from a little simulation, created with the code below:
EDIT: In response to @kjetilbhalvorsen's and @RichardHardy's suggestions, I also include in the plot the median of the $y_i$, the MLE of the location parameter of a t(4) distribution (I get warnings In log(s) : NaNs produced that I did not check further), and Huber's estimator.
We observe that all estimators seem to be unbiased. However, the estimator that uses weights $w_i=(1\pm\epsilon)/n$ for either half of the sample is more variable, as are the median, the MLE of the t-distribution, and Huber's estimator (the latter only slightly so, see also here).
That the latter three are outperformed by the OLS solution is not immediately implied by the BLUE property (at least not to me), as it is not obvious whether they are linear estimators (nor do I know whether the MLE and Huber's estimator are unbiased).
library(MASS)    # for fitdistr() and huber()

n <- 100
reps <- 1e6
epsilon <- 0.5

# overweight the first half of the sample and underweight the second half
w <- c(rep((1+epsilon)/n,n/2),rep((1-epsilon)/n,n/2))

ols <- weightedestimator <- lad <- mle.t4 <- huberest <- rep(NA,reps)

for (i in 1:reps)
{
  y <- rnorm(n)
  ols[i] <- mean(y)                               # OLS on a constant = sample mean
  weightedestimator[i] <- crossprod(w,y)          # unequal weights summing to one
  lad[i] <- median(y)
  mle.t4[i] <- fitdistr(y, "t", df=4)$estimate[1] # MLE of the location of a t(4)
  huberest[i] <- huber(y)$mu                      # Huber's robust location estimate
}

plot(density(ols), col="purple", lwd=3, main="Kernel-estimate of density of OLS and other estimators",xlab="")
lines(density(weightedestimator), col="lightblue2", lwd=3)
lines(density(lad), col="salmon", lwd=3)
lines(density(mle.t4), col="green", lwd=3)
lines(density(huberest), col="#949413", lwd=3)
abline(v=0,lty=2)
legend('topright', c("OLS","weighted","median", "MLE t, 4 df", "Huber"), col=c("purple","lightblue2","salmon","green", "#949413"), lwd=3)
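To put numbers on the plot, one could also compare the empirical standard deviations of the simulated estimators (not part of the original figure, but it only uses the vectors created above); OLS should come out smallest, at roughly $1/\sqrt{n}=0.1$:

# empirical standard deviations of the simulated estimators
sapply(list(OLS=ols, weighted=weightedestimator, median=lad,
            "MLE t(4)"=mle.t4, Huber=huberest), sd)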
Best Answer
When the conditions for linear regression are met, the OLS estimator is BLUE. The B in BLUE stands for best, and in this context best means the linear unbiased estimator with the lowest variance.
If the regression conditions are not met, for instance if heteroskedasticity is present, then the OLS estimator is still unbiased, but it is no longer best. Instead, a variation called generalized least squares (GLS) will be BLUE.
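As a rough illustration of that last point (a sketch only, with a made-up heteroskedastic design rather than anything from the question), one can compare OLS with weighted least squares, a special case of GLS with known error variances, in a simple regression:

# OLS vs. WLS (a special case of GLS) under heteroskedasticity with known variances
set.seed(3)
n <- 200
x <- runif(n)
sds <- 0.5 + 2*x                                  # error sd increases with x
sim <- replicate(5000, {
  y <- 1 + 2*x + rnorm(n, sd=sds)
  c(ols = coef(lm(y ~ x))[2],                     # ordinary least squares slope
    wls = coef(lm(y ~ x, weights = 1/sds^2))[2])  # weights = inverse error variances
})
apply(sim, 1, sd)                                 # the WLS slope is less variable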