Solved – Weighted Least squares, why not use $\frac{1}{e_i^2}$ as weights

heteroscedasticityregressionweighted-regression

In Applied Linear Statistical Models (Kutner et al) an example on weighted least squares is given.

Dataset: Age versus Diastolic Blood pressure

Regressing DBP on age results in the following model
$$\hat Y = 56.157 + 0.58003 X$$

But a residual plots show the megaphone-alike shape which would imply heteroscedacity.

Residual plot

Since the absolute values of the residuals show a linear trend versus the age predictor OLS was performed on these residuals versus age, which resulted in
$$\hat s = -1.54946 + 0.198172 X$$

Using these fitted values as weights $$w_i = \dfrac{1}{\hat s_i^2}$$ resulted in a new model:
$$\hat Y = 55.566 + 0.59634 X$$.

Some questions remain, first denote the model as:

$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$$

Why not use the residuals as estimator of the weights?

Since $w_i$ is defined as $\dfrac{1}{\sigma_i^2}$ where $\epsilon_i \stackrel{\text{d}}{=} N(0,\sigma^2_i)$ and
$$\text{Var}(\epsilon_i) = \sigma^2_i = E(\epsilon_i^2) – E(\epsilon_i)^2 = E(\epsilon_i^2) $$
Now since $e_i = (\hat Y_i – Y_i)$ is an estimator of $\epsilon_i$ why not use $e_i^2$ as an estimator of $\sigma^2_i$? Which would imply $w_i = \dfrac{1}{e^2_i}$.

I don't really understand why we must regress the residuals once more to find the weights.

How much does Weighted Least Squares fix?

I performed the analysis above in R and plotted a new residual plot of the weighted least squares fit, which results in:
Second residual plot

This doesn't seem much better though?

R-code

colnames(Blood_Pressure_Example) <- c("Age", "DBP");
attach(Blood_Pressure_Example)

plot(DBP~Age)
fit <- lm(DBP~Age)
abline(fit)
summary(fit)

plot(residuals.lm(fit)~Age)
plot(abs(residuals.lm(fit))~Age)

## Like in the book
fit.res <- lm(abs(residuals.lm(fit))~Age)
wii <- 1/predict.lm(fit.res)^2

fit.end <- lm(DBP~Age, weights = as.vector(wii))
summary(fit.end)
plot(residuals.lm(fit.end)~Age)

## why not use the residuals as weights?
residuals.lm(fit)
wii.test <- 1/(residuals.lm(fit)^2)
wii.test

fit.test <- lm(DBP~Age, weights=as.vector(wii.test))
summary(fit.test)
plot(residuals.lm(fit.test)~Age)  

Best Answer

It would be too noisy to estimate weights as squared residuals. Consider this: you're estimating n weights using n observations. It's one observation per parameter. It's like estimating the variance of the population having a sample size one.

So, instead you observe that the variance seems to increase linearly with X, then you regress on X. You end up estimating just two coefficients from n observations. You get much more reliable estimate of the weight this way.

On the residuals looking the same after weighted least squares: that's not surprising given that the estimated weights are not that different.

Related Question