Solved – Intuition for “weights” in simple linear regression

intuition, regression, weighted-regression

Suppose we have data $\{x_i,y_i\}_{i=1}^n$ where $x_i \in \mathbb{R}$ and $y_i \in \mathbb{R}$ and we model
$$
y_i=\beta x_i + \varepsilon_i
$$
The ordinary least squares estimate of $\beta$ is
$$
\widehat \beta = \sum_{i=1}^n w_i y_i
$$
where $w_i={x_i}/{\sum_{j=1}^nx_j^2}$ can be viewed as "weights" on each $y_i$.
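As a quick numerical check of the formula above, here is a minimal Python sketch. NumPy and the made-up data are assumptions for illustration, not part of the original question. It also computes the same estimate as a weighted average of the per-point slopes $y_i/x_i$ with weights proportional to $x_i^2$, which is an algebraically equivalent way to read the weights when no $x_i$ is zero.

```python
import numpy as np

# Made-up illustrative data (an assumption, not from the question).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Weights w_i = x_i / sum_j x_j^2, as defined above.
w = x / np.sum(x**2)
beta_hat = np.sum(w * y)

# Reference value: least-squares fit of y = beta * x (no intercept).
beta_lstsq = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

# Equivalent view: average of the per-point slopes y_i / x_i,
# weighted by x_i^2 (requires every x_i to be nonzero).
beta_avg = np.average(y / x, weights=x**2)

print(beta_hat, beta_lstsq, beta_avg)
```

All three printed values should agree up to floating-point rounding.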

I've been thinking about what these "weights" mean and why they make sense. They put more weight on the $y_i$ with larger values of $x_i$, and I don't quite understand why that is reasonable.

Can someone help me with the intuition for why the "weights" on $y_i$ make sense? Thanks.

Note: I'm not interested in the derivation of $\widehat \beta$ or in the fact that these weights happen to minimize the sum of squared errors. I'm interested in the intuition, i.e., how you would explain this to a layman without math.

Best Answer

The problem of weights in regression is a really vast topic.

The traditional problem is to minimize $$SSQ=\sum_{i=1}^n \Big(y_i^{(calc)}-y_i^{(exp)}\Big)^2$$ and, as you know, this gives a large influence to the largest values of the $y_i^{(exp)}$. This corresponds to the sum of squares of the absolute errors of the $y$'s $(w_i=1)$.

If instead you consider $$SSQ=\sum_{i=1}^n \Big(\frac{y_i^{(calc)}-y_i^{(exp)}}{y_i^{(exp)}}\Big)^2$$ then this corresponds to the sum of squares of the relative errors of the $y$'s $\big(w_i=1/(y_i^{(exp)})^2\big)$.
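To make the two weighting choices concrete, here is a minimal sketch. It assumes the no-intercept model $y=\beta x$ from the question, NumPy, and made-up data; the helper name `weighted_slope` is hypothetical. Minimizing $\sum_i w_i(\beta x_i-y_i)^2$ over $\beta$ gives $\beta=\sum_i w_i x_i y_i/\sum_i w_i x_i^2$.

```python
import numpy as np

def weighted_slope(x, y, w):
    """Return the beta minimizing sum_i w_i * (beta * x_i - y_i)**2."""
    return np.sum(w * x * y) / np.sum(w * x**2)

# Made-up illustrative data (an assumption).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 1.9, 3.3, 3.8])

# Absolute errors: w_i = 1 (ordinary least squares).
beta_abs = weighted_slope(x, y, np.ones_like(y))

# Relative errors: w_i = 1 / y_i^2.
beta_rel = weighted_slope(x, y, 1.0 / y**2)

print(beta_abs, beta_rel)
```

The two estimates differ because the relative-error criterion reduces the pull of the points with large $y_i$.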

But there is another situation where the weights can be important. Suppose that the model is $y=Ae^{Bx}$, which is nonlinear. You can linearize it by taking logarithms, $\log(y)=\alpha+\beta x$ with $\alpha=\log(A)$ and $\beta=B$, but ordinary least squares on the transformed data can lead to very different results compared to nonlinear regression, since the transform gives greater weight to small $y$ values. For this very specific case, it has been found that minimizing $$SSQ=\sum_{i=1}^n y_i\Big(\alpha+\beta x_i-\log(y_i)\Big)^2$$ gives quite acceptable results.
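A minimal sketch of the two log-linear fits, assuming NumPy and synthetic data with additive noise (the true values, the noise level, and the helper name `fit_log_linear` are all assumptions for illustration). Note that `np.polyfit` applies its `w` argument to the unsquared residual, so the square root of the desired weight is passed.

```python
import numpy as np

# Synthetic data from y = A * exp(B * x) with additive noise (assumed setup).
rng = np.random.default_rng(0)
A_true, B_true = 2.0, 0.5
x = np.linspace(0.0, 5.0, 30)
y = A_true * np.exp(B_true * x) + rng.normal(scale=0.3, size=x.size)

def fit_log_linear(x, y, w):
    """Minimize sum_i w_i * (alpha + beta * x_i - log(y_i))**2; return (A, B)."""
    # polyfit multiplies the unsquared residual by its weights, hence the sqrt.
    beta, alpha = np.polyfit(x, np.log(y), 1, w=np.sqrt(w))
    return np.exp(alpha), beta

# Plain log-linear fit: w_i = 1.
A_plain, B_plain = fit_log_linear(x, y, np.ones_like(y))

# Weighted log-linear fit: w_i = y_i, as in the sum above.
A_wtd, B_wtd = fit_log_linear(x, y, y)

print("unweighted:", A_plain, B_plain)
print("weighted:  ", A_wtd, B_wtd)
```

Either result can also serve as a starting guess for a true nonlinear fit of $y=Ae^{Bx}$.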

I suggest you have a look at http://www.itl.nist.gov/div898/handbook/pmd/section1/pmd143.htm