Partial regression coefficient calculated in two different ways

regression, statistics

Consider observations on three variables $X_1$, $X_2$ and $X_3$. Suppose that $X_1$ is regressed on $X_2$. When the residual of this regression is regressed on $X_3$, the regression coefficient of $X_3$ is $\beta_3$. When $X_1$ is regressed on $X_2$ and $X_3$ simultaneously, the regression coefficient of $X_3$ is $\beta_3^{*}$. Show that $|\beta_3|\le|\beta_3^{*}|$.

The expressions are simpler if we use linear regression, but I still cannot establish this result.
This problem looks quite interesting.
Please feel free to share your approach!
Thanks!

Best Answer

To make things easier, I will use $X,Y,Z$ in place of $X_1,X_2,X_3$, and assume the intercept is 0 (along with some other assumptions). The ideas should extend to more general cases.

We are given regressions:

(1) $X=aY+U$, with residual $U$.

(2) $U = X-aY = bZ+V$, with residual $V$.

(3) $X = cY+dZ+W$, with residual $W$.

We'd like to show $|b|\le|d|$.

Rewrite (2) as:

(4) $X = aY+bZ+V$

Compare (4) and (3): for a reasonable (least-squares) regression, $(c,d)$ minimizes $Var(W)$ over all choices of the two coefficients, and $(a,b)$ is one such choice, so we have

(5) $Var(W)\le Var(V)$.

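Similarly, compare (1) and (3): rewrite (3) as $X = cY+(dZ+W)$. Since $a$ minimizes $Var(U)=Var(X-aY)$ over all coefficients on $Y$, we get $Var(U)\le Var(X-cY)=Var(dZ+W)$. Replacing $U$ with $bZ+V$ from (2), we have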

(6) $Var(bZ+V)\le Var(dZ+W)$.

For a reasonable regression, the residual is uncorrelated with each regressor, so $Cov(Z,V)=Cov(Z,W)=0$ (otherwise we would have correlation unaccounted for by the coefficients). The variances in (6) therefore split into sums, and we can deduce:

(7) $b^2Var(Z)+Var(V) \le d^2 Var(Z)+Var(W)$

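Combining (7) with (5) gives $b^2Var(Z)\le d^2Var(Z)+Var(W)-Var(V)\le d^2Var(Z)$, so $b^2\le d^2$ whenever $Var(Z)>0$. This is the desired result $|b|\le|d|$.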
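
As a sanity check, the inequality can be tested numerically. The following is only a sketch and not part of the argument: it simulates data, runs regressions (1) to (3) by ordinary least squares, and compares $|b|$ with $|d|$. The helper names (`dot`, `center`, `two_step_vs_joint`) and the simulated data-generating process are my own assumptions for illustration. The data are centered first so that the zero-intercept regressions match the variance and covariance statements used above.

```python
import random

def dot(p, q):
    """Sum of elementwise products of two equal-length lists."""
    return sum(pi * qi for pi, qi in zip(p, q))

def center(v):
    """Subtract the sample mean, so no-intercept OLS matches the Var/Cov argument."""
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def two_step_vs_joint(x, y, z):
    """Return (b, d): b from the two-step procedure (1)-(2), d from the joint fit (3)."""
    x, y, z = center(x), center(y), center(z)
    # (1) X = aY + U
    a = dot(y, x) / dot(y, y)
    u = [xi - a * yi for xi, yi in zip(x, y)]
    # (2) U = bZ + V
    b = dot(z, u) / dot(z, z)
    # (3) X = cY + dZ + W: solve the 2x2 normal equations for d (Cramer's rule)
    syy, szz, syz = dot(y, y), dot(z, z), dot(y, z)
    sxy, sxz = dot(x, y), dot(x, z)
    det = syy * szz - syz ** 2
    d = (syy * sxz - syz * sxy) / det
    return b, d

# Hypothetical simulation: Z is correlated with Y, X depends on both plus noise.
random.seed(0)
worst = float("-inf")  # largest observed value of |b| - |d|
for _ in range(200):
    n = 30
    y = [random.gauss(0, 1) for _ in range(n)]
    z = [0.8 * yi + random.gauss(0, 1) for yi in y]
    x = [random.gauss(0, 1) + 0.5 * yi - 1.5 * zi for yi, zi in zip(y, z)]
    b, d = two_step_vs_joint(x, y, z)
    worst = max(worst, abs(b) - abs(d))

# If the claim holds, |b| - |d| never exceeds 0 (up to rounding error).
print(worst <= 1e-12)
```

A simulation like this cannot prove the inequality, but a single violation would indicate an error in the derivation or in the assumptions.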
