Regression on trivariate data with one coefficient 0

linear regression, regression analysis, solution-verification, statistics

Suppose $\{(x_i,y_i,z_i):i=1,2,\ldots,n\}$ is a set of trivariate observations on three variables $X, Y, Z$, where $z_i=0$ for $i=1,2,\ldots,n-1$ and $z_n=1$. Suppose the least squares linear regression equation of $Y$ on $X$ based on the first $n-1$ observations is $y=\hat{\alpha_0}+\hat{\alpha_1}x$, and the least squares linear regression equation of $Y$ on $X$ and $Z$ based on all $n$ observations is $y=\hat{\beta_0}+\hat{\beta_1}x+\hat{\beta_2}z$.

We need to show that $\hat{\alpha_1}=\hat{\beta_1}$.

My approach:

Based on the first $n-1$ observations, since $z_i=0$ for each of them, we have an ordinary simple linear regression model of $Y$ on $X$.

Thus, the least squares estimate is $\hat{\alpha_1}=\frac{\sum_{i=1}^{n-1} (x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n-1} (x_i-\bar{x})^2}$, where $\bar{x}$ and $\bar{y}$ are the means over the first $n-1$ observations.
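As a quick numerical sanity check (with made-up data; all variable names here are illustrative, not from the question), the closed-form slope above agrees with a standard least squares fit of $Y$ on $X$:

```python
import numpy as np

# Hypothetical data for the first n-1 observations (where z_i = 0).
rng = np.random.default_rng(0)
x = rng.normal(size=9)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=9)

# Closed-form least squares slope from the formula above.
alpha1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Cross-check against NumPy's degree-1 polynomial (i.e. linear) fit.
slope, intercept = np.polyfit(x, y, 1)
assert np.isclose(alpha1_hat, slope)
```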

And in the second case, we have:

$y_1=\beta_0+\beta_1 x_1+e_1$

$y_2=\beta_0+\beta_1 x_2+e_2$

$\vdots$

$y_n=\beta_0+\beta_1 x_n+\beta_2+e_n$ (since $z_n=1$)

Thus, the error sum of squares is

$$\sum_{i=1}^{n-1} (y_i-\beta_0-\beta_1 x_i)^2+(y_n-\beta_0-\beta_1 x_n-\beta_2)^2$$

Differentiating this w.r.t. $\beta_0,\beta_1,\beta_2$ and equating the derivatives to $0$, the third normal equation gives $\hat{\beta_2}=y_n-\hat{\beta_0}-\hat{\beta_1}x_n$. Plugging this back in, the normal equations for $\beta_0$ and $\beta_1$ reduce to those of the first regression, so we get the same estimate $\hat{\beta_1}=\hat{\alpha_1}$.

So, is my approach correct?
Or can you guys see a major flaw?
Let me know

Best Answer

Your approach is correct.

By differentiating with respect to $\beta_2$, we see that at the optimum we must have

$$\hat{\beta_2} = y_n-\hat{\beta_1}x_n-\hat{\beta_0},$$

that is, the last term of the objective function must vanish.

Hence the problem of solving for $\hat{\beta_0}$ and $\hat{\beta_1}$ is the same as minimizing

$$\sum_{i=1}^{n-1} (y_i-\beta_0-\beta_1 x_i)^2$$

Therefore $\hat{\beta_1}=\hat{\alpha_1}$ and, furthermore, $\hat{\beta_0}=\hat{\alpha_0}$.
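The whole argument can be verified numerically (again with made-up data; the names are illustrative): fit $Y$ on $X$ using the first $n-1$ points, then fit $Y$ on $X$ and $Z$ using all $n$ points, and compare the estimates.

```python
import numpy as np

# Hypothetical data: z is 0 for the first n-1 observations and 1 for the last.
rng = np.random.default_rng(1)
n = 10
x = rng.normal(size=n)
y = 1.5 - 0.7 * x + rng.normal(scale=0.2, size=n)
z = np.zeros(n)
z[-1] = 1.0

# Regression of y on x using the first n-1 observations.
A1 = np.column_stack([np.ones(n - 1), x[:-1]])
(alpha0, alpha1), *_ = np.linalg.lstsq(A1, y[:-1], rcond=None)

# Regression of y on x and z using all n observations.
A2 = np.column_stack([np.ones(n), x, z])
(beta0, beta1, beta2), *_ = np.linalg.lstsq(A2, y, rcond=None)

# The intercept and slope agree across the two fits...
assert np.isclose(alpha1, beta1) and np.isclose(alpha0, beta0)
# ...and beta2 absorbs the residual at the last point exactly.
assert np.isclose(beta2, y[-1] - beta0 - beta1 * x[-1])
```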
