We're learning about multiple regression in the current module of my statistics course, and the instructor noted that the sum of squared errors (SSE) of a full model such as the one below:
$Y_i=\beta_0+\beta_1x_{1i}+\beta_2x_{2i}+\beta_3x_{3i}+\epsilon_i$
is going to be smaller than the SSE for any reduced model, such as the one below (which we obtain under the assumption that $\beta_1=0$):
$Y_i=\beta_0+\beta_2x_{2i}+\beta_3x_{3i}+\epsilon_i$
I'm having trouble understanding why this is true. If SSE is defined as:
$\sum^{n}_{i=1}(y_i-\hat{y}_i)^2$
Shouldn't the full model's SSE be bigger because it has more terms?
Best Answer
If $\beta_1$ is exactly zero, the SSE of the full and reduced models will be identical. To the extent that $\beta_1$ is not exactly zero, the portion of the variance (sum of squares) of $Y$ attributable to $x_1$ is, all else being equal, added to the SSE of the reduced model, because it is no longer captured anywhere in the model except the residual term $\epsilon$.
Another way to see it: the reduced model is a special case of the full model (the full model with $\hat{\beta}_1$ forced to zero), so least-squares fitting over the full model can always achieve at least as small an SSE as the reduced fit. The SSE of a reduced model can therefore never be smaller than that of the full model: it equals the full model's SSE plus any sum of squares attributable to the constraints, to the extent those constraints are not exactly true.
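As a quick numerical check (a sketch using NumPy; the data-generating coefficients and sample size here are made up for illustration), you can fit both models by ordinary least squares and compare their SSEs directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulate predictors and a response where beta_1 = 0.5 (not zero)
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 0.5 * x1 + 2.0 * x2 - 1.0 * x3 + rng.normal(size=n)

def sse(X, y):
    """Fit OLS via least squares and return the sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

ones = np.ones(n)
full = np.column_stack([ones, x1, x2, x3])      # intercept, x1, x2, x3
reduced = np.column_stack([ones, x2, x3])       # constraint: beta_1 = 0

# The full model's SSE is never larger than the reduced model's,
# because the reduced fit is one feasible solution for the full model.
print("SSE full:   ", sse(full, y))
print("SSE reduced:", sse(reduced, y))
```

If you instead generate the data with the coefficient on $x_1$ set to zero, the two SSEs come out nearly identical, differing only by sampling noise in the fitted $\hat{\beta}_1$.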