[Math] Sum of Squares From Regression Formula in Matrix Form

statistical-inference, statistics

I am trying to show that the regression sum of squares,

$$SS_{reg}=\sum(\hat{Y}_i - \bar{Y})^2 = Y'\left(H - \frac{1}{n}J\right)Y$$

where $H$ is the hat matrix and $J$ is the $n\times n$ matrix of ones. I can do this using the fact that the total sum of squares minus the residual sum of squares equals the regression sum of squares, but I'd like to try doing it without that.
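A quick numerical check on simulated data (a throwaway sketch; the simulated model and variable names are just for illustration) agrees with the identity:

```python
import numpy as np

# Numerical sanity check of SS_reg = Y'(H - J/n)Y on simulated data.
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # design matrix with intercept
Y = X @ rng.normal(size=p + 1) + rng.normal(size=n)         # simulated response

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix
J = np.ones((n, n))                    # matrix of ones
Y_hat = H @ Y                          # fitted values

ss_reg_direct = np.sum((Y_hat - Y.mean()) ** 2)  # sum_i (Yhat_i - Ybar)^2
ss_reg_matrix = Y @ (H - J / n) @ Y              # Y'(H - J/n)Y

print(np.isclose(ss_reg_direct, ss_reg_matrix))  # True
```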

The farthest I got was

$$\sum(\hat{Y}_i - \bar{Y})^2 =(\hat Y - \bar Y)'(\hat Y - \bar Y) = \hat Y'\hat Y-\hat Y'\bar Y - \bar Y'\hat Y + \bar Y' \bar Y$$

The first term is

$$\hat Y'\hat Y = (HY)'(HY) = Y'H'HY = Y'HY,$$

using that $H$ is symmetric and idempotent, which is the first term of what I am looking for. I don't know how to proceed from there.
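(For reference, the idempotency I am using above is just

$$HH = X(X'X)^{-1}X'X(X'X)^{-1}X' = X(X'X)^{-1}X' = H,$$

together with the symmetry $H'=H$.)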

Best Answer

Using your notation, write $\bar{Y}$ for the vector with every entry equal to $\bar{y}=\frac{1}{n}\sum Y_i$. Since the model contains an intercept, the residuals sum to zero, so $\sum\hat{Y}_i=\sum Y_i$ and

$$\bar{Y}'\hat{Y}=\hat{Y}'\bar{Y}=\bar{y}\sum \hat{Y}_i=\bar{y}\sum Y_i=n\bar{y}^2=\bar{Y}'\bar{Y}$$

So,

$$\sum(\hat{Y}_i - \bar{Y})^2 =(\hat{Y} - \bar{Y})'(\hat{Y} - \bar{Y}) = \hat Y'\hat{Y}-\hat{Y}'\bar{Y} - \bar Y'\hat{Y} + \bar Y' \bar Y=Y'HY-n\bar{y}^2$$ $$=Y'HY-\frac{1}{n}(Y'\mathbf{1})(\mathbf{1}'Y)=Y'HY-\frac{1}{n}Y'JY=Y'\left(H-\frac{1}{n}J\right)Y$$

where $J=\mathbf{1}\cdot \mathbf{1}'$ is the $n\times n$ matrix of ones and $H=X(X'X)^{-1}X'$ is the hat matrix.
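(The step $\sum\hat{Y}_i=\sum Y_i$ is the standard fact that the residuals sum to zero whenever the design contains a column of ones: the normal equations give

$$X'(Y-\hat{Y})=X'Y-X'X(X'X)^{-1}X'Y=0,$$

and reading off the row corresponding to the intercept column, $\mathbf{1}'(Y-\hat{Y})=0$, i.e. $\sum\hat{Y}_i=\sum Y_i$.)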