Lack of fit test in multiple regression

Tags: multiple-regression, regression, self-study

Suppose I have this model
$$Y=B_0+B_1X_1+B_2X_2$$
and these observations

Y  <- c(64, 73, 61, 76, 72, 80, 71, 83, 83, 89, 86, 93, 88, 95, 94, 100)
X1 <- c(4, 4, 4, 4, 6, 6, 6, 6, 8, 8, 8, 8, 10, 10, 10, 10)
X2 <- c(2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4)

I know how to calculate SSLF (1) and SSPE (2) in R, but I want to know how to do it by hand:
$$SSLF=SSE-SSPE$$
where
$$SSE=\sum(Y_i-\hat{Y_i})^2$$
(table image not available)

Setting up the table this way makes it easy to calculate the sum of squares of pure error, but is there an easier way to do this?

(1) SSLF: sum of squares of lack of fit

(2) SSPE: sum of squares of pure error

Best Answer

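Let me give you a hint. The SSPE is made up of squared deviations from the mean of $Y$ at each $X$ level, where with two predictors an $X$ level is a distinct combination of $(X_1, X_2)$ values. Denote the number of $X$ levels by $c$ and the number of observations at level $j$ by $n_j$; then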

$$SSPE=\sum_{j=1}^{c} \sum_{i=1}^{n_j} \left(Y_{ij} - \bar{Y}_j \right)^2.$$
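As a worked instance for the data in the question, assuming an $X$ level means a distinct $(X_1, X_2)$ pair: there are $c=8$ such pairs, each with $n_j=2$ observations. The table layout below is only one possible way to organise the hand calculation.

$$\begin{array}{cc|cc|c|c}
X_1 & X_2 & Y_{1j} & Y_{2j} & \bar{Y}_j & \sum_i (Y_{ij}-\bar{Y}_j)^2 \\ \hline
4 & 2 & 64 & 61 & 62.5 & 4.5 \\
4 & 4 & 73 & 76 & 74.5 & 4.5 \\
6 & 2 & 72 & 71 & 71.5 & 0.5 \\
6 & 4 & 80 & 83 & 81.5 & 4.5 \\
8 & 2 & 83 & 86 & 84.5 & 4.5 \\
8 & 4 & 89 & 93 & 91.0 & 8.0 \\
10 & 2 & 88 & 94 & 91.0 & 18.0 \\
10 & 4 & 95 & 100 & 97.5 & 12.5
\end{array}$$

$$SSPE = 4.5+4.5+0.5+4.5+4.5+8+18+12.5 = 57$$

A shortcut that applies only because every level here has exactly two observations: for $n_j=2$, $\sum_i (Y_{ij}-\bar{Y}_j)^2 = (Y_{1j}-Y_{2j})^2/2$, so you can skip the means entirely:

$$SSPE = \frac{3^2+3^2+1^2+3^2+3^2+4^2+6^2+5^2}{2} = \frac{114}{2} = 57$$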

Simply put, for each set of replicates, i.e. observations sharing an identical $X$ value, you compute the mean of the corresponding $Y$s and sum the squared deviations from it. Any $X$ level with no replication contributes nothing to SSPE, because its mean is just that one observation.
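If you want to check a hand calculation, here is a minimal R sketch using the data from the question. It assumes that an $X$ level is a distinct $(X_1, X_2)$ combination; the names `cell_mean`, `n_cell` and `F_lof` are just illustrative. The F statistic at the end is the usual lack-of-fit ratio, with $c-p$ and $n-c$ degrees of freedom, where $p$ is the number of regression parameters.

```r
# Data from the question
Y  <- c(64, 73, 61, 76, 72, 80, 71, 83, 83, 89, 86, 93, 88, 95, 94, 100)
X1 <- c(4, 4, 4, 4, 6, 6, 6, 6, 8, 8, 8, 8, 10, 10, 10, 10)
X2 <- c(2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4)

# Assumption: an "X level" is a distinct (X1, X2) combination
cell_mean <- ave(Y, X1, X2)            # mean of Y within each (X1, X2) cell
SSPE <- sum((Y - cell_mean)^2)         # pure error: deviations from cell means

fit  <- lm(Y ~ X1 + X2)                # the first-order model from the question
SSE  <- sum(residuals(fit)^2)          # error sum of squares of the fitted model
SSLF <- SSE - SSPE                     # lack of fit is what remains

n      <- length(Y)                    # number of observations
n_cell <- nrow(unique(cbind(X1, X2)))  # c = number of distinct X levels
p      <- length(coef(fit))            # number of regression parameters

# Lack-of-fit F statistic: (SSLF / (c - p)) / (SSPE / (n - c))
F_lof <- (SSLF / (n_cell - p)) / (SSPE / (n - n_cell))

print(c(SSE = SSE, SSPE = SSPE, SSLF = SSLF, F = F_lof))
```

For these data this should give SSPE = 57, SSE = 94.3 and SSLF = 37.3. Using `ave()` keeps the cell means aligned with the original observations, so the same line works unchanged when some levels have no replicates.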

Hope this clears it up a bit.
