Regression: why test normality of overall residuals, instead of residuals conditional on $\hat{y}$

assumptions, regression

I understand that in linear regression the errors are assumed to be normally distributed, conditional on the predicted value of $y$. Then we look at the residuals as a kind of proxy for the errors.

It's often recommended to generate output such as a normal Q-Q plot of the unstandardized residuals. However, I don't see the point of taking the residual for each data point and pooling them all into a single plot.

I understand that we are unlikely to have sufficient data points to properly assess whether we have normal residuals at each predicted value of $y$.

However, isn't the question of whether we have normal residuals overall a separate one, and one that doesn't clearly relate to the model assumption of normal residuals at each predicted value of $y$? Couldn't we have normal residuals at each predicted value of $y$, while having overall residuals that were quite non-normal?

Best Answer

> Couldn't we have normal residuals at each predicted value of $y$, while having overall residuals that were quite non-normal?

No -- at least, not under the standard assumption that the variance of the errors is constant.

You can think of the distribution of overall residuals as a mixture of normal distributions (one for each level of $\hat{y}$). By assumption, all of these normal distributions have the same mean (0) and the same variance. Thus, the distribution of this mixture of normals is itself simply a normal distribution.
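To spell out the mixture step, treat the residuals as stand-ins for the errors (as the question does), and write $w_k$ for the share of observations at the $k$-th level of $\hat{y}$ and $\sigma^2$ for the common error variance. Both symbols are notation introduced here for illustration. The pooled density is then

$$
f(e) = \sum_k w_k \, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{e^2}{2\sigma^2}\right)
     = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{e^2}{2\sigma^2}\right) \sum_k w_k
     = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{e^2}{2\sigma^2}\right),
$$

because the weights sum to 1. If the variances differed across levels, the normal density could not be factored out of the sum, and the mixture would in general not be normal.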

From this we can form a little syllogism based on modus tollens: if P then Q; not Q; therefore not P. Here, P is "the distributions of the residuals given the values of the predictor $X$ are normal, with equal variances," and Q is "the distribution of the overall residuals is normal." So if we observe that the distribution of the overall residuals is apparently not normal, this implies that the distributions given $X$ are not all normal with equal variance, which is a violation of the standard assumptions.
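A rough simulation can make this concrete. The sketch below pools normal errors drawn at three levels of the predictor, once with equal variances and once with unequal ones, and compares the sample excess kurtosis (roughly 0 for normal data). The standard deviations, sample sizes, and helper names are arbitrary choices for illustration, and the errors are simulated directly rather than obtained from a fitted regression.

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; roughly 0 for normally distributed data."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m4 = statistics.fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2 - 3.0


def pooled_errors(sigmas, n_per_level, rng):
    """Draw N(0, sigma^2) errors at each level (one sigma per level) and pool them."""
    return [rng.gauss(0.0, s) for s in sigmas for _ in range(n_per_level)]


rng = random.Random(0)  # fixed seed so the run is reproducible

# Assumed setup: three predictor levels, 20000 errors per level.
equal = pooled_errors([1.0, 1.0, 1.0], 20000, rng)    # constant variance
unequal = pooled_errors([0.5, 1.0, 3.0], 20000, rng)  # variance changes by level

k_equal = excess_kurtosis(equal)
k_unequal = excess_kurtosis(unequal)

print(f"equal variances:   excess kurtosis = {k_equal:.2f}")
print(f"unequal variances: excess kurtosis = {k_unequal:.2f}")
```

With equal variances the pooled excess kurtosis should come out near 0. With unequal variances it should be clearly positive (heavy tails), even though the errors at every level are exactly normal. A non-normal pooled distribution therefore signals that the conditional distributions are not all normal with equal variance.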

@BigBendRegion points out something in the comments that I think is worth adding to this answer for emphasis. The line of argument I outlined above works for refuting normality, but it cannot be used to confirm normality. That is, if we check the marginal distribution of residuals and see that it does appear normal, this does NOT entail that the residuals conditional on X are normal (see HERE for counterexamples). In terms of the P and Q statements above, observing that Q is true does not entail that P is true. That would be affirming the consequent.
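To illustrate why a normal-looking marginal distribution does not confirm the conditional assumption, here is a minimal sketch of one constructed counterexample. It is a hypothetical illustration, not one of the linked counterexamples: a binary predictor `x` is defined so that it is 0 when a standard normal error is small in absolute value and 1 otherwise. The cutoff, sample size, and variable names are assumptions made for the example.

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; roughly 0 for normally distributed data."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m4 = statistics.fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2 - 3.0


rng = random.Random(1)  # fixed seed so the run is reproducible
CUTOFF = 0.6745  # approximate median of |Z| for Z ~ N(0, 1), so groups are about equal

# Marginally, the errors are exactly standard normal.
errors = [rng.gauss(0.0, 1.0) for _ in range(40000)]

# Hypothetical binary predictor: 0 for small errors, 1 for large ones.
x = [0 if abs(e) < CUTOFF else 1 for e in errors]

central = [e for e, xi in zip(errors, x) if xi == 0]  # errors given x = 0
tails = [e for e, xi in zip(errors, x) if xi == 1]    # errors given x = 1

k_all = excess_kurtosis(errors)
k_central = excess_kurtosis(central)
k_tails = excess_kurtosis(tails)

print(f"pooled:        excess kurtosis = {k_all:.2f}")
print(f"given x = 0:   excess kurtosis = {k_central:.2f}")
print(f"given x = 1:   excess kurtosis = {k_tails:.2f}")
```

By symmetry, both conditional distributions have mean zero, so the mean structure of the model is untouched. The pooled errors look normal, while the errors given `x = 0` are a truncated central slice and the errors given `x = 1` are bimodal, so neither conditional distribution is normal and their variances differ.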