Solved – residual plot and non linearity

linear model, multiple regression, residuals

I was taught that the linearity assumption in a linear model can be checked using a residual plot. If there is a pattern, then the assumption is most likely violated. Can someone explain the mechanism behind this verification method? Why does the existence of a pattern mean non-linearity?

Best Answer

Let's say that you have data which looks like this:

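plot(seq(-20:20), seq(-20:20)^2) (I am working in R, by the way. Note that seq(-20:20) evaluates to 1:41, so this plots x = 1, ..., 41 against x^2: a curve that rises more and more steeply.)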

If you fit a straight line to it (i.e. y ~ x) using ordinary least squares regression, meaning you try to minimise the sum of the squared vertical distances of the points from the line, you will end up with the line being below the points at the bottom, above the observations in the middle, and then below them again at the top. Have you accurately captured the relationship between the two variables?
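As a quick check of where the points fall relative to the line, here is a minimal sketch that fits the straight line to the same data and looks at the sign of the residuals (observed minus fitted) at the low end, the middle, and the high end. The variable names (x, y, fit_line, res) are only for illustration.

```r
# Same data as in the plot above: seq(-20:20) evaluates to 1:41
x <- 1:41
y <- x^2

fit_line <- lm(y ~ x)     # straight-line ordinary least squares fit
res <- resid(fit_line)    # residual = observed value minus fitted value

# Sign of the residual at the first, middle and last observation:
# positive, negative, positive (points above, below, above the line)
sign(res[c(1, 21, 41)])

# Draw the data with the fitted line on top
plot(x, y)
abline(fit_line, col = "red")
```

A positive residual means the observation sits above the fitted line, so the signs show the points above the line at both ends and below it in the middle.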

What will end up happening when you check your residuals vs. fitted values plot is that the points on the left side will be above zero, those in the middle below the zero line, and those on the right-hand side above it again... hopefully this sounds familiar. The residual plot is almost like tilting the graph so that the fitted line becomes the horizontal zero line, with each point showing its vertical distance from the line at a given fitted value. It is essentially showing you that there is still some pattern in your data that you have not adequately captured, because you tried to fit a straight line to curved data. A curve in your residuals suggests you should allow your line to curve by fitting what's called a polynomial. A quadratic is a polynomial which allows for a single bend in your line by adding a squared term (y ~ x + I(x^2) in R; a bare y ~ x^2 is read by R as y ~ x). Cubic models add a cubed term as well (y ~ x + I(x^2) + I(x^3)), which allows for two bends, and so on.
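To make this concrete, the sketch below draws the residuals vs. fitted values plot for the straight-line fit and then refits with a squared term. It assumes the same noise-free data as above, so the quadratic fit is essentially exact; with real data the residuals would shrink and lose their curve rather than vanish. The names fit_line and fit_quad are only for illustration.

```r
x <- 1:41
y <- x^2

# Straight-line fit: the residuals trace a U shape around the zero line
fit_line <- lm(y ~ x)
plot(fitted(fit_line), resid(fit_line),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# In R's formula syntax y ~ x^2 means the same as y ~ x,
# so the squared term must be wrapped in I() (or use poly(x, 2))
fit_same <- lm(y ~ x^2)
fit_quad <- lm(y ~ x + I(x^2))

# Largest absolute residual of the quadratic fit: zero up to rounding error
max(abs(resid(fit_quad)))
```

Plotting fitted(fit_quad) against resid(fit_quad) in the same way should show no remaining curve.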

In a linear model the assumption is that the residuals (i.e. the vertical distances between the fitted line and the actual observations) are patternless and normally distributed with mean 0 and variance sigma^2. The patternless bit means that we have captured all of the pattern with our line. The sigma^2 is just a placeholder; what it indicates is that the variance is constant. The mean of 0 says that the residuals are balanced either side of the fitted line rather than sitting systematically above or below it. Much of this we can check with a simple residuals vs. fitted values plot, as it can visually display some of these attributes (i.e. patternless, mean 0 and constant variance).
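For comparison, here is a sketch of what the plot looks like when the assumptions hold. The data are simulated and entirely made up for illustration (the intercept 2, slope 3, noise standard deviation 5 and the seed are arbitrary assumptions, not from the question).

```r
set.seed(1)                          # arbitrary seed so the example is reproducible
x <- 1:41
y <- 2 + 3 * x + rnorm(41, sd = 5)   # made-up straight-line data, constant-variance noise

fit <- lm(y ~ x)
res <- resid(fit)

# With an intercept in the model, OLS residuals average to zero by construction
mean(res)

# What the plot adds: whether the residuals stay centred on zero, with an even
# spread, across the whole range of fitted values
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Built-in version of the same diagnostic, with a smoothed trend line added
plot(fit, which = 1)
```

Because the overall mean of the residuals is zero automatically, the useful check is local: a band of points that drifts above or below zero, or fans out, over part of the range is the kind of pattern described above.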

If my explanation isn't doing it for you, Khan Academy gives another good explanation and walk-through here - hopefully this helps! Let me know if you need anything clarified.

I wouldn't personally use the term non-linear, because there are things called non-linear models which are quite different from what we are talking about, and avoiding the term can save some confusion down the line! I would instead call it a non-linear association between your variables, or a polynomial model. If you want to understand why I am against calling what we are talking about non-linear, see this link.
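One common way to draw this distinction, which may or may not be the argument the linked post makes, is that a polynomial model is curved in x but still linear in its coefficients, so it is fitted with lm() like any other linear model. A model such as y = a * exp(b * x), where the coefficient b enters non-linearly, is a non-linear model in the stricter sense and would need something like nls() instead. A minimal sketch of the polynomial case, using the same noise-free data as above:

```r
x <- 1:41
y <- x^2

# Curved in x, linear in the coefficients: an ordinary lm() fit
fit_quad <- lm(y ~ x + I(x^2))

# The design matrix simply has one column per term: intercept, x and x^2
head(model.matrix(fit_quad), 3)

# The fitted value is a weighted sum of those columns; here the weights come
# out as roughly 0, 0 and 1 because y is exactly x^2
coef(fit_quad)
```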