Ways of Testing the Linearity Assumption in Multiple Regression apart from Residual Plots

Tags: assumptions, linearity, multiple regression

I was going through the assumptions of linear regression, and one of them is linearity between the dependent and the independent variables. To be precise, the assumption is that the conditional mean of $Y_i$ given $X_i$ is linear in the parameters.

I looked in many textbooks and online resources, and all of them suggest checking that assumption with a scatter plot of the residuals versus the fitted values. Although I can see that this is a valid and helpful approach, I can't help but notice that it can be somewhat arbitrary and subjective in some cases.

My question is whether there is a statistical test for that assumption as well. For example, when checking for heteroscedasticity we can look at the residual plot, but we also have Levene's test.

I can see that the question "How can I use the value of $R^2$ to test the linearity assumption in multiple regression analysis?", which is very helpful, states that $R^2$ is not that statistic, but it doesn't mention a viable alternative.

Thanks in advance

Best Answer

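You can fit a model that relaxes the linearity assumption, e.g., using splines, and compare it with the model that assumes linearity. Because the linear fit is a special case of the natural-spline fit, the two models are nested and can be compared with an F-test. For example, in R, for a linear regression model you can do something like this: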

library("splines")

# linear effect of age on y
fm_linear <- lm(y ~ age + sex, data = your_data)

# nonlinear effect of age on y using natural cubic splines
fm_non_linear <- lm(y ~ ns(age, 3) + sex, data = your_data)

# F-test between the two models
anova(fm_linear, fm_non_linear)
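
To see how the comparison above behaves, here is a minimal self-contained sketch on simulated data. Everything about the data is an assumption made for illustration: the name `sim_data`, the sample size, the quadratic effect of `age`, and the noise level are all hypothetical and not part of the original answer.

```r
library(splines)

set.seed(42)  # arbitrary seed so the simulation is reproducible

# Hypothetical data: y depends on age through a quadratic (nonlinear) curve
n <- 300
sim_data <- data.frame(
  age = runif(n, 20, 80),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
sim_data$y <- 10 + 0.02 * (sim_data$age - 50)^2 +
  1.5 * (sim_data$sex == "M") + rnorm(n, sd = 2)

# Model that assumes a linear effect of age
fm_linear <- lm(y ~ age + sex, data = sim_data)

# Model that relaxes linearity with a natural cubic spline (3 df)
fm_non_linear <- lm(y ~ ns(age, 3) + sex, data = sim_data)

# F-test of the nested models; the second row holds the comparison
cmp <- anova(fm_linear, fm_non_linear)
print(cmp)

# Extract the p-value of the F-test from the anova table
p_value <- cmp[["Pr(>F)"]][2]
```

A small `p_value` indicates that the spline terms improve the fit beyond what a straight line in `age` achieves, which is evidence against linearity for that predictor. A large p-value does not prove linearity; it only means this particular spline alternative did not detect a departure from it.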