The best way to check the linearity assumption in ANOVA (except using scatter plots)

linearity

I understand that ANOVA is based on a linearity assumption. If so, how can I check the linearity between observations? I know a plot would be useful, but it is difficult to determine the degree of a polynomial trend from one. Is there another way to check the linearity assumption?

Best Answer

There is no formal linearity assumption regarding the variables in a linear regression, and there can also be non-linear interaction terms between different categorical variables in an ANOVA. For ANOVAs as well as regressions, the assumptions are homoskedasticity and normality of the residuals (along with some others, such as independence).

In a regression with a continuous predictor, those assumptions on the residuals play the role of a linearity check: they hold only if the model captures how the response depends on the predictor. If there were a non-linear relationship that your model doesn't capture, the residuals would show a systematic pattern and would not look homoskedastic. They would be larger in magnitude in the regions where the real trend diverges from your fitted line.
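As a rough illustration of this point, the sketch below fits a straight line to simulated data whose true trend is quadratic and then compares the residuals in the middle of the predictor range with those at the edges. The data, the coefficients, and the cut-offs for "middle" and "edges" are all assumptions chosen for the example, not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Simulated data (an assumption for illustration): the true trend is quadratic.
x = np.linspace(-3, 3, 200)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=1.0, size=x.size)

# Fit a straight line y = a + b*x by ordinary least squares.
X_lin = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
resid = y - X_lin @ coef

# Compare residuals in the middle of the x range with those at the edges.
middle = np.abs(x) < 1
edges = np.abs(x) > 2
mean_resid_middle = resid[middle].mean()
mean_resid_edges = resid[edges].mean()

print(f"mean residual, middle of range: {mean_resid_middle:.2f}")
print(f"mean residual, edges of range:  {mean_resid_edges:.2f}")
```

If the straight line were adequate, both means should be close to zero. Here the residuals are systematically negative in the middle and positive at the edges, which is the kind of pattern a missed non-linear trend tends to leave behind.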

If that happens, you can add a higher-order term, such as the square of the variable, to the regression's predictors and see whether that resolves the homoskedasticity problem. For example,

$$ Y = \beta X^2 + \epsilon$$

is a perfectly acceptable linear regression. The linearity refers to the parameter $\beta$, not to the variable $X$. As long as $\epsilon$ is normal and homoskedastic, the assumptions of the regression are met. An example of what you cannot do would be

$$ Y = X^{\sin(\beta)} + \epsilon $$

since it is not linear in $\beta$, nor is it easy to render linear (as you sometimes can by taking the logarithm of both sides).
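To make the "linear in $\beta$" point concrete, here is a minimal sketch that fits $Y = \beta X^2 + \epsilon$ with ordinary least squares by using $X^2$ as the single column of the design matrix. The true value of $\beta$, the noise level, and the sample size are hypothetical values picked for the simulation.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible

beta_true = 1.5  # hypothetical value chosen for the simulation
x = rng.uniform(-2, 2, size=500)
y = beta_true * x**2 + rng.normal(scale=0.5, size=x.size)

# The design matrix has one column, x**2, so the model is linear in beta
# even though it is non-linear in x.
design = (x**2).reshape(-1, 1)
solution, *_ = np.linalg.lstsq(design, y, rcond=None)
beta_hat = float(solution[0])

# Same estimate from the closed-form least-squares solution for one column.
beta_closed = float(np.sum(x**2 * y) / np.sum(x**4))

print(f"beta_hat (lstsq):       {beta_hat:.3f}")
print(f"beta_hat (closed form): {beta_closed:.3f}")
```

The same least-squares machinery works unchanged because the unknown enters the model only as a multiplier of a known column. No such rewriting is available for a model in which the parameter sits inside the exponent.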

In an ANOVA with only categorical predictors, the assumptions are that the residuals (the observations after subtracting each category's mean) are normal within each category and have the same variance across categories.
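For a check that does not rely on plots, one option is to test these two assumptions formally. The sketch below assumes SciPy is available and uses simulated groups with hypothetical names, means, and sizes. It computes the within-group residuals, then applies Levene's test for equal variances and the Shapiro-Wilk test for normality within each group.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)  # fixed seed so the sketch is reproducible

# Hypothetical groups, simulated with equal variance and different means.
groups = {
    "a": rng.normal(loc=10.0, scale=2.0, size=40),
    "b": rng.normal(loc=12.0, scale=2.0, size=40),
    "c": rng.normal(loc=15.0, scale=2.0, size=40),
}

# Residuals: each observation minus its own group mean.
residuals = {name: values - values.mean() for name, values in groups.items()}

# Levene's test: the null hypothesis is equal variances across groups.
levene_stat, levene_p = stats.levene(*groups.values())
print(f"Levene p-value: {levene_p:.3f}")

# Shapiro-Wilk test per group: the null hypothesis is normality.
shapiro_p = {}
for name, r in residuals.items():
    _, p_value = stats.shapiro(r)
    shapiro_p[name] = p_value
    print(f"Shapiro-Wilk p-value, group {name}: {p_value:.3f}")
```

A small p-value in either test suggests that the corresponding assumption may be violated. These tests can have little power in small samples and can flag trivial deviations in large ones, so they are probably best read alongside the residuals themselves rather than as a final verdict.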