Solved – Multicollinearity and categorical predictor with three levels

categorical data, multicollinearity, predictor, regression

If I have a continuous dependent variable and two independent variables, one categorical with three levels and the other continuous, which assumptions do I need to check for multiple regression?

As I understand it, scatter plots are meant for continuous variables, and multicollinearity makes sense for continuous predictors but not for dummy variables.

Best Answer

The most important assumptions to check are the same as for any multiple regression, as explained for example in Chapter 7 of Faraway's "Practical Regression and Anova using R": tests for outliers and influential observations, a plot of residuals versus fitted values (an extremely useful scatter plot, since the fitted values incorporate both the categorical and the continuous predictor), checks for non-linearity, checks of the distribution of the residuals, and so forth.
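As a rough illustration of these diagnostics, here is a minimal sketch in Python with NumPy only. The data are simulated, and the names (x, g, y, the levels "a"/"b"/"c") and the flagging thresholds are assumptions made for the example, not anything taken from the question or from Faraway. It fits the model with two dummy columns for the three-level factor, then computes fitted values, residuals, leverages and Cook's distances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: continuous predictor x, three-level factor g, response y.
n = 90
g = np.repeat(np.array(["a", "b", "c"]), n // 3)
x = rng.normal(size=n)
y = (1.0 + 2.0 * x
     + np.where(g == "b", 0.5, 0.0)
     + np.where(g == "c", -1.0, 0.0)
     + rng.normal(scale=0.5, size=n))

# Design matrix: intercept, x, and two dummies (level "a" is the reference).
X = np.column_stack([
    np.ones(n),
    x,
    (g == "b").astype(float),
    (g == "c").astype(float),
])

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
resid = y - fitted

# Leverages: diagonal of the hat matrix H = X (X'X)^-1 X'.
XtX_inv = np.linalg.inv(X.T @ X)
leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

# Internally studentized residuals and Cook's distances.
p = X.shape[1]
sigma2 = resid @ resid / (n - p)
student = resid / np.sqrt(sigma2 * (1.0 - leverage))
cooks = student**2 * leverage / (p * (1.0 - leverage))

# Rule-of-thumb flags (the cutoffs 3 and 4/n are conventions, not tests).
flagged = np.where((np.abs(student) > 3) | (cooks > 4 / n))[0]
print("flagged observations:", flagged)

# Residual spread per level: very different values suggest unequal variance.
for lev in ["a", "b", "c"]:
    print(lev, "residual sd:", resid[g == lev].std(ddof=1))
```

Plotting resid against fitted with any plotting library, colouring the points by g, gives the residuals-versus-fitted plot; curvature or a funnel shape would point to non-linearity or non-constant variance.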

"Multicollinearity" would seem to be a bit of an overstatement with only 2 predictor variables. If you are concerned about collinearity, you could for example see how the values of the continuous predictor are distributed among the 3 levels of the categorical predictor: if the levels occupy largely separate ranges of the continuous predictor, the two effects will be hard to tell apart. The Faraway reference noted above discusses collinearity in Chapter 9. As the answer from @jur notes, its practical importance depends on the intended use of the model.
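One way to make that check concrete is sketched below, again with simulated data and hypothetical names (x, g, the per-level shifts). It summarises the continuous predictor within each level, then computes the R² from regressing it on the factor, which equals the between-level share of its variance. The corresponding variance inflation factor is 1 / (1 - R²).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: the mean of x is shifted differently in each level of g.
levels = ["a", "b", "c"]
g = np.repeat(np.array(levels), 40)
shift = {"a": 0.0, "b": 1.0, "c": 2.0}
x = rng.normal(size=g.size) + np.array([shift[str(v)] for v in g])

# Distribution of the continuous predictor within each level.
for lev in levels:
    xs = x[g == lev]
    print(f"{lev}: n={xs.size}, mean={xs.mean():.2f}, "
          f"sd={xs.std(ddof=1):.2f}, range=({xs.min():.2f}, {xs.max():.2f})")

# Regressing x on the factor (intercept plus dummies) gives the level means
# as fitted values, so R^2 is 1 - within-level SS / total SS.
means = {lev: x[g == lev].mean() for lev in levels}
x_hat = np.array([means[str(v)] for v in g])
ss_total = np.sum((x - x.mean()) ** 2)
ss_within = np.sum((x - x_hat) ** 2)
r2 = 1.0 - ss_within / ss_total

# Variance inflation factor for the coefficient of x in the full model.
vif = 1.0 / (1.0 - r2)
print(f"R^2 of x on the factor: {r2:.3f}, VIF for x: {vif:.2f}")
```

An R² near 0 means the continuous predictor is spread similarly across the levels. Values near 1 mean the levels nearly determine it, which inflates the standard error of its coefficient.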