Solved – dummy variables, interaction with continuous variable, and variable selection

categorical data, interaction, stepwise regression

I want to predict shop sales from a set of independent variables: store attributes such as floor space and number of staff (continuous variables), and the location of the store (east, west, south), which is a categorical variable coded as binary dummy variables. I have some questions.

  1. If I run stepwise regression for variable selection and one of the included dummy variables gets dropped, what does that mean?

  2. Is it necessary to include interaction terms between the dummy variables and the continuous variables if they are significant? I am asking because my goal is to predict sales.

  3. Should I include interaction terms before running stepwise regression?

Best Answer

  1. If by dummy variables you mean the multiple binary variables that together make up one categorical predictor, each of them needs to be in the model for every other dummy to be meaningful. Dropping a single dummy merges that level into the reference category, which changes what the remaining coefficients mean. In stepwise regression they should all be in or all be out, not selected piecemeal. Are you doing this by hand? All the stats packages I'm familiar with treat multilevel categoricals properly in this respect and don't consider the dummy variables independently when specifying the model.

  2. Again, you can't include interactions with some dummy variables of a single categorical predictor but not others. All in or all out. The test of whether the interaction needs to be included is a comparison between two models that both contain all the dummies: one without the interaction terms and one with an interaction term for every dummy. If the interaction is significant, you should keep it in any case. Just be aware that the interpretation of the "main effects" changes drastically when interactions are included in a model.

  3. If you are doing backward stepwise regression, yes: include the interaction terms in the starting model.
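
To make points 1 and 2 concrete, here is a minimal sketch of the group-wise comparison in plain Python (standard library only, so it is self-contained). Everything in it is an assumption made for illustration: the synthetic shop data, the level names, and the helpers `design_row`, `ols_rss`, and `f_statistic` are hypothetical, and a real analysis would normally use a statistics package rather than hand-rolled least squares. The region dummies always enter together, and the interaction is tested as a block by comparing two nested models with an F statistic.

```python
import random

# Assumed category levels; the first one acts as the reference category.
LEVELS = ["east", "west", "south"]


def design_row(floor, region, with_interaction):
    """One design-matrix row: intercept, floor space, ALL region dummies,
    and optionally floor x dummy for EVERY dummy (never a subset)."""
    dummies = [1.0 if region == level else 0.0 for level in LEVELS[1:]]
    row = [1.0, floor] + dummies
    if with_interaction:
        row += [floor * d for d in dummies]
    return row


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit (normal equations,
    solved by Gauss-Jordan elimination with partial pivoting)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(p)
    ]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def f_statistic(rss_reduced, rss_full, q, n, p_full):
    """F statistic for q extra parameters in the full (nested) model."""
    return ((rss_reduced - rss_full) / q) / (rss_full / (n - p_full))


# Synthetic data (an assumption): each region has its own intercept and slope,
# so a floor-space-by-region interaction really is present.
random.seed(1)
true_params = {"east": (100.0, 2.0), "west": (120.0, 3.0), "south": (80.0, 1.0)}
shops = []
for region, (intercept, slope) in true_params.items():
    for i in range(10):
        floor = 50.0 + 10.0 * i
        sales = intercept + slope * floor + random.gauss(0, 5)
        shops.append((floor, region, sales))

y = [sales for _, _, sales in shops]
X_reduced = [design_row(f, r, False) for f, r, _ in shops]  # all dummies, no interaction
X_full = [design_row(f, r, True) for f, r, _ in shops]      # all dummies, all interactions

rss_reduced = ols_rss(X_reduced, y)
rss_full = ols_rss(X_full, y)
q = len(X_full[0]) - len(X_reduced[0])  # number of interaction columns tested together
f_stat = f_statistic(rss_reduced, rss_full, q, len(y), len(X_full[0]))

print(f"interaction block: q = {q}, F = {f_stat:.1f}")
```

The F statistic would be compared with an F distribution on (q, n - p_full) degrees of freedom; the standard library has no F distribution, so the p-value is left out here. The point is the structure of the test: both models carry every region dummy, and the interaction columns are added or removed as a set.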
