Regression Model – Can Non-Significant Coefficients Be Ignored

linear model, model selection, regression coefficients, regression-strategies, statistical significance

After seeking clarification about linear model coefficients in an earlier question, I have a follow-up question concerning non-significant (high p-value) coefficients of factor levels.

Example: If my linear model includes a factor with 10 levels, and only 3 of those levels have significant p-values, can I omit the coefficient term when predicting Y for a subject that falls in one of the non-significant levels?

More drastically, would it be wrong to lump the 7 non-significant levels into one level and re-analyze?

Best Answer

If you include a predictor variable with multiple levels, you either include the whole variable or you leave it out; you can't pick and choose levels. You might want to restructure the levels of your predictor variable to reduce their number (if that makes sense in the context of your analysis). However, I'm not sure whether collapsing levels because you saw they were not significant would invalidate the subsequent statistical inference.
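To make the "you can't pick and choose levels" point concrete, here is a minimal sketch in plain Python. It assumes the simplest case, a model with a single factor under treatment (dummy) coding, where the least-squares intercept is the reference-level mean and each level's coefficient is that level's mean minus the reference mean. The data, level names, and function names are all made up for illustration. It shows that zeroing out one level's coefficient amounts to predicting the reference level's mean for that subject, so the result depends on the arbitrary choice of reference level. It also computes the F statistic that assesses the factor as a whole.

```python
from statistics import mean

# Hypothetical toy data (invented for illustration): responses grouped by factor level.
data = {
    "A": [10.0, 12.0, 11.0],
    "B": [10.5, 11.5, 12.5],
    "C": [20.0, 21.0, 22.0],
    "D": [11.0, 13.0, 12.0],
}

def treatment_coefficients(groups, reference):
    """Least-squares fit of y ~ factor with treatment coding.

    With one factor and no other predictors, the intercept equals the
    reference-level mean and each coefficient equals (level mean - reference mean).
    """
    intercept = mean(groups[reference])
    coefs = {lvl: mean(ys) - intercept
             for lvl, ys in groups.items() if lvl != reference}
    return intercept, coefs

def predict(intercept, coefs, level, dropped=()):
    """Predict for `level`; coefficients of levels listed in `dropped` are treated as zero."""
    if level in dropped:
        return intercept
    return intercept + coefs.get(level, 0.0)

def factor_f_statistic(groups):
    """One-way ANOVA F statistic: tests all levels of the factor jointly."""
    k = len(groups)
    n = sum(len(ys) for ys in groups.values())
    grand = mean(y for ys in groups.values() for y in ys)
    ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Full model: the prediction for level D is its group mean, whatever the reference level.
b0_a, coefs_a = treatment_coefficients(data, reference="A")
b0_c, coefs_c = treatment_coefficients(data, reference="C")
full_ref_a = predict(b0_a, coefs_a, "D")
full_ref_c = predict(b0_c, coefs_c, "D")

# "Dropping" D's coefficient: the prediction silently becomes the reference-level mean.
dropped_ref_a = predict(b0_a, coefs_a, "D", dropped=("D",))
dropped_ref_c = predict(b0_c, coefs_c, "D", dropped=("D",))

f_stat = factor_f_statistic(data)
print(full_ref_a, full_ref_c, dropped_ref_a, dropped_ref_c, f_stat)
```

In this toy setup the full-model prediction for level D is the same under either reference level, while the prediction with D's coefficient dropped changes with the reference level. The per-level p-values have the same dependence, since each one only compares a level with the reference; a joint test such as the F statistic above is the usual way to judge the factor as a whole.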

Also, just a note: you say small $p$-values are insignificant. I assume you meant that small $p$-values are significant, i.e., a $p$-value of .0001 is significant and therefore you reject the null (assuming an $\alpha$ level greater than $.0001$).
