Solved – Interpretation of multiple dummy variables based on one categorical variable in a regression

categorical data, interpretation, regression

I have a question regarding the interpretation of multiple dummy variables based on one categorical variable in a regression:

Suppose I have a categorical variable called ‘race’ which has four categories: white, black, Indian and Asian. For this categorical variable, I’ve created four separate dummy variables: white, black, Indian and Asian.

I now run a multiple regression which includes three (white, black, Indian) of the four dummy variables created above; one (Asian) is left out to set the reference category/dummy to compare the others to.

In the output, one dummy variable (Indian) is significant, but the other two (white and black) are not. My question is: what exactly does this mean?
As far as I understand, it means that the significant dummy variable (Indian) has a significantly different influence on the model than the reference dummy that was left out (Asian).
But what does it mean that the other two dummy variables (white and black) are not significant? Does it mean their influence is not significantly different from that of the reference dummy variable that was left out (Asian)?

Best Answer

As you said, it basically means that there is a detectable difference in your dependent variable between the reference category (Asian in your case) and the Indian category.

The non-significance of the other two indicates that you haven't been able to detect such a difference between the other two races and the reference race (which of course does not mean that there isn't one). So no difference in the dependent variable detected for black vs Asian, and no difference in the dependent variable detected for white vs Asian.
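
To make the "each dummy is compared to the reference category" reading concrete, here is a minimal sketch in plain Python. The data are made up purely for illustration, and the model contains only the three dummies (no other predictors), which is an assumption that goes beyond the question. In that simplified setting, the intercept equals the mean of the reference group (Asian), and each dummy coefficient equals that group's mean minus the Asian mean. The t-statistics use the pooled residual variance, which in a dummy-only model matches what regression output would report for each coefficient.

```python
import math

# Hypothetical toy data (invented for illustration): outcome values per category.
data = {
    "Asian":  [10.0, 12.0, 11.0, 13.0],   # reference category (no dummy in the model)
    "white":  [11.0, 12.0, 13.0, 12.0],
    "black":  [12.0, 11.0, 10.0, 13.0],
    "Indian": [18.0, 20.0, 19.0, 21.0],
}
reference = "Asian"
dummies = ["white", "black", "Indian"]

# Design matrix rows: [intercept, white, black, Indian]
X, y = [], []
for group, values in data.items():
    for v in values:
        X.append([1.0] + [1.0 if group == d else 0.0 for d in dummies])
        y.append(v)

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Ordinary least squares via the normal equations (X'X) beta = X'y
k = len(X[0])
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
coef = dict(zip(["intercept"] + dummies, solve(XtX, Xty)))

def mean(values):
    return sum(values) / len(values)

# Pooled residual variance: within-group squared deviations over (n - k)
sse = sum((v - mean(vals)) ** 2 for vals in data.values() for v in vals)
s2 = sse / (len(y) - k)

# t-statistic of each dummy: difference from the reference mean over its standard error
t_stats = {}
for d in dummies:
    se = math.sqrt(s2 * (1 / len(data[d]) + 1 / len(data[reference])))
    t_stats[d] = coef[d] / se

for d in dummies:
    print(f"{d} vs {reference}: coefficient = {coef[d]:.2f}, t = {t_stats[d]:.2f}")
```

With these invented numbers, only the Indian coefficient is large relative to its standard error, so only the Indian vs. Asian difference would be flagged as significant. The small white and black coefficients say only that no difference from Asian was detected. Note that nothing here compares white with black or Indian with white directly; that would require a different reference category or an explicit contrast.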
