Solved – Beta coefficient interpretation with categorical and continuous predictors in a linear regression

categorical data, regression coefficients

I am trying to run a linear regression with both categorical and continuous predictors. I have coded the categorical predictor (with three levels) into three dummy variables, and entered two of them into the regression along with the continuous predictors. My question is: how do I interpret the beta coefficients of the categorical predictors? I can interpret the beta coefficients of the continuous predictors as "_% of the variance in x can be explained by y", but I am not sure how to interpret the betas of the categorical predictors. The t statistics for both of my dummy variables were significant, but I'm having difficulty interpreting them in the context of the full regression. Any help would be appreciated!

Thank you!

Best Answer

Your interpretation of the continuous predictors in the regression model is mistaken. A regression coefficient is "the expected increase/decrease in the dependent variable for a one-unit change in the independent variable". It appears that you have confused this with the interpretation of the R² of the overall regression model, which is the proportion of variance in the dependent variable explained by the predictors together. The interpretation of dummy variables follows the same principle as for any other coefficient: it is the expected increase/decrease in the dependent variable for a change from 0 to 1 in the independent variable.

Imagine you have dummy coded a variable representing gender, and for the sake of this example let Male=0 and Female=1. Let's say the dependent variable is the time (in seconds) to complete a 100 m race. An unstandardized regression coefficient of +1.5 would mean that when the independent variable is 1 (=female), the expected time to run 100 m is 1.5 seconds longer than for males (the reference condition, male=0). Note that this relates to unstandardized regression coefficients; the logic wouldn't differ much for standardized regression coefficients, except that the difference is then expressed in standard deviation units.
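To make the 0-to-1 interpretation concrete, here is a minimal sketch in plain Python. The race times are made-up numbers chosen only for illustration, and `simple_ols` is a hypothetical helper, not part of any library. With a single dummy predictor, the fitted slope is the difference between the two group means, and the intercept is the mean of the group coded 0.

```python
from statistics import mean

# Hypothetical 100 m times in seconds (invented for illustration only).
male_times = [11.0, 11.5, 12.0, 12.5]    # dummy = 0 (reference group)
female_times = [12.5, 13.0, 13.5, 14.0]  # dummy = 1

# Stack both groups into one predictor (the dummy) and one outcome.
x = [0] * len(male_times) + [1] * len(female_times)
y = male_times + female_times


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = simple_ols(x, y)
mean_diff = mean(female_times) - mean(male_times)

# The slope on the dummy equals the female-minus-male mean difference,
# and the intercept equals the male (reference group) mean.
print(intercept, slope, mean_diff)  # 11.75 1.5 1.5
```

The slope of 1.5 here reads exactly as in the text: females are expected to take 1.5 seconds longer than males.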

In the context of a multiple regression, the interpretation of a dummy independent variable is no different from what I just described. The only change is that the regression coefficient should be interpreted as the expected difference from the reference category (the level without its own dummy) after controlling for the remaining independent variables in the model.
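The same idea can be sketched for the situation in the question: a three-level categorical predictor entered as two dummies alongside a continuous predictor. Everything below is an assumption made for illustration: the data are invented, the outcome is generated without noise from known coefficients so the fit recovers them, and `solve_linear_system` and `ols` are hypothetical helpers written in plain Python rather than calls to a statistics library.

```python
def solve_linear_system(a, b):
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented data: group has three levels (A is the reference), x is continuous.
groups = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x_cont = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 1.0, 3.0, 5.0]

# Noise-free outcome built from known values so the estimates are checkable.
group_effect = {"A": 0.0, "B": 2.0, "C": 5.0}
y = [10.0 + group_effect[g] + 0.5 * x for g, x in zip(groups, x_cont)]

# Design matrix columns: intercept, dummy for B, dummy for C, continuous x.
rows = [[1.0, float(g == "B"), float(g == "C"), x] for g, x in zip(groups, x_cont)]

b0, b_B, b_C, b_x = ols(rows, y)
print(b0, b_B, b_C, b_x)
```

In this sketch `b_B` and `b_C` are the expected differences between groups B and C and the reference group A at the same value of `x`, while `b_x` is the expected change in the outcome per one-unit change in `x` within any group. For real data a dedicated regression routine would be the more sensible choice; the hand-rolled solver is only there to keep the example self-contained.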
