Interaction Term – How to Interpret Interaction Terms in Regression

categorical data, interaction, interpretation, regression coefficients, self-study

I have a model:
$$
\ln({\rm earnings}) = a+b_1{\rm female}+b_2{\rm white}+b_3{\rm female}\times{\rm white}
$$
${\rm female}$ and ${\rm white}$ are dummy variables.

I have interpreted $b_1$ and $b_2$:

  • $b_1$ = difference in log earnings between females and males, among non-whites
  • $b_2$ = difference in log earnings between whites and non-whites, among males

But I am unable to interpret the coefficient of the interaction term ($b_3$). Please help me with this.

Let me make clearer what I need out of this regression:
$$
\ln({\rm earnings}) = 2.618656-.0899657{\rm female}+.382019{\rm white}-.2754126 {\rm female}\times{\rm white}
$$
Now I know from $b_1$ that there is a gender pay difference, and from $b_2$ that there is a race pay difference. With $b_3$, I need to know whether there is a gender pay gap for whites only. How can I figure that out from the regression above, without a test?

Best Answer

$b_3$ is the difference between the mean log earnings of white females and the sum $a+b_1+b_2$. That is, it is the difference between the white-female mean and the sum of three quantities: the non-white male mean, the difference between non-white females and non-white males, and the difference between white males and non-white males (with $\bar x$ denoting a group's mean log earnings):
\begin{align} b_3 = \bar x_\text{white female} - \big[\, &\bar x_\text{non-white male} \\ &+ (\bar x_\text{non-white female} - \bar x_\text{non-white male}) \\ &+ (\bar x_\text{white male} - \bar x_\text{non-white male}) \,\big] \end{align}
Honestly, it's a bit of a mess to interpret in this way. More typically, we interpret the test of $b_3$ as a test of the additivity of the effects of ${\rm white}$ and ${\rm female}$. (The expression within the square brackets $[]$ is the additive effect of ${\rm white}$ and ${\rm female}$.) Then we make more substantive interpretations only of simple effects (i.e., the effect of one factor within a pre-specified level of the other factor). People rarely try to interpret the interaction effect / coefficient in isolation.
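
To make the simple effects concrete, here is a minimal sketch that plugs the coefficients quoted in the question into the model and compares the gender gap within each race group. It assumes the model contains only the two dummies and their product, so the four predicted values coincide with the four group means; the function name `predicted_log_earnings` is just an illustrative helper, not something from the original post.

```python
import math

# Coefficients copied from the fitted equation in the question.
a, b1, b2, b3 = 2.618656, -0.0899657, 0.382019, -0.2754126


def predicted_log_earnings(female, white):
    """Predicted ln(earnings) for one cell of the 2x2 design (dummies are 0 or 1)."""
    return a + b1 * female + b2 * white + b3 * female * white


# Simple effects of gender within each level of race.
gap_nonwhite = predicted_log_earnings(1, 0) - predicted_log_earnings(0, 0)  # algebraically b1
gap_white = predicted_log_earnings(1, 1) - predicted_log_earnings(0, 1)     # algebraically b1 + b3

# The interaction coefficient is the difference between the two simple effects.
interaction = gap_white - gap_nonwhite  # algebraically b3

print(f"gender gap among non-whites (log points): {gap_nonwhite:.7f}")
print(f"gender gap among whites (log points):     {gap_white:.7f}")
print(f"difference between the two gaps:          {interaction:.7f}")
# Because the outcome is in logs, exponentiating a gap gives an earnings ratio.
print(f"female/male earnings ratio among whites:  {math.exp(gap_white):.3f}")
```

Read this way, the gender gap among whites is $b_1+b_3 \approx -0.365$ log points, against $b_1 \approx -0.090$ among non-whites, and $b_3$ is simply how much the two gaps differ. This is only arithmetic on the point estimates; it says nothing about whether either gap is statistically distinguishable from zero.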

It may also help you to read my answer here: Interpretation of betas when there are multiple categorical variables, which covers an analogous, but simpler, situation without the interaction.
