Regression – Can Interaction Term of Two Insignificant Coefficients be Significant?

correlation, interaction, linear model, regression

Let's say I have a linear regression with two numeric explanatory variables, A and B.

Consider the following scenarios:

  1. A and B are both insignificant
  2. A is significant, B is insignificant; or the other way around
  3. A and B are both significant

Now, my question:

In which of these scenarios is it possible that the interaction term A * B will be significant? (Or should we differentiate between "theoretically possible" and "likely" here?)

Best Answer

$A*B$ can be significant in all of these scenarios. Consider $A \in \{-1, 0, 1\}$ and $B \in \{-1, 1\}$ where the underlying model is $E[Y|A,B] = A*B$. In a roughly balanced situation, with (roughly) equal sample sizes for each combination of $A \times B$, neither $A$ nor $B$ will be significant (except for the $\alpha$ fraction of the time when a true null hypothesis is rejected), but the interaction term certainly will be! Here's a numeric example:

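# Balanced design: each (A, B) combination occurs 50 times
A <- rep(c(-1,0,1), 100)
B <- rep(c(-1,1), 150)
# True model: E[Y|A,B] = A*B, plus standard normal noise
X <- A*B
Y <- X + rnorm(300)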

> summary(lm(Y~A+B+A*B))

Call:
lm(formula = Y ~ A + B + A * B)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.03520 -0.59349 -0.03184  0.62857  2.49359 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.02083    0.05668  -0.367    0.714    
A           -0.03797    0.06942  -0.547    0.585    
B            0.05867    0.05668   1.035    0.301    
A:B          0.90789    0.06942  13.078   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.9818 on 296 degrees of freedom
Multiple R-squared: 0.3681, Adjusted R-squared: 0.3617 
F-statistic: 57.47 on 3 and 296 DF,  p-value: < 2.2e-16 

Or, more simply:

> cor(A,Y)
[1] -0.02527534
> cor(B,Y)
[1] 0.04782935
> cor(A*B,Y)
[1] 0.6042723

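It should be intuitively clear that if we can construct an example where $A$ and $B$ are both insignificant, yet the interaction is significant, we can do so for either of your other two cases: add a main effect to the underlying model, e.g. $E[Y|A,B] = A + A*B$ or $E[Y|A,B] = A + B + A*B$, and the corresponding coefficient(s) become significant while the interaction term stays in place.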
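The other two cases can be sketched by reusing the same balanced design and adding main effects to the true model. This is only an illustration: the seed, the unit effect sizes, and the helper name `p_values` are assumptions made here, not part of the original example.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

n <- 300
A <- rep(c(-1, 0, 1), length.out = n)  # same balanced design as above
B <- rep(c(-1, 1), length.out = n)
noise <- rnorm(n)

# Scenario 2: true main effect of A only, plus the interaction
Y2 <- A + A * B + noise

# Scenario 3: true main effects of both A and B, plus the interaction
Y3 <- A + B + A * B + noise

# Hypothetical helper: p-values of all coefficients in Y ~ A + B + A:B
p_values <- function(y) summary(lm(y ~ A * B))$coefficients[, "Pr(>|t|)"]

p2 <- p_values(Y2)
p3 <- p_values(Y3)

print(signif(p2, 3))
print(signif(p3, 3))
```

In scenario 2 the true coefficient of `B` is zero, so it should be flagged as significant only about an $\alpha$ fraction of the time, exactly as in the original example.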

As for "likely": one could argue that in real life, apart from physics and a few other disciplines, pretty much all interaction terms are very likely to be nonzero (albeit perhaps very small), and "significance" in its statistical sense is merely a function of sample size.
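The dependence on sample size can be sketched with a deliberately tiny interaction. Everything specific here is an assumption for illustration: the coefficient 0.05, standard normal $A$ and $B$, the sample sizes, the seed, and the helper name `interaction_p`.

```r
set.seed(2)  # assumed seed, only so the sketch is reproducible

# Hypothetical helper: simulate Y = beta * A * B + noise with n observations
# and return the p-value of the A:B coefficient.
interaction_p <- function(n, beta = 0.05) {
  A <- rnorm(n)
  B <- rnorm(n)
  Y <- beta * A * B + rnorm(n)
  summary(lm(Y ~ A * B))$coefficients["A:B", "Pr(>|t|)"]
}

sizes <- c(100, 2000, 200000)
p_by_n <- sapply(sizes, interaction_p)

print(data.frame(n = sizes, p_value = signif(p_by_n, 3)))
```

The standard error of the interaction estimate shrinks roughly like $1/\sqrt{n}$ in this setup, so the same small effect that is usually undetectable at $n = 100$ becomes overwhelmingly significant at the largest sample size.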
