If two simple OLS coefficients are positive, can they flip signs during multiple OLS?

linear-model, regression, self-study

I was asked this during an interview, and I'm curious if my thinking is correct.

Fit a simple linear regression separately on each of two features, $x_1$ and $x_2$. You get two coefficients $\beta_1$ and $\beta_2$, both greater than $1$. Now fit a regression on both features at the same time. Can either coefficient be negative?

My intuition is that yes, a coefficient's sign can flip if $x_1$ and $x_2$ are nearly collinear. OLS parameter estimates are unstable in that case because the normal equations require inverting the Gram matrix $\mathbf{X}^{\top} \mathbf{X}$, which becomes ill-conditioned (nearly singular) when the columns of $\mathbf{X}$ are nearly linearly dependent. (1) Am I correct, and (2) if so, is my analysis thorough? I'm not sure if there's anything else I should consider here, or a better way to explain why the coefficients can flip signs.

Best Answer

Yes, they can flip sign if the predictors are correlated. This can be argued mathematically, but a simulation is enough to demonstrate that it happens.

```
set.seed(0)
# Generate correlated covariates (correlation 0.99)
X = MASS::mvrnorm(100, c(0, 0), matrix(c(1, 0.99, 0.99, 1), nrow = 2))
# Use them to generate observations. Only the first column affects y
y = X %*% c(2, 0) + rnorm(100, 0, 0.4)

# Estimate 3 models: two with a single variable each, one with both
m1 = lm(y ~ X[, 1])
coef(m1)
## (Intercept)      X[, 1] 
##  0.02606534  2.03186570 

m2 = lm(y ~ X[, 2])
coef(m2)
## (Intercept)      X[, 2] 
##  0.04038971  1.96816682 

m = lm(y ~ X)
coef(m)
## (Intercept)          X1          X2 
##     0.02581     2.07047    -0.03831 
```
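The mathematical argument alluded to above can be made explicit. A sketch, assuming $y$, $x_1$, $x_2$ are standardized (so each simple OLS slope equals the corresponding correlation $r_{y1}$ or $r_{y2}$, and $r_{12}$ is the correlation between the predictors): the multiple-regression coefficients have the closed form

$$
\hat\beta_1 = \frac{r_{y1} - r_{12}\, r_{y2}}{1 - r_{12}^2},
\qquad
\hat\beta_2 = \frac{r_{y2} - r_{12}\, r_{y1}}{1 - r_{12}^2}.
$$

So $\hat\beta_2 < 0$ exactly when $r_{12}\, r_{y1} > r_{y2}$, which is perfectly compatible with both marginal correlations being positive. For example, $r_{y1} = 0.9$, $r_{y2} = 0.5$, $r_{12} = 0.8$ (a valid correlation matrix) gives

$$
\hat\beta_2 = \frac{0.5 - 0.8 \times 0.9}{1 - 0.8^2} = \frac{-0.22}{0.36} \approx -0.61,
$$

a negative coefficient even though both simple-regression slopes are positive.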