Solved – Main effects flip sign when adding interaction term in ordinal logistic regression

ordinal-data, regression

I am running an ordinal logistic regression.

Dependent Variable: policy score (0-3).

Independent Variables: all continuous scale (GDP, corruption perception, total number of mines)

All Independent Variables have a positive correlation with the dependent variable.

When I run the regression without interactions, all coefficients are positive.

When I add an interaction, the GDP*corruption term is positive, but the separate effects of GDP and corruption become negative.

Can someone explain why this is happening?

Also, there's no multicollinearity (VIF < 1).

Best Answer

First, as AlexK pointed out, this sign flip can happen because of how multiple regression works. Each coefficient in a multiple regression gives you the marginal effect of its variable with all other terms in the model held constant.

Let's look at interaction effects in a highly simplified 2x2 design. The principle extends to continuous variables, of course; it is just a lot easier to think about it this way. Imagine your data set looked like this, with the cell means in the body of the table and the marginal means around them.

                       GDP
                    Low   High   Mean
Corruption   Low     10      8      9
             High     8     20     14
             Mean     9     14

As you can see, both GDP and Corruption have a positive main effect: for each variable, the marginal mean rises from 9 in the "Low" group to 14 in the "High" group. The model would be:

y = b0 + b1*GDP + b2*Corr (with both IVs coded 0 for "Low" and 1 for "High").

Estimating coefficients would lead to both b1 and b2 being positive, due to the high mean when both are 1. The model has no choice but to estimate the coefficients in the way that fits the data best, and with this model it will estimate two positive coefficients. With only three coefficients, the four cell equations below cannot all hold exactly, so the fit is a compromise between them.

20 = b0 + b1*1 + b2*1
 8 = b0 + b1*0 + b2*1
 8 = b0 + b1*1 + b2*0
10 = b0 + b1*0 + b2*0
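To make the additive fit concrete, here is a minimal sketch using NumPy least squares. It assumes each cell mean counts as one equally weighted observation (i.e. a balanced design), which the original table does not state, and the variable names are only illustrative.

```python
import numpy as np

# Cell means from the 2x2 table, in the same order as the equations:
# (GDP, Corr) = (1, 1), (0, 1), (1, 0), (0, 0)
gdp = np.array([1.0, 0.0, 1.0, 0.0])
corr = np.array([1.0, 1.0, 0.0, 0.0])
y = np.array([20.0, 8.0, 8.0, 10.0])

# Design matrix for the additive model y = b0 + b1*GDP + b2*Corr
X = np.column_stack([np.ones_like(gdp), gdp, corr])

# Ordinary least squares: three coefficients cannot reproduce four cell
# means exactly, so this is the best compromise in the squared-error sense.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coef
residuals = y - X @ coef

print("b0, b1, b2:", np.round(coef, 3))      # approximately 6.5, 5.0, 5.0
print("residuals: ", np.round(residuals, 3))  # approximately 3.5, -3.5, -3.5, 3.5
```

Under that assumption both slopes come out positive (about 5 each), and the residuals of roughly ±3.5 show that the additive model misses every cell.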

Let's include the interaction term (GDP*Corr), which will actually be 0 in all but the first equation.

20 = b0 + b1*1 + b2*1 + b3*1*1
 8 = b0 + b1*0 + b2*1 + b3*0*1
 8 = b0 + b1*1 + b2*0 + b3*1*0
10 = b0 + b1*0 + b2*0 + b3*0*0

Now the model has a chance to account for the "diagonal" effect of both IVs being "High" together: the high mean of 20 can be picked up by b3. Solving the four equations gives b0 = 10, b1 = -2, b2 = -2 and b3 = 14. Once the interaction is in the model, b1 is no longer an overall effect of GDP; it is the effect of GDP when Corr = 0, and likewise b2 is the effect of Corr when GDP = 0. In this table both of those conditional effects are negative (10 drops to 8).
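The four interaction equations can be solved exactly, since there are four unknowns and four cell means. A small sketch, again treating the cell means as the data and using illustrative variable names:

```python
import numpy as np

# Same cell order as the equations: (GDP, Corr) = (1, 1), (0, 1), (1, 0), (0, 0)
gdp = np.array([1.0, 0.0, 1.0, 0.0])
corr = np.array([1.0, 1.0, 0.0, 0.0])
y = np.array([20.0, 8.0, 8.0, 10.0])

# Design matrix for y = b0 + b1*GDP + b2*Corr + b3*GDP*Corr
X_int = np.column_stack([np.ones_like(gdp), gdp, corr, gdp * corr])

# Four equations, four unknowns: the system has a unique exact solution.
coef_int = np.linalg.solve(X_int, y)
b0, b1, b2, b3 = coef_int

print("b0, b1, b2, b3:", np.round(coef_int, 3))  # approximately 10, -2, -2, 14
```

The interaction coefficient absorbs the jump to 20 in the High/High cell, and the two lower-order coefficients turn negative even though the data have not changed.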

I should add: the interpretation of something like this can become tricky very quickly (imagine three-way interactions). If you have a significant interaction, be very careful about how you interpret it and any remaining main effects!
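One way to keep the interpretation straight in the 2x2 example is to compute the effect of GDP separately at each level of corruption instead of reading b1 on its own. The sketch below uses the values that solve the four interaction equations (b0 = 10, b1 = -2, b2 = -2, b3 = 14); the helper name `gdp_effect` is hypothetical and only for illustration.

```python
# Coefficients that solve the four interaction equations of the 2x2 example
b0, b1, b2, b3 = 10.0, -2.0, -2.0, 14.0

def gdp_effect(corr_level: float) -> float:
    """Change in y when GDP goes from 0 to 1, at a given Corr level.

    From y = b0 + b1*GDP + b2*Corr + b3*GDP*Corr, the GDP effect
    is b1 + b3*Corr, so it depends on the level of Corr.
    """
    return b1 + b3 * corr_level

effect_low = gdp_effect(0.0)    # Corr = Low:  cell means go 10 -> 8
effect_high = gdp_effect(1.0)   # Corr = High: cell means go 8 -> 20
average_effect = (effect_low + effect_high) / 2

print(effect_low, effect_high, average_effect)  # -2.0 12.0 5.0
```

The negative b1 only describes the Low-corruption row. Averaged over both rows, the GDP effect is still +5, which matches the difference between the marginal means (14 versus 9).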
