R – Mediation vs Interaction in R Application

causality, mediation, r, regression

I am struggling to understand how/if the interaction is connected to mediation.
I understand that an interaction in a regression indicates that a variable Z influences the effect of a variable X on the outcome Y. I am also aware that X can influence Y through another variable, a mediator, which in this case we call M. This mediation can be either complete or partial, meaning that X can influence Y both directly and through M.

However, what I do not quite understand is how these two aspects are related.
I guess I am wrong in assuming that if X interacts with Z, I can decompose the influence of Z on the effect of X on Y using mediation analysis.

What I would like to do is identify all the variables (X and Z) that have a combined effect on Y. I then want to decompose the effect of X and Z on Y into:

  1. interaction (X*Z->Y)
  2. direct effect (X->Y)
  3. mediated effect (X->Z(M)->Y)

Are these simply different problems that require independent regression and mediation analyses, or is there a way to explore the above simultaneously? Is there an R package that does that?
Thank you very much!

Best Answer

Interaction and mediation are different things.

In mediation, we have a causal pathway where one variable causes the mediator and the mediator causes the outcome.

In interaction, we have a joint action, where two variables are associated with an outcome, but the "effect" of one variable depends on the value of the other variable.

Clearly these are different things. If we were to do a simple simulation, we might proceed as follows, in R.

First we simulate for an interaction:

set.seed(1)
X <- rnorm(500)
Z <- rnorm(500)
Y <- X + Z + X*Z + rnorm(500) 
lm(Y ~ X * Z)

And we find:

## Coefficients:
## (Intercept)            X            Z          X:Z  
##   -0.006785     0.967882     0.927355     0.973669 

as expected. In particular, we see that the interaction has an estimate close to 1.
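
If we want to check the interaction term more formally, we can store the fit in an object (here called fit, just for illustration) and look at its estimate, standard error and p-value:

fit <- lm(Y ~ X * Z)
summary(fit)$coefficients["X:Z", ]  # estimate, std. error, t value, p value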

Now, for mediation:

set.seed(1)
X <- rnorm(500)
M <- X + rnorm(500)
Y <- X + M + rnorm(500)

Now some care is needed. If we fit the model lm(Y ~ X + M) we obtain:

## Coefficients:
## (Intercept)            X            M  
##   -0.005709     1.043180     0.925210 

So, here the estimate for X, 1.04, is the direct effect of X on Y. The indirect (mediated) effect is the product of the X -> M coefficient and the M -> Y coefficient; since the true X -> M coefficient is 1 in this simulation, it is approximately 0.92. Typically in inference we would like the total effect, which should be close to 2 here (direct plus indirect), and we can obtain that with:

lm(Y ~ X)
## Coefficients:
## (Intercept)            X  
##    -0.04731      1.92853  

as expected.
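
Finally, regarding the question about an R package: one option is the mediation package, whose mediate() function takes a mediator model and an outcome model and reports the average causal mediation effect (ACME, the indirect effect), the average direct effect (ADE) and the total effect, with confidence intervals. A rough sketch, assuming the simulated X, M and Y from the mediation example above are still in the workspace:

# install.packages("mediation")  # if not already installed
library(mediation)
model.m <- lm(M ~ X)        # mediator model: X -> M
model.y <- lm(Y ~ X + M)    # outcome model: X and M -> Y
med.out <- mediate(model.m, model.y, treat = "X", mediator = "M", sims = 500)
summary(med.out)            # ACME, ADE and total effect

If I recall correctly, mediate() can also accommodate a treatment-mediator interaction included in the outcome model, but that is a separate question from the X*Z interaction simulated above.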