Mediation analysis with a log-transformed mediator

assumptions, linear, mediation, r, regression

The very basic framework for mediation analysis (as I understand it) is below (DV = dependent variable, IV = independent variable):

Step 1: DV ~ IV
Step 2: Mediator ~ IV
Step 3: DV ~ IV + Mediator – check if the effect of the IV is reduced or lost after controlling for the mediator

However, if the mediator has to be log-transformed in step 2 to improve normality of residuals, should it also be log-transformed in step 3 (marked with a question mark below)? One mentor has told me yes, since step 3 carries the same analysis through. If so, it would look like the steps below. In my case the DV also had to be log-transformed, so I've included that as well.

Step 1: log(DV) ~ IV
Step 2: log(Mediator) ~ IV
Step 3: log(DV) ~ IV + log(Mediator) ?

In the example above, the DV and Mediator were log-transformed in steps 1 and 2, respectively, to ensure normality of residuals in those models.
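For concreteness, the three log-transformed steps can be sketched as follows. This is only an illustration on simulated data, written in dependency-free Python rather than R; the coefficients, the variable names, and the small `ols` helper are all assumptions made up for the sketch, not the actual analysis.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients for design matrix X (list of rows),
    obtained by solving the normal equations X'X b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(k):
            if row != col:
                f = A[row][col] / A[col][col]
                A[row] = [u - f * v for u, v in zip(A[row], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (hypothetical): both mediator and DV are positive and
# roughly log-normal, so a log transform is plausible for each.
random.seed(1)
n = 500
iv = [random.gauss(0, 1) for _ in range(n)]
mediator = [math.exp(0.5 + 0.8 * x + random.gauss(0, 0.3)) for x in iv]
dv = [math.exp(1.0 + 0.2 * x + 0.6 * math.log(m) + random.gauss(0, 0.3))
      for x, m in zip(iv, mediator)]

log_m = [math.log(m) for m in mediator]
log_dv = [math.log(d) for d in dv]

# Step 1: log(DV) ~ IV  (total effect of the IV)
c_total = ols([[1.0, x] for x in iv], log_dv)[1]

# Step 2: log(Mediator) ~ IV
a = ols([[1.0, x] for x in iv], log_m)[1]

# Step 3: log(DV) ~ IV + log(Mediator), same transform as in step 2
_, c_direct, b = ols([[1.0, x, lm] for x, lm in zip(iv, log_m)], log_dv)

indirect = a * b
print("total:", c_total, "direct:", c_direct, "indirect (a*b):", indirect)
```

Because the mediator enters step 3 on the same scale it was modelled on in step 2, the total effect from step 1 equals the direct effect plus `a * b` for least-squares fits on the same sample, which is what makes the "is the IV effect reduced?" comparison in step 3 meaningful.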

Happy to provide specific variable names and R code, but the question is a general one and may not need it.

Best Answer

First, the process you have outlined is the Baron and Kenny approach. It isn't wrong, but it's quite old-fashioned, and more modern approaches are available. See, e.g., McGill University.

Second, I agree with Rhys. If you are going to transform, you have to do it in all steps. Otherwise, you have a mess. Maybe it won't violate any statistical "rules" but it will be very hard to interpret.
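To make the interpretation point concrete, here is a short derivation. It assumes linear models on the log scale with no IV-by-mediator interaction; the symbols (M for the mediator, Y for the DV, and the coefficient names) are notation introduced here, not taken from the question.

```latex
\begin{align}
\log M &= a_0 + a\,\mathrm{IV} + \varepsilon_M
  && \text{(step 2)} \\
\log Y &= c_0 + c'\,\mathrm{IV} + b\,\log M + \varepsilon_Y
  && \text{(step 3)} \\
\log Y &= (c_0 + b\,a_0) + (c' + a b)\,\mathrm{IV} + (b\,\varepsilon_M + \varepsilon_Y)
  && \text{(substituting step 2 into step 3)}
\end{align}
```

With the same transform in both steps, the total effect on the log scale splits cleanly into a direct part c' and an indirect part ab, and a one-unit increase in the IV multiplies the DV by exp(ab) through the mediator. If step 3 used the raw mediator instead, substituting step 2 would give a term b·exp(a_0 + a·IV + ε_M), which is nonlinear in the IV, so the product ab would no longer have this simple reading.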

Finally, why transform to get to normal residuals? Instead, use a different kind of regression that does not assume normal residuals. Two of these are robust regression and quantile regression. I would only transform if it made substantive sense (this is often the case with monetary variables). When possible, don't fit the data to the model, fit the model to the data.
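As a rough illustration of a regression that does not rely on normal residuals, here is a minimal sketch of the Theil-Sen estimator (the median of all pairwise slopes) applied to an untransformed step 2. Theil-Sen is swapped in here only because it is a simple robust estimator that fits in a few lines; it is not necessarily the robust method meant above, and the simulated data and variable names are made up.

```python
import random
from statistics import median

def theil_sen(x, y):
    """Theil-Sen fit of y on a single predictor x.
    Slope: median of the slopes over all pairs of points.
    Intercept: median of y - slope * x."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n)
              if x[j] != x[i]]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope

# A single gross outlier barely moves the fit.
print(theil_sen([0, 1, 2, 3, 4], [1, 3, 5, 7, 100]))

# Hypothetical step 2 on the raw scale: Mediator ~ IV with
# right-skewed (log-normal) errors instead of normal ones.
random.seed(7)
n = 300
iv = [random.gauss(0, 1) for _ in range(n)]
mediator = [2.0 + 1.5 * x + random.lognormvariate(0, 0.75) for x in iv]

intercept, slope = theil_sen(iv, mediator)
print("slope of Mediator ~ IV:", slope)
```

This version handles only one predictor, so it covers steps 1 and 2 but not step 3. In R, `MASS::rlm` (robust regression) and `quantreg::rq` (quantile regression) accept multiple predictors and would be the more usual tools.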