Should I re-center variables when looking at a moderator effect in men and women separately?

Tags: centering, group-differences, interaction, regression

I want to see if an interaction variable in a multiple regression is significant for the whole sample, and then just for men and just for women. When I created the interaction variable for the whole sample, I centered the interaction components by subtracting the mean for the whole sample.

Now, when I want to look at men and women separately, should I recalculate male-specific and female-specific centered and interaction variables, centering the interaction components on the respective male and female sample means?

Best Answer

Centring: Centring does not change the significance of the r-square change of your interaction effect. It also will not change the values you get for a simple slopes analysis.

Thus, for most purposes it does not matter whether you centre or not. This applies both to the general analysis, and to the subgroup analysis.
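A short derivation may help show why the claim above holds. This is only a sketch in my own notation: a and c stand for whatever constants you subtract from IV1 and IV2 (whole-sample means, male means, female means, or anything else), the gammas are the coefficients of the centred model and the betas those of the uncentred model.

```latex
\begin{align*}
y &= \gamma_0 + \gamma_1 (x_1 - a) + \gamma_2 (x_2 - c)
     + \gamma_3 (x_1 - a)(x_2 - c) + \varepsilon \\
  &= \underbrace{(\gamma_0 - \gamma_1 a - \gamma_2 c + \gamma_3 a c)}_{\beta_0}
     + \underbrace{(\gamma_1 - \gamma_3 c)}_{\beta_1}\, x_1
     + \underbrace{(\gamma_2 - \gamma_3 a)}_{\beta_2}\, x_2
     + \underbrace{\gamma_3}_{\beta_3}\, x_1 x_2 + \varepsilon
\end{align*}
```

The two parameterisations describe the same fitted values, so the r-square is identical, and the interaction coefficient is the same (beta_3 = gamma_3), so its test is identical too. The simple slope of x_1 at a given value v of x_2 is beta_1 + beta_3 v = gamma_1 + gamma_3 (v - c), which is why a simple slopes analysis is also unaffected. Only the intercept and the lower-order coefficients move.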

The main benefit of centring is that it can make the interpretation of the regression coefficients a little easier. If you want to compare the absolute size of these coefficients across males and females, then you should only centre once, using the whole-sample means, so that the lower-order coefficients refer to the same reference point in both groups.
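As a rough illustration of which coefficients move and which do not, here is a minimal sketch using the survey data from MASS. The variable names c1 and c2 and the choice of Wr.Hnd, NW.Hnd and Pulse as IV1, IV2 and DV are just assumptions for the example.

```r
library(MASS)  # provides the `survey` data set

d <- na.omit(survey)[, c("Sex", "Wr.Hnd", "NW.Hnd", "Pulse")]
names(d) <- c("gender", "iv1", "iv2", "dv")

# Centre once, on the whole-sample means (c1 and c2 are illustrative names)
d$c1 <- d$iv1 - mean(d$iv1)
d$c2 <- d$iv2 - mean(d$iv2)

raw      <- coef(lm(dv ~ iv1 * iv2, data = d))
centered <- coef(lm(dv ~ c1 * c2, data = d))

# The interaction coefficient is the same with or without centring
same_interaction <- isTRUE(all.equal(unname(raw["iv1:iv2"]),
                                     unname(centered["c1:c2"]),
                                     tolerance = 1e-6))

# The iv1 coefficient is the slope at iv2 = 0 in the raw model and the
# slope at iv2 = mean(iv2) in the centred model, so the two differ by
# (interaction coefficient) * mean(iv2)
slope_shift <- unname(centered["c1"] - raw["iv1"])

# Reusing the same centred variables in each subgroup keeps the c1 and c2
# coefficients tied to one common reference point for men and women
by_gender <- sapply(c("Female", "Male"), function(g) {
  coef(lm(dv ~ c1 * c2, data = d[d$gender == g, ]))
})

print(round(raw, 4))
print(round(centered, 4))
print(round(by_gender, 4))
```

If you centred separately within each gender instead, the c1 coefficient for men would be a slope evaluated at the male mean of iv2 and the one for women at the female mean, so the two would not be directly comparable.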

Prefer integrated models: A better approach is to include gender in your overall multiple regression. For example, if you have DV, IV1, IV2 and gender, and you are interested in the IV1 * IV2 interaction for each gender, I'd examine models such as:

DV ~ IV1 + IV2 + gender
DV ~ IV1 * IV2 + gender
DV ~ IV1 * IV2 + gender * IV1 + gender*IV2
DV ~ IV1 * IV2 * gender

If you get a significant gender by something interaction, then you may wish to further explore this using separate analyses, but I'd start with the overall integrated model.
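One way to work through that sequence is to fit the four models and compare them with a sequential F-test. This is only a sketch: it assumes the MASS survey data with Wr.Hnd, NW.Hnd and Pulse standing in for IV1, IV2 and DV, and the names m1 to m4 are arbitrary.

```r
library(MASS)  # provides the `survey` data set

d <- na.omit(survey)[, c("Sex", "Wr.Hnd", "NW.Hnd", "Pulse")]
names(d) <- c("gender", "iv1", "iv2", "dv")

# Nested models, from main effects only up to the full three-way interaction
m1 <- lm(dv ~ iv1 + iv2 + gender, data = d)
m2 <- lm(dv ~ iv1 * iv2 + gender, data = d)
m3 <- lm(dv ~ iv1 * iv2 + gender * iv1 + gender * iv2, data = d)
m4 <- lm(dv ~ iv1 * iv2 * gender, data = d)

# Each row tests the terms added relative to the previous model:
# row 2: iv1:iv2; row 3: the two gender-by-IV terms; row 4: iv1:iv2:gender
tab <- anova(m1, m2, m3, m4)
print(tab)
```

The last row is the one that speaks to the original question: it tests whether the IV1 by IV2 interaction differs between men and women.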

Illustrating points about centered predictors:

The following code returns the p-value of the r-square change and the final r-square for an uncentered version and three centred versions (global, female centred, male centred) of an interaction effect model.

library(MASS)
survey <- na.omit(survey)
head(survey)

# Keep four variables and give them generic names
x <- survey[, c('Sex', 'Wr.Hnd', 'NW.Hnd', 'Pulse')]
names(x) <- c('gender', 'iv1', 'iv2', 'dv')

# Centre on the whole-sample means (scale=FALSE: centre only, no rescaling)
x$scaled_iv1 <- scale(x$iv1, scale=FALSE)
x$scaled_iv2 <- scale(x$iv2, scale=FALSE)

# Centre on the female means
x$female_scaled_iv1 <- scale(x$iv1, center=mean(x[x$gender == "Female", 'iv1']), scale=FALSE)
x$female_scaled_iv2 <- scale(x$iv2, center=mean(x[x$gender == "Female", 'iv2']), scale=FALSE)

# Centre on the male means
x$male_scaled_iv1 <- scale(x$iv1, center=mean(x[x$gender == "Male", 'iv1']), scale=FALSE)
x$male_scaled_iv2 <- scale(x$iv2, center=mean(x[x$gender == "Male", 'iv2']), scale=FALSE)

compare_fits <- function(x) {
    # fit1: main effects only; fit2-fit5: add the interaction using
    # uncentered, globally centred, male-centred and female-centred predictors
    fit1 <- lm(dv ~ iv1+iv2, x)
    fit2 <- lm(dv ~ iv1*iv2, x)
    fit3 <- lm(dv ~ scaled_iv1*scaled_iv2, x)
    fit4 <- lm(dv ~ male_scaled_iv1*male_scaled_iv2, x)
    fit5 <- lm(dv ~ female_scaled_iv1*female_scaled_iv2, x)
    results <- list()
    # p-value of the r-square change (row 2, column 6 of the anova table is Pr(>F))
    results$p_normal <- anova(fit1, fit2)[2,6]
    results$p_centered <- anova(fit1, fit3)[2,6]
    results$p_centered_male <- anova(fit1, fit4)[2,6]
    results$p_centered_female <- anova(fit1, fit5)[2,6]
    # r-square of each model that includes the interaction
    results$rsq_normal <- summary(fit2)$r.squared
    results$rsq_centered <- summary(fit3)$r.squared
    results$rsq_centered_male <- summary(fit4)$r.squared
    results$rsq_centered_female <- summary(fit5)$r.squared
    unlist(results)
}

# The following results report p-values and rsq for final model
# using normal (i.e., uncentered) and centered predictors
compare_fits(x)
compare_fits(x[x$gender=='Male', ])
compare_fits(x[x$gender=='Female', ])

The results show that neither the p-values nor the r-square values vary across uncentered and centered analyses, whether in the whole sample or within each gender.

> compare_fits(x)
           p_normal          p_centered     p_centered_male   p_centered_female          rsq_normal 
        0.241816265         0.241816265         0.241816265         0.241816265         0.009982317 
       rsq_centered   rsq_centered_male rsq_centered_female 
        0.009982317         0.009982317         0.009982317 
> compare_fits(x[x$gender=='Male', ])
               p_normal          p_centered     p_centered_male   p_centered_female          rsq_normal 
             0.14034102          0.14034102          0.14034102          0.14034102          0.03055692 
           rsq_centered   rsq_centered_male rsq_centered_female 
             0.03055692          0.03055692          0.03055692 
> compare_fits(x[x$gender=='Female', ])
           p_normal          p_centered     p_centered_male   p_centered_female          rsq_normal 
          0.5196788           0.5196788           0.5196788           0.5196788           0.0128802 
       rsq_centered   rsq_centered_male rsq_centered_female 
          0.0128802           0.0128802           0.0128802 