I have a large data set where I relate the response variable to multiple explanatory variables; since I have different areas I have also included a random factor.
The response variable is binomial and therefore I use the glmer
function from the lme4
package.
The explanatory variables have different scales and to be able to compare the estimates I wanted to standardize the estimates.
For that I use a standardisation method that has been developed by Gelman (2007), which is available in the arm
package.
Another method would be fine as well, however I use this for a different model, and I would like to use the same method to standardize my data.
However if I use this method, I get different $p$-values:
# without standardized data:
model1 <- glmer(bembryo ~ (s_edlength + s_bplength + s_tide)^2 + (1|Areasite), family=binomial(link = "logit"), nAGQ = 1, data=data)
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.81791 2.86350 -0.635 0.5255
s_edlength 12.33513 5.52290 2.233 0.0255 *
s_bplength -8.77016 4.74700 -1.847 0.0647 .
s_tide 1.54429 1.38453 1.115 0.2647
s_edlength:s_bplength -0.01579 0.14525 -0.109 0.9134
s_edlength:s_tide -4.77805 2.23256 -2.140 0.0323 *
s_bplength:s_tide 3.47744 1.89254 1.837 0.0661 .
# With standardized data:
model.full.stan <- standardize(model1)
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 3.1441 0.7192 4.372 1.23e-05 ***
z.s_edlength 5.9579 2.4137 2.468 0.0136 *
z.s_bplength -4.0340 2.1221 -1.901 0.0573 .
z.s_tide -1.3594 1.1632 -1.169 0.2425
z.s_edlength:z.s_bplength -0.1263 1.2467 -0.101 0.9193
z.s_edlength:z.s_tide -10.4140 4.9042 -2.123 0.0337 *
z.s_bplength:z.s_tide 7.9670 4.3625 1.826 0.0678 .
I am not really sure why this is happening.
I checked if it depends on the standardization method I use.
However, if I just use the function rescale
to scale my explanatory variables I also get different $p$-values.
I do not get different $p$-values when there is only one explanatory variables left, however that is not really helpful.
This same problem occurs when I use a lme
function from the nlme
package.
Although for this function the method of Gelman (2007) is not possible, I also get different $p$-values compared to the non-standardized model.
I am not sure why this is happening and I really would like to use standardized estimates, so I would hope that someone has a idea why this is happening.
Best Answer
The phenomenon you're seeing is not specific to
glmer
or mixed models. It is a consequence of (1) centering as well as scaling your input variables; (2) including interactions in your model. If you only scale, and don't center your variables (e.g. by usingscale(.,center=FALSE)
), or if you drop the interactions from the model, then you should see the magnitudes of your coefficients change, but the $Z$-statistics and $p$-values should remain identical. If you didn't have interactions in the model, then your estimated slopes would represent the marginal change in the response per unit of the predictor; because you have interactions, your estimated slopes are the change in the response per unit of the predictor at the zero value of the other variables included in the interaction; this makes the estimates sensitive to centering the other input variables.