Solved – Testing the difference between two parameter estimates in binomial GLM

Tags: binomial-distribution, generalized-linear-model, r, standard-error

There are some related posts on this issue, but none that I could find actually demonstrate the mechanics of how to accomplish the task. I want to compare two parameter estimates in a binomial GLM (but I expect the answer to this question will work for any GLM).

My model is of the form:

y ~ b0 + b1x1 + b2x2 + e(binom)

My question: How do you test for a difference between b1 and b2?

Running this model using glm() in R provides parameter estimates and standard errors for all b. So comparing them should be easy, but I'm just having a hard time figuring out exactly what to do.

Here are some example data to work with:

set.seed(1987)
y <- rbinom(n = 100, size = 1, prob = 0.5) # binomial response variable
set.seed(1988)
x1a <- rnorm(n = 100, mean = 2, sd = 3)
set.seed(1988)
x1b <- rnorm(n = 100, mean = 0, sd = 3)
x1 <- ifelse(y == 0, x1a, x1b) # negative relationship between y and x1
set.seed(1990)
x2a <- rnorm(n = 100, mean = 2, sd = 5)
set.seed(1990)
x2b <- rnorm(n = 100, mean = 15, sd = 5)
x2 <- ifelse(y == 0, x2a, x2b) # a strong, positive relationship between y and x2

Run this model and return relevant output:

an1 <- glm(y ~ x1 + x2, family = binomial)
s_an1 <- summary(an1)

s_an1$coefficients
paste("residual df = ", an1$df.residual)
paste("null df = ", an1$df.null)

And get these estimates:

              Estimate Std. Error   z value     Pr(>|z|)
(Intercept) -4.2711210  1.0451565 -4.086585 4.377686e-05
x1          -0.1638442  0.1433595 -1.142891 2.530841e-01
x2           0.6130596  0.1378431  4.447519 8.686793e-06

residual df = 97
null df = 99

Notice that x2 is a significant, positive predictor of y, and x1 is a non-significant, negative predictor of y. As we all know, however, this is not good evidence that the parameter estimates are different.

(How) Can I use the standard errors to compare the estimates?

I'll point out that there appears to be a lot of information on the web about how to calculate confidence intervals, which are nice, but I want:

1) a parameter estimate for the difference (yes, I know — it's just the difference between them),

2) a standard error (probably sqrt(se.x1^2 + se.x2^2)),

3) a t-value (or equivalent distribution statistic), and

4) a p-value.

In my real-world scenario, x1 and x2 are measured on the same scale, but they don't have the same means or standard deviations (as in the example data above; see the sd arguments to the random number generators). Is standardization of some kind necessary in this case?

Help much appreciated, cheers!

Best Answer

You can simulate parameter estimates using the mvtnorm package with mean vector coef(an1) and covariance matrix vcov(an1) and then summarise them. Or you could bootstrap.
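
For example, a minimal sketch of the simulation route (the 10,000 draws and the seed are arbitrary choices here; an1 is the fitted model from the question):

library(mvtnorm)

# Draw parameter vectors from the approximate sampling distribution of the
# estimates: multivariate normal with mean coef(an1) and covariance vcov(an1).
set.seed(1)
sims <- rmvnorm(10000, mean = coef(an1), sigma = vcov(an1))
colnames(sims) <- names(coef(an1))

diffs <- sims[, "x1"] - sims[, "x2"]  # simulated values of b1 - b2
mean(diffs)                           # point estimate of the difference
sd(diffs)                             # approximate standard error
quantile(diffs, c(0.025, 0.975))      # 95% interval for b1 - b2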

However, it's probably easier just to use the multcomp package to examine the contrast. In your model that would be:

library(multcomp)
cont <- glht(an1, linfct="x1 - x2 = 0")
summary(cont) ## estimate, standard error, z-statistic and p-value
confint(cont) ## confidence interval
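
For completeness, the four quantities the question asks for can also be computed by hand from coef() and vcov(); note that the standard error of the difference needs the covariance between the two estimates, not just sqrt(se.x1^2 + se.x2^2). A sketch that should agree with the glht() output above:

b <- coef(an1)
V <- vcov(an1)

est <- b[["x1"]] - b[["x2"]]                                    # 1) estimate of the difference
se  <- sqrt(V["x1", "x1"] + V["x2", "x2"] - 2 * V["x1", "x2"])  # 2) standard error of the difference
z   <- est / se                                                 # 3) Wald z-statistic
p   <- 2 * pnorm(-abs(z))                                       # 4) two-sided p-value
c(estimate = est, se = se, z = z, p = p)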