Solved – GLM with a Poisson distribution, how to conduct a post-hoc test

biostatisticsgeneralized linear modelmodeling

I have a dataset in which I compare the total number (the summed number of birds over a couple of weeks) of birds across different distances (continuous factor). I hypothesize that the numbers of birds is the highest at the first distance and declines. The model I used was a GLM.

I now have a significant effect of the Distance, which is what I was expecting. But what can I use to see which of the distances differ from each other? I've tried this but that didn't work.

post1 <- glht(m5, family = poisson())
summary(m5, test = adjusted())

The emmeans package does not work.

This question can be deleted, the answer did not work and I did not conduct a post-hoc.

Best Answer

You want the emmeans package. It is remarkably broad and effective. Below, I simulate data that might look like yours. Then I calculate all pairwise comparisons and then just the contrast of interest that you are interested in.

# generate fake data that might look like what you want ------------------------
set.seed(1839) # for replicability
n <- 400 # sample size
Distance <- factor(sample(1:5, n, TRUE)) # make distance from 1 to 5 and factor
# make rain and method covariates that are unrelated to outcome: 
Rain <- rbeta(400, 4, 2)
Method <- rbinom(n, 1, .4)
# calculate number that distance 1 is higher than the other 4:
Number <- rpois(n, ifelse(Distance == "1", 4, 2))

# fit model --------------------------------------------------------------------
model <- glm(Number ~ Distance + Rain + Method, family = "poisson")

# use emmeans ------------------------------------------------------------------
library(emmeans)
# all post-hoc comparisons:
pairs(emmeans(model, ~ Distance))

# just the contrast you wanted:
contrast(emmeans(model, ~ Distance), method = "trt.vs.ctrl", ref = 1)
# see ?contrast-methods for more information on types of methods of contrasts
# as well as how to specify the reference (that is, it should be the level value
# and not the label)

There are tons of great vignettes to help learn emmeans, e.g.:

Related Question