Solved – Post hoc test of an interaction in a binomial GLMM with proportions

binomial-distribution, glmm, post-hoc

I've fitted a binomial GLMM, because my data are proportions of a species in samples of roughly 100 individuals.
I test the interaction of two factors and use car::Anova to get the p-values.
My random factor is the ID of the field (subject) that was sampled. I sampled 12 fields from 4 different classes (factor1), so I think the random factor has 12 levels. I use the random factor to correct for repeated measurements: each field was measured only twice, and these two timepoints are the levels of factor2.

my model:

binomial.glmm <- glmer(cbind(species1, not_species1) ~ factor1 * factor2 + (1 | field),
                       family = binomial(link = "logit"), data = data)

On the one hand, I'm using the glht() function from the multcomp package to perform a post hoc Tukey test with Bonferroni adjustment (all pairwise comparisons).
On the other hand, I additionally plot fitted values with confidence intervals.
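
For reference, that post hoc step looks roughly like this. This is a sketch, not my exact call: cell, species1 and not_species1 are placeholder names, and the interaction is recoded as a single factor so that glht can compare all cell means pairwise:

```r
library(lme4)
library(multcomp)

# Recode the interaction as a single factor so all cell means can be compared
data$cell <- interaction(data$factor1, data$factor2)
cell.glmm <- glmer(cbind(species1, not_species1) ~ cell + (1 | field),
                   family = binomial(link = "logit"), data = data)

# All pairwise ("Tukey") contrasts, with Bonferroni-adjusted p-values
summary(glht(cell.glmm, linfct = mcp(cell = "Tukey")),
        test = adjusted("bonferroni"))
```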

My problem is that
A: the a, b, c, … letters from the post hoc test do not make sense in my opinion, and
B: the confidence intervals are incredibly large.

Now I'm wondering whether the number of true replicates (3) in each class is too small. Could that be the reason? Am I simply unable to test for the interaction effect?

I get the confidence intervals with the following R code:

testdata <- expand.grid(factor1 = unique(data$factor1),
                        factor2 = unique(data$factor2))

X <- model.matrix(~ factor1 * factor2, data = testdata)
testdata$fit <- drop(X %*% fixef(binomial.glmm))   # drop() turns the 1-column matrix into a vector
testdata$SE  <- sqrt(diag(X %*% vcov(binomial.glmm) %*% t(X)))
testdata$upr <- testdata$fit + 1.96 * testdata$SE
testdata$lwr <- testdata$fit - 1.96 * testdata$SE

Then I plot fit, upr and lwr with ggplot, after back-transforming with plogis() (the inverse logit; exp() alone would give odds rather than proportions):

ggplot(testdata, aes(x = factor1, y = plogis(fit))) +
  # geom_bar(stat = "identity", position = position_dodge(1),
  #          col = "#454545", size = 0.15, fill = "grey") +
  geom_point(aes(x = as.numeric(factor1) + 0.3), pch = 23, bg = "aquamarine2") +
  geom_errorbar(aes(x = as.numeric(factor1) + 0.3,
                    ymin = plogis(lwr), ymax = plogis(upr)),
                position = position_dodge(1), col = "black",
                width = 0.15, size = 0.15) +
  geom_boxplot(aes(y = response), data = data) +
  facet_grid(. ~ factor2) +
  geom_hline(yintercept = 1, size = 0.15) +
  ylab("Species1?") +
  xlab("Factor1") +
  scale_x_discrete(labels = c("A", "B", "C", "D"))

Here are the plots:

Results from the post hoc Tukey test (unfortunately I wasn't able to make a prettier plot). The letters can't be right, can they?

Prediction plots. The blue points show the fit with confidence intervals; the boxplots come from the raw data.


Interestingly, the prediction plots look much better when I use an LMM with arcsine-transformed fractions of species1 as the response. However, everybody argues that this would not be good practice…

The residual plots used to validate the models are somewhat misbehaved, even though I included an observation-level random term (1|ID) with ID = 1:nrow(data).
The problem is that if this does not improve the model, nothing will, and as far as I can tell only a single package (nparLD) can perform non-parametric tests with interactions on repeated-measures data. I'm afraid that procedure would be too difficult to explain in a materials and methods section.
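
For what it's worth, the observation-level random effect mentioned above can be written down like this (a sketch built on the same model; the ID column is simply one factor level per row):

```r
library(lme4)

# Observation-level random effect: one ID level per row, intended to
# absorb overdispersion beyond the binomial variance
data$ID <- factor(seq_len(nrow(data)))
olre.glmm <- glmer(cbind(species1, not_species1) ~ factor1 * factor2 +
                     (1 | field) + (1 | ID),
                   family = binomial(link = "logit"), data = data)
```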

Best Answer

You might try the lsmeans package, as it makes some of this easier and clearer. To get the results on the back-transformed (response) scale, include the option type = "response":

require(lsmeans)
lsm <- lsmeans(binomial.glmm, ~ factor1 * factor2)
summary(lsm, type = "response")

To see the results graphically, do

plot(lsm, by = "factor2", intervals = TRUE, type = "response")

or, for an interaction-plot style,

lsmip(lsm, factor1 ~ factor2, type = "response")
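
If you also want letter groupings like the a, b, c display in the question, lsmeans supports a compact letter display via the cld() generic from multcomp (argument names here follow the lsmeans documentation; treat this as a sketch):

```r
# Cells sharing a letter are not significantly different
# under Tukey-adjusted pairwise comparisons
cld(lsm, by = "factor2", type = "response", alpha = 0.05)
```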

In a mixed model like this, it is often the case that the SEs of these least-squares means (AKA predictions) will be much larger than those of some or all of the pairwise differences, because the between-subjects variations cancel out in those comparisons. So the displayed CIs can be very misleading for comparing the predictions (and you shouldn't use CIs to do comparisons in any case).

To get the Tukey-adjusted comparisons, do

summary(pairs(lsm), type = "response")

(This actually computes the differences on the logit scale and then back-transforms, so the results are odds ratios. If you want differences of proportions instead, use pairs(regrid(lsm)).)
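
A sketch of that alternative:

```r
# Differences of proportions rather than odds ratios: regrid() re-expresses
# the reference grid on the response scale before the pairwise contrasts
summary(pairs(regrid(lsm)))
```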
