Say I have a binary dependent variable (Choice) being either 0 or 1, and people answer this DV multiple times. I also evenly split people in two groups (Group, group A vs group B).
I simulate data so that I know that there's an overall 50% probability to choose 1 in group A, but only an overall 5% probability to choose 1 in group B. The average sample probability to choose 1 (both groups combined) is around 27.5%.
I then run a generic generalized linear mixed model with a binomial distribution: Choice ~ Group + (1 | participant).
If I rely on dummy coding, the estimate for the intercept makes sense to me. That is, if I put A = 0 and B = 1, thus choosing A as the reference group, the intercept can be transformed into the overall probability of choosing 1 in group A. This checks out, as I obtain ~50%. The same goes when I code A = 1 and B = 0, thus choosing B as the reference group: I get a 5% probability of choosing 1 in group B.
However, when I rely on contrast coding, I get lost. If I code A = -1/2 and B = 1/2, I expect to observe the average probability (as 0 is the value halfway between the two groups), thus around 27.5%. But when I transform the intercept of this contrast-coded model into a probability, I obtain 18%. Why is that?
Best Answer
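With contrast coding, the intercept is the average of the group effects on the logit scale, not the average of the group probabilities. Because the inverse-logit transformation is nonlinear, back-transforming the average of the two logits does not give the average of the two probabilities.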
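To make the arithmetic concrete, here is a minimal sketch in Python using the probabilities from the question (50% and 5%). It assumes the fixed effects alone and ignores the participant random intercept, so it only illustrates why the back-transformed intercept lands near 18% rather than 27.5%. The helper names `logit` and `inv_logit` are my own.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1 / (1 + math.exp(-x))

# Group probabilities taken from the question.
p_a, p_b = 0.50, 0.05

# With A = -1/2 and B = +1/2, the intercept is the midpoint of the
# two group logits (fixed effects only; random intercept ignored).
intercept = (logit(p_a) + logit(p_b)) / 2
prob_from_intercept = inv_logit(intercept)

# The quantity the question expected: the plain average of probabilities.
mean_prob = (p_a + p_b) / 2

print(f"logit(A) = {logit(p_a):.3f}, logit(B) = {logit(p_b):.3f}")
print(f"intercept (mean logit) = {intercept:.3f}")
print(f"inv_logit(intercept)   = {prob_from_intercept:.3f}")
print(f"mean of probabilities  = {mean_prob:.3f}")
```

The mean logit is about -1.47, which back-transforms to roughly 0.187, in line with the 18% reported in the question.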
You mention that the sampling is balanced. This information is irrelevant to your question: the expected value of the intercept changes with the coding/parametrization but not with the sampling proportions. The number of participants in each group determines how efficiently we can estimate the group parameters: the fewer the participants, the larger the standard errors.
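The point about sampling proportions can be checked with a small simulation. The sketch below is a simplification, not the mixed model from the question: it drops the participant random intercept and uses the fact that, in a plain two-group logistic regression, the fitted group probabilities equal the sample proportions, so the contrast-coded intercept is the mean of the two sample logits. The function name, group sizes, and seed are hypothetical choices for illustration.

```python
import math
import random

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def intercept_probability(n_a, n_b, p_a=0.50, p_b=0.05, seed=1):
    """Back-transformed intercept under A = -1/2, B = +1/2 coding.

    Simplified setting: independent observations, no random effects,
    so the fitted group probabilities are the sample proportions.
    """
    rng = random.Random(seed)
    phat_a = sum(rng.random() < p_a for _ in range(n_a)) / n_a
    phat_b = sum(rng.random() < p_b for _ in range(n_b)) / n_b
    intercept = (logit(phat_a) + logit(phat_b)) / 2
    return inv_logit(intercept)

# Balanced design versus a heavily unbalanced one.
balanced = intercept_probability(30000, 30000)
unbalanced = intercept_probability(50000, 5000)

print(f"balanced:   {balanced:.3f}")
print(f"unbalanced: {unbalanced:.3f}")
```

Both designs should give a value close to 0.187. The unbalanced design only makes the estimate for the smaller group noisier; it does not move the target of the intercept.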