Logistic – Intercept of Logistic Regression with Contrast Coding for Better Model Interpretation

contrasts, intercept, logistic

Say I have a binary dependent variable (Choice) that is either 0 or 1, and people answer this DV multiple times. I also split people evenly into two groups (Group, group A vs group B).

I simulate data so that I know that there's an overall 50% probability to choose 1 in group A, but only an overall 5% probability to choose 1 in group B. The average sample probability to choose 1 (both groups combined) is around 27.5%.
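The setup described above could be simulated roughly as follows. This is only a sketch: the number of participants, the number of answers per participant, and the absence of participant-level variation are assumptions, since the question does not give them.

```r
# Minimal simulation sketch (sizes are assumed, not taken from the question)
set.seed(1)
n_per_group <- 100   # participants per group (assumed)
n_trials    <- 50    # answers per participant (assumed)

d <- data.frame(
  participant = rep(seq_len(2 * n_per_group), each = n_trials),
  Group       = rep(c("A", "B"), each = n_per_group * n_trials)
)

# True probability of choosing 1: 50% in group A, 5% in group B
p_true   <- ifelse(d$Group == "A", 0.50, 0.05)
d$Choice <- rbinom(nrow(d), size = 1, prob = p_true)

# Sample proportion of 1s per group, then both groups combined
group_means <- tapply(d$Choice, d$Group, mean)
pooled_mean <- mean(d$Choice)
group_means
pooled_mean
```

With equal group sizes, the pooled proportion should land close to (0.50 + 0.05) / 2 = 0.275.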

I then run a generic generalized linear mixed model with a binomial distribution: Choice ~ Group + (1 | participant).

If I rely on dummy coding, the estimate for the intercept makes sense to me. That is, if I code A = 0 and B = 1, thus choosing A as the reference group, the intercept can be transformed into the overall probability of choosing 1 in group A. This checks out, as I obtain ~50%. The same goes when I code A = 1 and B = 0, thus choosing B as the reference group: I get a 5% probability of choosing 1 in group B.

However, when I rely on contrast coding, I get lost. If I code A = -1/2 and B = 1/2, I expect the intercept to give the average probability (as 0 lies midway between the two groups), thus around 27.5%. But when I transform the intercept of this contrast-coded model into a probability, I obtain 18%. Why is that?
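The three codings can be compared in a small sketch. It is a simplification of the model in the question: it uses a plain `glm` without the participant random intercept, and aggregated counts (1000 answers per group, an assumed size) chosen so that the group proportions are exactly 50% and 5%.

```r
# Aggregated counts matching the stated probabilities exactly (assumed sizes)
d <- data.frame(
  Group = c("A", "B"),
  ones  = c(500, 50),    # number of 1s: 50% and 5% of 1000 answers
  zeros = c(500, 950)
)

d$dummy_A_ref <- c(0, 1)        # A = 0, B = 1 (A is the reference)
d$dummy_B_ref <- c(1, 0)        # A = 1, B = 0 (B is the reference)
d$contrast    <- c(-0.5, 0.5)   # A = -1/2, B = 1/2

# Simplification: ordinary logistic regression, no random effects
fit_A_ref    <- glm(cbind(ones, zeros) ~ dummy_A_ref, family = binomial, data = d)
fit_B_ref    <- glm(cbind(ones, zeros) ~ dummy_B_ref, family = binomial, data = d)
fit_contrast <- glm(cbind(ones, zeros) ~ contrast,    family = binomial, data = d)

# Back-transform each intercept to the probability scale
p_A_ref    <- plogis(coef(fit_A_ref)[["(Intercept)"]])     # about 0.50
p_B_ref    <- plogis(coef(fit_B_ref)[["(Intercept)"]])     # about 0.05
p_contrast <- plogis(coef(fit_contrast)[["(Intercept)"]])  # about 0.187, not 0.275
c(p_A_ref, p_B_ref, p_contrast)
```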

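With contrast coding, the intercept is the average of the two groups' log-odds (that is, an average taken on the logit scale), not the average of the group probabilities. Because the inverse logit is nonlinear, back-transforming an average of log-odds does not give the average of the probabilities.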

# Use logit and inv_logit for clarity
logit <- function(x) qlogis(x)
inv_logit <- function(x) plogis(x)

# (Expected) Intercept
(logit(0.5) + logit(0.05)) / 2
#> [1] -1.472219

# Probability for the "average" group
inv_logit(-1.472219)
#> [1] 0.1866056
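
If the 27.5% figure is what you are after, one way to get it is to back-transform each group separately and average afterwards. The sketch below works from the expected coefficients rather than a fitted model; the names `b0` and `b1` are illustrative and stand for the intercept and the Group slope under the A = -1/2, B = 1/2 coding.

```r
# Log-odds implied by the probabilities in the question
logit_A <- qlogis(0.50)
logit_B <- qlogis(0.05)

# Expected coefficients under A = -1/2, B = 1/2
b0 <- (logit_A + logit_B) / 2   # intercept: mean of the two log-odds
b1 <- logit_B - logit_A         # slope: difference B - A in log-odds

# Back-transform each group first, then average on the probability scale
p_A    <- plogis(b0 - b1 / 2)   # group A sits at -1/2
p_B    <- plogis(b0 + b1 / 2)   # group B sits at +1/2
mean_p <- (p_A + p_B) / 2
c(p_A = p_A, p_B = p_B, mean_p = mean_p)
```

Averaging after the back-transformation recovers 0.275, whereas back-transforming the intercept alone gives about 0.187.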


You mention that the sampling is balanced. This is irrelevant to your question: the expected value of the intercept depends on the coding (parametrization), not on the sampling proportions. The number of participants in each group only determines how precisely each group's parameter is estimated: fewer participants, larger standard errors.
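A quick way to check this is to keep the group proportions fixed and change only the group sizes. The sketch below again uses a plain `glm` on aggregated counts instead of the mixed model, and the counts themselves are assumptions chosen to give exactly 50% and 5%.

```r
# Same group proportions (50% and 5%), different group sizes (assumed counts)
balanced   <- data.frame(x = c(-0.5, 0.5), ones = c(500, 50), zeros = c(500, 950))
unbalanced <- data.frame(x = c(-0.5, 0.5), ones = c(500, 10), zeros = c(500, 190))

fit_bal   <- glm(cbind(ones, zeros) ~ x, family = binomial, data = balanced)
fit_unbal <- glm(cbind(ones, zeros) ~ x, family = binomial, data = unbalanced)

# The intercept estimate is the same in both fits ...
int_bal   <- coef(fit_bal)[["(Intercept)"]]
int_unbal <- coef(fit_unbal)[["(Intercept)"]]

# ... but its standard error grows when group B has fewer observations
se_bal   <- sqrt(diag(vcov(fit_bal)))[["(Intercept)"]]
se_unbal <- sqrt(diag(vcov(fit_unbal)))[["(Intercept)"]]

c(int_bal = int_bal, int_unbal = int_unbal, se_bal = se_bal, se_unbal = se_unbal)
```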