R – Understanding 0 Intercept in Logistic Regression in R

Tags: intercept, interpretation, logistic, r, regression

I'm exploring the effects of removing the intercept in a logistic regression model.

Assume a model:

$$\operatorname{logit}\,P(Y = 1) = \beta_1 x + \beta_2 z + 0$$

with $x$ and $z$ being categorical variables with 2 levels each and no intercept.

I understood that, with categorical predictors and no intercept, the coefficients compare $P(Y = 1)$ in each level of the two predictors against a null case where $P(Y=1) = 0.5$, i.e. $\operatorname{logit}\,P(Y=1) = 0$.
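This interpretation can be checked directly for a single two-level factor: with no intercept, each coefficient is just the empirical logit of the proportion of successes within that level. A quick sketch on simulated data (the seed is my own choice, not from the original sample):

```r
set.seed(1)  # reproducible simulated data, not the original sample
y <- as.factor(sample(1:2, 30, replace = TRUE))
x <- as.factor(sample(1:2, 30, replace = TRUE))

# One two-level factor, no intercept: one coefficient per level of x
coef(glm(y ~ x - 1, family = binomial))

# Each coefficient equals the empirical logit of P(y = 2) within that level
# (glm treats the second factor level, "2", as the success)
tapply(y == "2", x, function(s) qlogis(mean(s)))
```

The two printed vectors agree because this one-factor model is saturated: the fitted probabilities are exactly the observed group proportions.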

I noticed a phenomenon that I can't understand. Using the glm() function in R, if you change the order of the variables on the right-hand side of the formula, the coefficients change too. But even more oddly, the coefficient of whichever variable comes first is always the same.

Here's an R demo:

y <- as.factor(sample(1:2, 30, replace = TRUE))
x <- as.factor(sample(1:2, 30, replace = TRUE))
z <- as.factor(sample(1:2, 30, replace = TRUE))

coef(glm(y ~ x + z - 1, binomial))
#        x1         x2         z2 
#-0.1764783  0.3260739 -0.1335192

coef(glm(y ~ z + x - 1, binomial))
#        z1         z2         x2 
#-0.1764783 -0.3099976  0.5025523 

As you can see, whichever predictor comes first gets the same coefficient (-0.1764783), while the remaining coefficients differ between the two models.

Here is what I expected, and how the actual behavior differs:

  1. Since every level of the two predictors is compared to the same null case, I expected the coefficients to be the same in the two models, regardless of the order in which I list the predictors.
  2. I expected to see a coefficient for every level of every predictor; instead, the coefficient for level 1 of the second predictor is not shown.
  3. I therefore assume that only the first variable is compared against the null case, while the second is compared against a reference level; but what is this level? Is it $P(Y = 1 \mid X = 1 \cap Z = 1)$? Refitting one of the models WITH the intercept gives:

    coef(glm(y ~ x + z - 1, binomial))
    #        x1         x2         z2 
    #-0.1764783  0.3260739 -0.1335192
    
    coef(glm(y ~ x + z, binomial))
    #(Intercept)         x2          z2 
    #-0.1764783   0.5025523  -0.1335192
    

As expected, x1 becomes the intercept, and x2 is presumably relative to x1. z1 is missing in this case too, and z2 is the same as in the model without the intercept.

Should I therefore assume that the comparison against the null case $P(Y = 1) = 0.5$ is made only for the first variable in the formula, while the others are compared against the usual intercept?
Is this behavior normal?
What about the fact that the first coefficient has the same value regardless of the order of the predictors in the formula?
What if I want to compare all levels of each predictor against the null case and get a coefficient for every level?
Or is that theoretically impossible for some reason I'm not getting?
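(For context on the last question: a coefficient for every level of every predictor in a single additive model is not identifiable, because the indicator columns for x sum to a column of ones, and so do those for z, making the design matrix rank deficient. One workaround is a separate no-intercept model per predictor; a sketch on simulated data, not the original sample:)

```r
set.seed(1)  # simulated data, not the original sample
y <- as.factor(sample(1:2, 30, replace = TRUE))
x <- as.factor(sample(1:2, 30, replace = TRUE))
z <- as.factor(sample(1:2, 30, replace = TRUE))

# A joint model cannot carry all four dummies: x1 + x2 = z1 + z2 = 1 for
# every observation, so R must drop one column (rank deficiency).
# Separate one-predictor models give each level its own logit-vs-0 coefficient:
coef(glm(y ~ x - 1, family = binomial))
coef(glm(y ~ z - 1, family = binomial))
```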

Best Answer

The issue is not specific to a GLM. It's an issue of treatment contrasts.

You should also look at the model with intercept:

set.seed(42)
y <- as.factor(sample(1:2, 30, replace = TRUE))
x <- as.factor(sample(1:2, 30, replace = TRUE))
z <- as.factor(sample(1:2, 30, replace = TRUE))

fit0 <- glm(y ~ z + x, binomial)
coef(fit0)
#(Intercept)          z2          x2 
# -0.1151303   0.3228803   1.0588217 
predict(fit0, newdata=data.frame(z=factor(2), x=factor(1)))
#      1 
#0.20775 

Here the intercept represents the group x1/z1, and the other group means on the link (log-odds) scale are obtained by adding the coefficients of z2 and/or x2.
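These link-scale group means can be reconstructed by hand from the coefficients. A self-contained sketch (refitting the same model; the exact coefficient values depend on your R version's RNG, so they may differ from the numbers above):

```r
# Refit the model so this snippet stands alone
set.seed(42)
y <- as.factor(sample(1:2, 30, replace = TRUE))
x <- as.factor(sample(1:2, 30, replace = TRUE))
z <- as.factor(sample(1:2, 30, replace = TRUE))
fit0 <- glm(y ~ z + x, family = binomial)

b <- coef(fit0)
b[["(Intercept)"]]                          # log-odds for group x1/z1
b[["(Intercept)"]] + b[["z2"]]              # log-odds for group x1/z2
b[["(Intercept)"]] + b[["x2"]]              # log-odds for group x2/z1
b[["(Intercept)"]] + b[["z2"]] + b[["x2"]]  # log-odds for group x2/z2
```

The second sum matches what predict() returns on the link scale for z = 2, x = 1.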

fit1 <- glm(y ~ z + x - 1, binomial)
coef(fit1)
#        z1         z2         x2 
#-0.1151303  0.2077500  1.0588217 
predict(fit1, newdata=data.frame(z=factor(2), x=factor(1)))
#      1 
#0.20775

Here the coefficient of z1 represents the group x1/z1 which is the same as the intercept in fit0. However, the coefficient of z2 represents the group x1/z2 instead of the difference between the group means. Note that 0.208 = -0.115 + 0.323. The x2/* group means are calculated by adding the x2 coefficient to the x1/* group means.

It should now be easy to understand why order matters here.
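The design matrices make the asymmetry explicit: when the intercept is removed, R expands the first factor in the formula into a full set of indicator columns, while later factors keep their treatment contrasts (one column per non-reference level). A minimal check on a hypothetical four-row data frame:

```r
d <- data.frame(x = factor(c(1, 1, 2, 2)), z = factor(c(1, 2, 1, 2)))

model.matrix(~ x + z - 1, d)  # x comes first: columns x1, x2, z2
#   x1 x2 z2
# 1  1  0  0
# 2  1  0  1
# 3  0  1  0
# 4  0  1  1

model.matrix(~ z + x - 1, d)  # z comes first: columns z1, z2, x2
#   z1 z2 x2
# 1  1  0  0
# 2  0  1  0
# 3  1  0  1
# 4  0  1  1
# (printed "assign"/"contrasts" attributes omitted)
```

In both cases the matrix has the same column space (rank 3), so the fitted values are identical; only the parameterization, and hence the printed coefficients, changes with the order.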
