I have a problem with an analysis.
I'm fitting a binomial GLM with two categorical factors, loc and trat.
I do not understand how R deals with the intercept: what statistical reasoning does R use to choose it? It seems to use the first level of the first factor as the intercept, and it also compares the levels of the second factor against that intercept, which has nothing to do with them.
    y <- cbind(data1$fr, data1$fl - data1$fr)
    loc1 <- as.factor(data1$loc)
    trat1 <- as.factor(data1$trat)
    m2 <- glm(y ~ loc1 + data1$comp + trat1, family = binomial, na.action = na.omit, data = data1)
    summary(m2)
    Call:
    glm(formula = y ~ loc1 + data1$comp + trat1, family = binomial,
        data = data1, na.action = na.omit)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -3.4015  -0.9895  -0.4015  -0.1713   6.1668

    Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -3.20524    0.20315 -15.778  < 2e-16 ***
    loc12         1.06667    0.18642   5.722 1.05e-08 ***
    loc13         0.52656    0.19319   2.726 0.006419 **
    loc14         0.69228    0.21151   3.273 0.001064 **
    data1$comp    0.21967    0.06314   3.479 0.000503 ***
    trat1anemo   -4.78819    1.00885  -4.746 2.07e-06 ***
    trat1autogam -3.75418    0.59252  -6.336 2.36e-10 ***
    trat1autopol -1.28546    0.23312  -5.514 3.51e-08 ***
    trat1control  0.49978    0.14277   3.501 0.000464 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 832.26  on 171  degrees of freedom
    Residual deviance: 327.39  on 163  degrees of freedom
    AIC: 565.92

    Number of Fisher Scoring iterations: 7
Maybe someone here could help me?
Best Answer
R orders the levels of a factor alphabetically by default, and with the default treatment contrasts the first level becomes the reference (baseline) group. If you want a specific group to be the reference, you should tell R explicitly. Let us see this with an example using a simulated variable mimicking your variable $loc1$:
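A minimal sketch of such an example, under stated assumptions: the data are simulated (they are not your `data1`), the outcome `y` is an arbitrary binary variable, and level "3" is picked as the new reference purely for illustration. `relevel()` is the base R function that moves a chosen level to the front.

```r
set.seed(1)  # arbitrary seed so the simulated data are reproducible

# Simulated stand-in for loc1: a factor with the four levels "1" to "4"
loc1 <- factor(sample(1:4, 200, replace = TRUE), levels = 1:4)
# Simulated binary outcome, unrelated to the real data
y <- rbinom(200, size = 1, prob = 0.3)

# The first level listed is the one R uses as the reference
levels(loc1)

# Default fit: level "1" is absorbed into the intercept,
# so the coefficients are named loc12, loc13 and loc14
m_default <- glm(y ~ loc1, family = binomial)
names(coef(m_default))

# Make level "3" the reference instead (illustrative choice)
new_loc1 <- relevel(loc1, ref = "3")
levels(new_loc1)

# Refit: level "3" is now absorbed into the intercept,
# so the coefficients are named new_loc11, new_loc12 and new_loc14
m_releveled <- glm(y ~ new_loc1, family = binomial)
names(coef(m_releveled))
```

Both fits describe the same model; only the group that the intercept represents, and therefore the group each coefficient is compared against, differs.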
Note how the reference level changes with the $new\_loc1$ variable.
The interpretation of the intercept is: the log-odds of the outcome for the reference groups of $loc1$ and $trat1$ when $data1\$comp=0$. Each $loc1$ coefficient is then the difference in log-odds from the reference level of $loc1$, and each $trat1$ coefficient is the difference from the reference level of $trat1$, holding the other terms fixed. If you exponentiate the intercept, i.e. $e^{-3.20524}\approx 0.041$, you get the odds of the outcome for the reference groups of $loc1$ and $trat1$ when $data1\$comp=0$. If the $data1\$comp$ variable never takes the value zero, the intercept may not have a meaningful interpretation. For more on working with factor variables, please refer here, and for more on interpreting categorical predictors, please refer here.
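The arithmetic above can be checked directly in R. This is a small sketch that only reuses the estimates printed in the question's summary; the object names (`b0`, `b_loc2`, and so on) are made up for illustration.

```r
b0     <- -3.20524   # (Intercept) from the summary output
b_loc2 <-  1.06667   # loc12 from the summary output

# Odds of the outcome for the reference groups of loc1 and trat1 at comp = 0
odds_ref <- exp(b0)       # about 0.041

# The same quantity on the probability scale (inverse logit)
prob_ref <- plogis(b0)    # about 0.039

# Odds ratio for loc1 level 2 versus the reference level of loc1,
# holding trat1 and comp fixed
or_loc2 <- exp(b_loc2)    # about 2.91

# Probability for loc1 level 2, reference trat1, comp = 0
prob_loc2 <- plogis(b0 + b_loc2)

c(odds_ref = odds_ref, prob_ref = prob_ref, or_loc2 = or_loc2, prob_loc2 = prob_loc2)
```

Because the model is additive on the log-odds scale, coefficients add before the inverse logit is applied, which is why `plogis(b0 + b_loc2)` is used rather than combining the two probabilities.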