Logistic Regression – How to Compare Logistic Regression Curves Effectively

bernoulli-distribution, hypothesis-testing, logistic, logistic-curve

I have a binary dependent variable whose probability depends on a continuous independent variable, namely age. I have fit a logistic regression model to these data and have the corresponding coefficients.

The problem is that I have 3 such curves (one for each of 3 groups of a categorical variable) and I need to show that these three are statistically different from each other. How can I show that one logistic regression curve is not the same as another?

Note: Sample sizes for the three groups are different

Best Answer

I'd suggest combining all three datasets into one. Include in this combined dataset an indicator variable for group membership. Then fit a logistic regression to the complete dataset that includes an interaction term between age and the group indicator. Additionally, fit a logistic regression model without the interaction term, i.e. with additive terms for age and the indicator only. Perform a likelihood ratio test to evaluate the evidence for the interaction terms. That is, compare the models

$$ \mathrm{logit}(y)=\beta_0 + \beta_1\cdot \mathrm{age} + \beta_2\cdot \mathrm{g_2} + \beta_3\cdot \mathrm{g_3} + \beta_4\cdot \mathrm{age}\cdot \mathrm{g_2} + \beta_5\cdot \mathrm{age}\cdot \mathrm{g_3} $$

and

$$ \mathrm{logit}(y)=\beta_0 + \beta_1\cdot \mathrm{age} + \beta_2\cdot g_2 + \beta_3\cdot g_3 $$

where $\mathrm{g_2}$ and $\mathrm{g_3}$ are indicator variables (dummy variables) for groups 2 and 3: $g_2$ is $1$ whenever the group is $2$ and $0$ otherwise, and $g_3$ is $1$ whenever the group is $3$ and $0$ otherwise. Since we omitted the indicator for group 1, it serves as the reference group; you can change the reference group by omitting a different dummy variable instead.
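As a practical note, you usually don't construct these dummy variables by hand: in R, for instance, encoding the group as a factor makes glm() generate them automatically, and relevel() changes the reference group. A minimal sketch, assuming your data frame has a column group:

dat$group <- factor(dat$group)              # group 1 is the reference by default
dat$group <- relevel(dat$group, ref = "2")  # make group 2 the reference instead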

The likelihood ratio test tests the following hypotheses

\begin{align} \mathrm{H}_{0}&: \beta_4 = \beta_5 = 0 \\ \mathrm{H}_{1}&: \text{At least one}\> \beta_j \neq 0, j = 4, 5 \end{align}

So it tests whether the age slopes of groups 2 and 3 differ significantly from the age slope of group 1 (the reference group). It doesn't tell you, however, whether the slopes of groups 2 and 3 differ from each other (see the section on pairwise comparisons below). If the likelihood ratio test is significant, you have evidence that at least two groups have different slopes.
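Concretely, for your three groups this amounts to the following sketch (the data frame names d1, d2, d3 and the column names y and age are hypothetical placeholders for your data); the unequal sample sizes are no problem, since the likelihood simply pools all observations:

# stack the three group-specific datasets and add a group indicator
dat <- rbind(
  transform(d1, group = "1"),
  transform(d2, group = "2"),
  transform(d3, group = "3")
)
dat$group <- factor(dat$group)

# model with group-specific slopes vs. model with a common slope
fit_int   <- glm(y ~ age * group, family = binomial, data = dat)
fit_noint <- glm(y ~ age + group, family = binomial, data = dat)

# likelihood ratio test of the interaction terms
anova(fit_noint, fit_int, test = "Chisq")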

Here is a full worked example in R. I'm using a freely available dataset that models the probability of admission to graduate school. Instead of age it uses GPA, and instead of your group indicator it uses the institution's rank (1-4), which indicates its prestige. For your problem, just substitute age for GPA and your group indicator for rank.

# Load the admissions data and treat rank as a categorical predictor
dat <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
dat$rank <- factor(dat$rank)

# Model with interaction (separate slope per rank) and without (common slope)
glm_mod <- glm(admit ~ gpa * rank, family = binomial, data = dat)
glm_mod_noint <- glm(admit ~ gpa + rank, family = binomial, data = dat)

summary(glm_mod)

# Coefficients:
#              Estimate Std. Error z value Pr(>|z|)  
# (Intercept)  -4.6298     2.4843  -1.864   0.0624 .
# gpa           1.3906     0.7171   1.939   0.0525 .
# rank2         0.9191     2.9618   0.310   0.7563  
# rank3         0.1892     3.2431   0.058   0.9535  
# rank4        -1.1121     4.1319  -0.269   0.7878  
# gpa:rank2    -0.4661     0.8584  -0.543   0.5871  
# gpa:rank3    -0.4567     0.9295  -0.491   0.6232  
# gpa:rank4    -0.1378     1.1999  -0.115   0.9085

summary(glm_mod_noint) # output omitted here

Now you see that the slope for GPA is $1.39$ for institutions of rank 1. For institutions of rank 2, the slope is $1.391 - 0.466 = 0.925$, i.e. the difference in slopes is $-0.466$. The corresponding $p$-value of $0.587$ provides little evidence for a difference in slopes. Analogously, the difference in slopes relative to rank 1 is $-0.457$ for institutions of rank 3 and $-0.138$ for institutions of rank 4. All corresponding $p$-values are quite high.
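You can recompute these group-specific slopes directly from the coefficient vector of the fitted model:

cf <- coef(glm_mod)
cf["gpa"]                    # slope for rank 1: 1.391
cf["gpa"] + cf["gpa:rank2"]  # slope for rank 2: 1.391 - 0.466 = 0.925
cf["gpa"] + cf["gpa:rank3"]  # slope for rank 3: 1.391 - 0.457 = 0.934
cf["gpa"] + cf["gpa:rank4"]  # slope for rank 4: 1.391 - 0.138 = 1.253

Let's do the likelihood ratio test now: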

# Likelihood ratio test

anova(glm_mod_noint, glm_mod, test = "Chisq")

# Analysis of Deviance Table
# 
# Model 1: admit ~ gpa + rank
# Model 2: admit ~ gpa * rank
#   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
# 1       395     462.88                     
# 2       392     462.49  3  0.38446   0.9434

The high $p$-value of the likelihood ratio test indicates little evidence that the influence of GPA on the probability of admission differs between institutions of different ranks. We would therefore assume that the coefficients of the interaction terms are all $0$ (here $\beta_5 = \beta_6 = \beta_7 = 0$, since this example has four groups and hence three interaction terms).
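For transparency, here is the same test computed by hand from the two residual deviances (the anova() call above does exactly this): the drop in deviance is asymptotically $\chi^2$-distributed, with degrees of freedom equal to the number of interaction terms removed.

# LRT statistic: difference in residual deviances, 3 interaction terms dropped
lr_stat <- deviance(glm_mod_noint) - deviance(glm_mod)  # 0.384
pchisq(lr_stat, df = 3, lower.tail = FALSE)             # p = 0.943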

Let's visualize our results on the probability scale:

[Figure: fitted probabilities of admission vs. GPA, one curve per rank]

And now on the log-odds scale:

[Figure: log-odds of admission vs. GPA, one line per rank]

As you can see, the lines differ slightly in their slopes, but these differences are not statistically significant. Informally, testing the interaction terms asks whether the lines are parallel on the log-odds scale; in this case, we can assume that they are.
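If you want to reproduce plots like these, the fitted curves can be generated with predict(). A minimal base-R sketch for the log-odds scale (the GPA grid from 2 to 4 is an assumption about the plotting range; type = "response" instead of "link" would give the probability scale):

# grid of GPA values for each rank
newdat <- expand.grid(gpa = seq(2, 4, length.out = 100), rank = factor(1:4))
newdat$logodds <- predict(glm_mod, newdata = newdat, type = "link")

# one line per rank on the log-odds scale
plot(logodds ~ gpa, data = newdat, type = "n", ylab = "log-odds of admission")
for (r in levels(newdat$rank))
  lines(logodds ~ gpa, data = subset(newdat, rank == r), col = as.numeric(r))
legend("topleft", legend = paste("rank", 1:4), col = 1:4, lty = 1)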

Pairwise comparisons

The likelihood ratio test tests whether all differences in slopes between the reference group and the other groups are $0$ (see hypotheses above). If we'd like to test all pairwise differences in slopes, we have to resort to a post-hoc test. If there are $n$ groups, there are $\frac{1}{2}(n - 1)n$ pairwise comparisons. In your case, there are $n = 3$ groups which leads to a total of $3$ comparisons. In this example with four groups (ranks 1-4), there are $6$ comparisons. We will use the multcomp package to conduct the tests:

library(multcomp)

# Test all pairwise differences in GPA slopes between ranks
glht_mod <- glht(glm_mod, linfct = c(
    "gpa:rank2 = 0"               # rank 1 vs. rank 2
  , "gpa:rank3 = 0"               # rank 1 vs. rank 3
  , "gpa:rank4 = 0"               # rank 1 vs. rank 4
  , "gpa:rank2 - gpa:rank3 = 0"   # rank 2 vs. rank 3
  , "gpa:rank3 - gpa:rank4 = 0"   # rank 3 vs. rank 4
  , "gpa:rank2 - gpa:rank4 = 0")) # rank 2 vs. rank 4

summary(glht_mod)  # all pairwise tests

# Linear Hypotheses:
#                             Estimate Std. Error z value Pr(>|z|)
# gpa:rank2 == 0             -0.466105   0.858379  -0.543    0.947
# gpa:rank3 == 0             -0.456729   0.929507  -0.491    0.960
# gpa:rank4 == 0             -0.137839   1.199946  -0.115    0.999
# gpa:rank2 - gpa:rank3 == 0 -0.009376   0.756521  -0.012    1.000
# gpa:rank3 - gpa:rank4 == 0 -0.318890   1.129327  -0.282    0.992
# gpa:rank2 - gpa:rank4 == 0 -0.328266   1.071546  -0.306    0.990

These $p$-values are adjusted for multiple comparisons. There is very little evidence for any difference between slopes.
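By default, summary.glht() applies a single-step adjustment based on the joint distribution of the estimates; other corrections can be requested through the test argument, for example:

summary(glht_mod, test = adjusted("bonferroni"))  # Bonferroni-adjusted p-values
summary(glht_mod, test = adjusted("none"))        # unadjusted p-values

Finally, let's test the global hypothesis that all pairwise slope comparisons are $0$: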

summary(glht_mod, test = Chisqtest()) # global test

# Global Test:
#    Chisq DF Pr(>Chisq)
# 1 0.3793  3     0.9445

Again, very little evidence that any of the slopes differ from each other.
