Solved – How to explain the logistic regression model objects from the rms::lrm() function in R

Recently I was trying to fit a logistic regression using the rms::lrm() function, but I had trouble understanding the printed model object and its anova table. Here is the example from the package:

library(rms)            # provides lrm(), rcs() and the anova() method used below
# dataset
n            <- 1000    # define sample size
set.seed(17)            # so can reproduce the results
treat        <- factor(sample(c('a','b','c'), n,TRUE))
num.diseases <- sample(0:4, n,TRUE)
age          <- rnorm(n, 50, 10)
cholesterol  <- rnorm(n, 200, 25)
weight       <- rnorm(n, 150, 20)
sex          <- factor(sample(c('female','male'), n,TRUE))
L            <- .1*(num.diseases-2) + .045*(age-50) +
                (log(cholesterol - 10)-5.2)*(-2*(treat=='a') +
                3.5*(treat=='b')+2*(treat=='c'))
y            <- ifelse(runif(n) < plogis(L), 1, 0)
# fit model: treatment by age interaction, with age modelled by a restricted cubic spline
g            <- lrm(y ~ treat*rcs(age))

> g

Logistic Regression Model

lrm(formula = y ~ treat * rcs(age))

                      Model Likelihood     Discrimination    Rank Discrim.    
                         Ratio Test            Indexes          Indexes       
Obs          1000    LR chi2      76.77    R2       0.099    C       0.656    
 0            478    d.f.            14    g        0.665    Dxy     0.312    
 1            522    Pr(> chi2) <0.0001    gr       1.945    gamma   0.314    
max |deriv| 3e-06                          gp       0.156    tau-a   0.156    
                                           Brier    0.231    
> anova(g)
                Wald Statistics          Response: y 

 Factor                                     Chi-Square d.f. P     
 treat  (Factor+Higher Order Factors)        5.62      10   0.8462
  All Interactions                           1.30       8   0.9956
 age  (Factor+Higher Order Factors)         65.99      12   <.0001
  All Interactions                           1.30       8   0.9956
  Nonlinear (Factor+Higher Order Factors)    2.23       9   0.9872
 treat * age  (Factor+Higher Order Factors)  1.30       8   0.9956
  Nonlinear                                  0.99       6   0.9858
  Nonlinear Interaction : f(A,B) vs. AB      0.99       6   0.9858
 TOTAL NONLINEAR                             2.23       9   0.9872
 TOTAL NONLINEAR + INTERACTION               2.57      11   0.9953
 TOTAL                                      69.06      14   <.0001

Here are my questions:
For the object g,

  • What does max |deriv| 3e-06 mean?
  • What do the Discrimination and Rank Discrim. Indexes suggest?

For the anova(g) object,

  • What does "Factor+Higher Order Factors" mean for treat?
  • Why are there two "All Interactions" rows? How should the nonlinear parts be interpreted?

Best Answer

Max |deriv| is the largest absolute value, taken over all the $\beta$s, of the first partial derivatives of the log-likelihood function, evaluated at the apparent maximum likelihood estimates. At a true maximum every one of these derivatives is zero, so a value close to zero (here 3e-06) is a good indication that the fitting algorithm converged.

There are many background sources explaining the various indexes, especially my course notes at http://biostat.mc.vanderbilt.edu/CourseBios330.

To interpret the anova output, type ?print.anova.rms to see the printing options for anova objects. These options include printing the subscripts of the $\beta$s that are tested in a given row, or printing a matrix of asterisks to the right of the statistics, with blank spaces for $\beta$s not included in the current test.
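
To make the max |deriv| definition concrete, here is a minimal sketch that computes the same kind of quantity by hand. It is an illustration under assumptions, not how lrm() computes it internally: it uses base R's glm() instead of rms::lrm(), and the simulated data and the names x, y, fit and score are hypothetical. For a logistic model the gradient of the log-likelihood is X'(y - p), where p holds the fitted probabilities.

```r
# Illustration only: base R glm() stands in for rms::lrm();
# the data and all object names here are hypothetical.
set.seed(17)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))

fit <- glm(y ~ x, family = binomial)

# Score vector: gradient of the log-likelihood, X'(y - p),
# evaluated at the fitted coefficients
X <- model.matrix(fit)
p <- fitted(fit)
score <- drop(crossprod(X, y - p))

# Largest absolute first derivative over all coefficients
max_abs_deriv <- max(abs(score))
print(max_abs_deriv)   # very close to zero when the fit has converged

# For contrast: the same gradient at a non-optimal point (all betas = 0)
p0 <- rep(plogis(0), n)
score0 <- drop(crossprod(X, y - p0))
print(max(abs(score0)))   # clearly not near zero
```

The first printed value should be tiny and the second should not be, which is why a small max |deriv| is read as evidence of convergence.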