Solved – Predicted probabilities and marginal effects relationship (R, margins package)

Tags: logit, marginal-effect, r

Working on a logit model, I get the following results:

Predicted Probs.:

# logit2prob(GlmModel, x2), where x2 is measured in years
> logit2prob(m3, 0.5)
0.3178728
> logit2prob(m3, 1)
0.2200801
> logit2prob(m3, 2)
0.0937683
> logit2prob(m3, 3)
0.03655361
> logit2prob(m3, 4)
0.01372108
> logit2prob(m3, 5)
0.005075335
> logit2prob(m3, 6)
0.001867019

Marginal Effects:

> head(marginal_effects(m3))
      dydx_x2
1 -0.20786769
2 -0.21936998
3 -0.20786769
4 -0.20330973
5 -0.11609512
6 -0.04823988

And graphs for both using cplot(m3, "x2", what = "predict") and cplot(m3, "x2", what = "effect"):
[Figure: predicted probabilities (what = "predict") and marginal effects (what = "effect") plotted against x2]

The numbers I get from marginal_effects don't seem to match the "effect" cplot. And the instantaneous marginal effects (both the table and the graph) don't seem to match the rate of change of the predicted values. What am I missing here?

Best Answer

First, I must confess that I don't understand your use of the logit2prob function. I suspect this function comes from the rcfss package. If so, the example below shows how it can be used to compute predicted probabilities from a binary logistic regression model. The data for this example come from https://stats.idre.ucla.edu/r/dae/logit-regression/.

# install rcfss package 
library(devtools) 
install_github("uc-cfss/rcfss")

require(rcfss)
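For reference, and assuming it is indeed the rcfss version, logit2prob simply takes a vector of log odds and applies the inverse-logit transformation, roughly:

# roughly what rcfss::logit2prob() does: convert log odds to probabilities
logit2prob <- function(logit) {
  odds <- exp(logit)
  odds / (1 + odds)
}

logit2prob(0)   # a log odds of 0 corresponds to a probability of 0.5

If that is the definition in play, it expects a vector of log odds rather than a model object plus a predictor value, which is why the two-argument calls in the question are hard to interpret without seeing the wrapper behind them.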

# read data for use in binary logistic regression model example 

mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# convert rank variable to factor (i.e., categorical variable)

mydata$rank <- factor(mydata$rank)

# fit binary logistic regression model to data which will predict 
# probability of admission to college for a student based 
# on their gre, gpa and current school rank/prestige 
# (ranks range from 1 = highest prestige to 4 = lowest prestige)

mymodel <- glm(admit ~ gre + gpa + rank, 
               data = mydata, 
               family = binomial(link="logit"))

# display model summary

summary(mymodel)

# compute the log odds of admission for each student represented in the data

predict(mymodel)

# convert the log odds computed above to predicted probabilities 
# of admission using logit2prob()

logit2prob(predict(mymodel))

# calculate the predicted probabilities of admission directly using 
# the predict() function

predict(mymodel, type="response")

# compare the two methods for computing predicted probabilities 
# to ensure they yield the same results 

cbind(logit2prob(predict(mymodel)), predict(mymodel, type="response"))
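If desired, the agreement can also be checked programmatically, for example:

# confirm the two sets of predicted probabilities are numerically identical
all.equal(unname(logit2prob(predict(mymodel))),
          unname(predict(mymodel, type = "response")))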

As for the conditional plot produced by cplot, it was not immediately clear to me how the plotted values are computed by the command:

## conditional predicted-probability plot (draw = FALSE returns the values without drawing)
cplot(mymodel, "gre", what = "predict", draw = FALSE)

If I use a different way of computing conditional probabilities (as per the effects package), I can replicate the reasoning involved:

require(effects)

allEffects(mymodel)

 model: admit ~ gre + gpa + rank

 gre effect
gre
      200       400       500       700       800 
0.1503641 0.2177438 0.2587605 0.3544499 0.4077938 

 gpa effect
gpa
      2.3       2.7       3.1       3.6         4 
0.1505682 0.1964650 0.2521985 0.3351680 0.4101640 

 rank effect
rank
        1         2         3         4 
0.5166016 0.3522846 0.2186120 0.1846684 

Indeed:

# extract model coefficients 
b0 <- coef(mymodel)[1]
b1 <- coef(mymodel)[2]
b2 <- coef(mymodel)[3]
b3 <- coef(mymodel)[4]
b4 <- coef(mymodel)[5]
b5 <- coef(mymodel)[6]


# compute proportion of data values represented in each 
# category of rank
table(mydata$rank)/sum(table(mydata$rank))

# log odds at gre = 200, with gpa at its observed mean and the rank2, rank3
# and rank4 dummies at their observed proportions (0.3775, 0.3025, 0.1675)
log.odds <- b0 + b1*200 + b2*mean(mydata$gpa) + b3*0.3775 +
            b4*0.3025 + b5*0.1675

# convert the log odds to a predicted probability
exp(log.odds)/(1 + exp(log.odds))
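As an aside, R's built-in plogis() implements the same inverse-logit transformation, so the last line could equivalently be written as:

# same conversion using the built-in inverse logit
plogis(log.odds)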

In the above log-odds formula, I plugged in 200 for gre (the same value used by the effects package), replaced gpa with its observed mean value, and replaced the dummy variables encoding the effect of rank with their mean values (i.e., the proportions of observations falling into the categories they represent). Applying the inverse-logit transformation to the log odds yields the first predicted probability reported by the effects package (i.e., 0.1503641), corresponding to gre set to 200, gpa set to its observed mean and the dummy variables rank2, rank3 and rank4 set to their observed means. It seems that the effects package computes what the documentation of the margins package calls "fitted values at the mean of X" (i.e., predicted probabilities at the mean values of the non-focal predictor variables, evaluated over a range of values for the focal variable gre).
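If that reading is correct, the same "fitted values at the mean of X" calculation can be reproduced more compactly by averaging the columns of the model matrix (which turns the rank dummies into exactly those proportions) and then varying gre over the values reported by the effects package; a sketch, using the built-in plogis() as the inverse logit:

# average each column of the design matrix (factor dummies become proportions)
X_bar <- colMeans(model.matrix(mymodel))

# predicted probability at each gre value used by the effects package,
# holding the remaining columns at their means
gre_vals <- c(200, 400, 500, 700, 800)
probs <- sapply(gre_vals, function(g) {
  x <- X_bar
  x["gre"] <- g
  plogis(sum(coef(mymodel) * x))
})
setNames(probs, gre_vals)

This should return the gre effect row shown earlier (0.1503641, 0.2177438, ...).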

I would have thought that cplot uses a similar approach to obtain conditional probabilities, but it doesn't. What I think it does is compute what the documentation of the margins package refers to as "average" fitted values (i.e., average predicted probabilities). By documentation, I mean the vignette available at https://cran.r-project.org/web/packages/margins/vignettes/TechnicalDetails.pdf. In other words, for each gre value on a grid, gpa and rank are set in turn to all the combinations of values observed in the data. The predicted probability of admission is computed for each given gre value across all of these combinations of gpa and rank values and then averaged across them.
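Under that reading, the curve from cplot can be approximated by hand: for each gre value on the grid, set every observation's gre to that value, predict the probability of admission for all rows, and average. The sketch below assumes cplot(..., draw = FALSE) returns a data frame with xvals and yvals columns (as in recent versions of margins); whether the numbers match exactly may depend on the package version.

require(margins)

# grid of gre values and fitted values as computed by cplot
cp <- cplot(mymodel, "gre", what = "predict", draw = FALSE)

# average predicted probability at each grid value of gre, holding all
# other variables at their observed values
avg_fitted <- sapply(cp$xvals, function(g) {
  newdata <- mydata
  newdata$gre <- g
  mean(predict(mymodel, newdata = newdata, type = "response"))
})

cbind(cplot = cp$yvals, manual = avg_fitted)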

In the above-mentioned vignette, the author of the margins package clarifies that, for binary logistic regression models, the margins function computes marginal effects as changes in the predicted probability of the outcome corresponding to changes in the value of a focal predictor. If the focal predictor is categorical (e.g., rank), the change is expressed as a move from the base category to a particular category (e.g., from rank = 1 to rank = 2). If the focal predictor is numeric (e.g., gpa), the change corresponds (by default) to an infinitesimal increment in the value of that predictor. However, a discrete change in a numeric predictor can be requested via the change argument of the workhorse dydx() function, which allows the change to be quantified as: the observed minimum to the observed maximum of the focal predictor, its first quartile to its third quartile, its mean +/- one standard deviation, or any arbitrary change. See http://www.brodrigues.co/blog/2017-10-26-margins_r/ for an example involving the use of the dydx() function.
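For instance, assuming the dydx() interface described in the vignette (a data frame, the model, the focal variable, and a change argument accepting "minmax", "iqr", "sd" or a numeric range), a discrete min-to-max change for gpa could be requested along these lines:

require(margins)

# default: instantaneous (numerical-derivative) marginal effect of gpa, row by row
head(dydx(mydata, mymodel, "gpa"))

# discrete change: predicted probability at the maximum observed gpa
# minus the predicted probability at the minimum observed gpa
head(dydx(mydata, mymodel, "gpa", change = "minmax"))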

By default, margins calculates the average marginal effects of every variable included in the glm model.
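A minimal illustration of that default behaviour:

require(margins)

# average marginal effects (AMEs) of gre, gpa and the rank contrasts
m <- margins(mymodel)
summary(m)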