Solved – Predicted probabilities and marginal effects relationship (R, margins package)

Tags: logit, marginal-effect, r

Working on a logit model, I get the following results:

Predicted Probs.:

# logit2prob(GlmModel, x2), where x2 is measured in years
> logit2prob(m3, 0.5)
0.3178728
> logit2prob(m3, 1)
0.2200801
> logit2prob(m3, 2)
0.0937683
> logit2prob(m3, 3)
0.03655361
> logit2prob(m3, 4)
0.01372108
> logit2prob(m3, 5)
0.005075335
> logit2prob(m3, 6)
0.001867019

Marginal Effects:

> head(marginal_effects(m3))
      dydx_x2
1 -0.20786769
2 -0.21936998
3 -0.20786769
4 -0.20330973
5 -0.11609512
6 -0.04823988

And graphs for both using cplot(m3, "x2", what = "predict") and cplot(m3, "x2", what = "effect"):
[Figure: predicted probabilities (what = "predict") and marginal effects (what = "effect") plotted against x2]

The numbers I get from marginal_effects don't seem to match the "effect" cplot. And the instantaneous marginal effects (both the table and the graph) don't seem to match the rate of change of the predicted values. What am I missing here?

Best Answer

First, I must confess that I don't understand your use of the logit2prob function. I suspect this function comes from the rcfss package. If so, the example below shows how it can be used to compute predicted probabilities from a binary logistic regression model. The data for this example come from https://stats.idre.ucla.edu/r/dae/logit-regression/.

# install rcfss package 
library(devtools) 
install_github("uc-cfss/rcfss")

require(rcfss)
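For reference, and assuming it is indeed the rcfss version, logit2prob simply takes a vector of log odds and applies the inverse-logit transformation, roughly:

# roughly what rcfss::logit2prob() does: convert log odds to probabilities
logit2prob <- function(logit) {
  odds <- exp(logit)
  odds / (1 + odds)
}

logit2prob(0)   # a log odds of 0 corresponds to a probability of 0.5

If that is the definition in play, it expects a vector of log odds rather than a model object plus a predictor value, which is why the two-argument calls in the question are hard to interpret without seeing the wrapper behind them.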

# read data for use in binary logistic regression model example 

mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")

# convert rank variable to factor (i.e., categorical variable)

mydata$rank <- factor(mydata$rank)

# fit binary logistic regression model to data which will predict 
# probability of admission to college for a student based 
# on their gre, gpa and current school rank/prestige 
# (ranks range from 1 = highest prestige to 4 = lowest prestige)

mymodel <- glm(admit ~ gre + gpa + rank, 
               data = mydata, 
               family = binomial(link="logit"))

# display model summary

summary(mymodel)

# compute the log odds of admission for each student represented in the data

predict(mymodel)

# convert the log odds computed above to predicted probabilities 
# of admission using logit2prob()

logit2prob(predict(mymodel))

# calculate the predicted probabilities of admission directly using 
# the predict() function

predict(mymodel, type="response")

# compare the two methods for computing predicted probabilities 
# to ensure they yield the same results 

cbind(logit2prob(predict(mymodel)), predict(mymodel, type="response"))
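If desired, the agreement can also be checked programmatically, for example:

# confirm the two sets of predicted probabilities are numerically identical
all.equal(unname(logit2prob(predict(mymodel))),
          unname(predict(mymodel, type = "response")))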

As for the conditional plot produced by cplot, it was not immediately clear to me how the plotted values are computed by the command:

## conditional predicted-probability plot (draw = FALSE returns the values without drawing)
cplot(mymodel, "gre", what = "predict", draw = FALSE)

If I use a different way of computing conditional probabilities (as per the effects package), I can replicate the reasoning involved:

require(effects)

allEffects(mymodel)

 model: admit ~ gre + gpa + rank

 gre effect
gre
      200       400       500       700       800 
0.1503641 0.2177438 0.2587605 0.3544499 0.4077938 

 gpa effect
gpa
      2.3       2.7       3.1       3.6         4 
0.1505682 0.1964650 0.2521985 0.3351680 0.4101640 

 rank effect
rank
        1         2         3         4 
0.5166016 0.3522846 0.2186120 0.1846684 

Indeed:

# extract model coefficients 
b0 <- coef(mymodel)[1]
b1 <- coef(mymodel)[2]
b2 <- coef(mymodel)[3]
b3 <- coef(mymodel)[4]
b4 <- coef(mymodel)[5]
b5 <- coef(mymodel)[6]


# compute proportion of data values represented in each 
# category of rank
table(mydata$rank)/sum(table(mydata$rank))

# log odds at gre = 200, with gpa at its observed mean and the rank2, rank3
# and rank4 dummies at their observed proportions (0.3775, 0.3025, 0.1675)
log.odds <- b0 + b1*200 + b2*mean(mydata$gpa) + b3*0.3775 +
            b4*0.3025 + b5*0.1675

# convert the log odds to a predicted probability
exp(log.odds)/(1 + exp(log.odds))
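As an aside, R's built-in plogis() implements the same inverse-logit transformation, so the last line could equivalently be written as:

# same conversion using the built-in inverse logit
plogis(log.odds)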

In the above log-odds formula, I plugged in 200 for gre (the same value used by the effects package), replaced gpa with its observed mean value, and replaced the dummy variables encoding the effect of rank with their mean values (i.e., the proportions of observations falling into the categories they represent). Applying the inverse-logit transformation to the log odds yields the first predicted probability reported by the effects package (i.e., 0.1503641), corresponding to gre set to 200, gpa set to its observed mean and the dummy variables rank2, rank3 and rank4 set to their observed means. It seems that the effects package computes what the documentation of the margins package calls "fitted values at the mean of X" (i.e., predicted probabilities at the mean values of the non-focal predictor variables, evaluated over a range of values for the focal variable gre).
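If that reading is correct, the same "fitted values at the mean of X" calculation can be reproduced more compactly by averaging the columns of the model matrix (which turns the rank dummies into exactly those proportions) and then varying gre over the values reported by the effects package; a sketch, using the built-in plogis() as the inverse logit:

# average each column of the design matrix (factor dummies become proportions)
X_bar <- colMeans(model.matrix(mymodel))

# predicted probability at each gre value used by the effects package,
# holding the remaining columns at their means
gre_vals <- c(200, 400, 500, 700, 800)
probs <- sapply(gre_vals, function(g) {
  x <- X_bar
  x["gre"] <- g
  plogis(sum(coef(mymodel) * x))
})
setNames(probs, gre_vals)

This should return the gre effect row shown earlier (0.1503641, 0.2177438, ...).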

I would have thought that cplot uses a similar approach to obtain conditional probabilities, but it doesn't. What I think it does is compute what the documentation of the margins package refers to as "average" fitted values (i.e., average predicted probabilities). By documentation, I mean the vignette available at https://cran.r-project.org/web/packages/margins/vignettes/TechnicalDetails.pdf. In other words, for each gre value on a grid, gpa and rank are set in turn to all the combinations of values observed in the data. The predicted probability of admission is computed for each given gre value across all of these combinations of gpa and rank values and then averaged across them.
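Under that reading, the curve from cplot can be approximated by hand: for each gre value on the grid, set every observation's gre to that value, predict the probability of admission for all rows, and average. The sketch below assumes cplot(..., draw = FALSE) returns a data frame with xvals and yvals columns (as in recent versions of margins); whether the numbers match exactly may depend on the package version.

require(margins)

# grid of gre values and fitted values as computed by cplot
cp <- cplot(mymodel, "gre", what = "predict", draw = FALSE)

# average predicted probability at each grid value of gre, holding all
# other variables at their observed values
avg_fitted <- sapply(cp$xvals, function(g) {
  newdata <- mydata
  newdata$gre <- g
  mean(predict(mymodel, newdata = newdata, type = "response"))
})

cbind(cplot = cp$yvals, manual = avg_fitted)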

In the above-mentioned vignette, the author of the margins package clarifies that, for binary logistic regression models, the margins function computes marginal effects as changes in the predicted probability of the outcome corresponding to changes in the value of a focal predictor. If the focal predictor is categorical (e.g., rank), the change is expressed as a move from the base category to a particular category (e.g., from rank = 1 to rank = 2). If the focal predictor is numeric (e.g., gpa), the change corresponds (by default) to an infinitesimal increment in the value of that predictor. However, a discrete change in a numeric predictor can be requested via the change argument of the workhorse dydx() function, which allows the change to be quantified as: the observed minimum to the observed maximum of the focal predictor, its first quartile to its third quartile, its mean +/- one standard deviation, or any arbitrary change. See http://www.brodrigues.co/blog/2017-10-26-margins_r/ for an example involving the use of the dydx() function.
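For instance, assuming the dydx() interface described in the vignette (a data frame, the model, the focal variable, and a change argument accepting "minmax", "iqr", "sd" or a numeric range), a discrete min-to-max change for gpa could be requested along these lines:

require(margins)

# default: instantaneous (numerical-derivative) marginal effect of gpa, row by row
head(dydx(mydata, mymodel, "gpa"))

# discrete change: predicted probability at the maximum observed gpa
# minus the predicted probability at the minimum observed gpa
head(dydx(mydata, mymodel, "gpa", change = "minmax"))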

By default, margins calculates the average marginal effects of every variable included in the glm model.
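A minimal illustration of that default behaviour:

require(margins)

# average marginal effects (AMEs) of gre, gpa and the rank contrasts
m <- margins(mymodel)
summary(m)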