Logistic – How to Calculate Marginal Effect by Hand with Logit and Dummy Variables

logistic, marginal-effect

I have the following dilemma:

I more or less understand what marginal effects are: how to calculate them, how the derivative of the sigmoid function enters, and how to interpret them (as the change in probability from increasing the variable of interest by "a little bit", this little bit being 1 for discrete variables or std(x)/1000 for continuous ones). The part I find tricky is corroborating the marginal effects by hand: recalculating the probabilities for x=0 and then x=1 (for example) and getting a difference in probability equal to the marginal effect I got earlier. I am particularly stuck with dummy variables, since if I increase one I have to decrease the other, so I am not sure how to work around this or how to interpret the result. (This question also applies to highly correlated variables.)

To make it more clear, let's say I have the following dataset:

#Python 

[1. , 0. , 0. , 4.6, 3.1, 1.5, 0.2],
[1. , 0. , 1. , 5. , 3.6, 1.4, 0.2],
[1. , 1. , 0. , 5.4, 3.9, 1.7, 0.4],
[1. , 0. , 1. , 4.6, 3.4, 1.4, 0.3],
[1. , 1. , 0. , 5. , 3.4, 1.5, 0.2],
[1. , 0. , 0. , 4.4, 2.9, 1.4, 0.2],
[1. , 0. , 1. , 4.9, 3.1, 1.5, 0.1],
[1. , 1. , 0. , 5.4, 3.7, 1.5, 0.2],
...

Var_0 = What will be the intercept.

Var_1, var_2 = two of the three binary dummies; the third is dropped to avoid collinearity.

Var 3+ = Normal continuous variables

Coefficients:

[ 7.56986405,  0.75703164,  0.27158741, -0.37447474, -2.79926022, 1.43890492, -2.95286947]

logit

[-3.34739217,
 -2.27001103,
 -1.49517926,
 -0.77178644,
 -0.808111,
 -2.48474722,
 -1.76183804,
 -0.90621541
 ...]

Probabilities

[0.03398066, 
0.09363728, 
0.18314562,
0.31609279, 
0.30829318,
0.0769344 , 
0.14656029, 
0.28777491,
...]

Marginal effect = p*(1-p) * B_j
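
As a quick check on where this formula comes from (a sketch, writing $\eta = x'\beta$ for the linear predictor, which is notation not used elsewhere in this post), differentiate the sigmoid and apply the chain rule:

$$p = \frac{1}{1+e^{-\eta}}, \qquad \frac{\partial p}{\partial \eta} = \frac{e^{-\eta}}{(1+e^{-\eta})^2} = p \cdot (1-p), \qquad \frac{\partial p}{\partial x_j} = \frac{\partial p}{\partial \eta} \cdot \frac{\partial \eta}{\partial x_j} = p \cdot (1-p) \cdot \beta_j.$$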

Now let's say that I am interested in the marginal effect of var_1 (one of the dummies), I will simply do: p*(1-p) * 0.7570

Which will result in an array of length n (# of obs) with different marginal effects (which is fine because I understand that the effects are non constant and non-linear). Let's say this array goes from [0.0008 to 0.0495]

Now the problem is: how can you verify these results? How can I measure the marginal effect when the dummy goes from 0 to 1?

You could argue that I could use one of two methods, MEM or AME:

MEM: Leave all the other variables at their means and recalculate the probability for var_1 = 0 and then for var_1 = 1. (You can't really do this, because you would be assuming that some observations can have var_1 and var_2 equal to 1 at the same time, which is incorrect, since the mean of a dummy is just the proportion of 1s in that column.)

AME: Leave the other variables as observed, set var_1 = 0 for every observation (making var_2 = 1), then do the opposite (var_1 = 1, var_2 = 0; you have to do this since an observation can't belong to two categories at the same time), and then take the average of the results. (Side comment: I am not sure whether this should be the average of the difference in marginal effects between var_1 = 0 and var_1 = 1, or the average of the difference in probabilities. I tried both, but using probabilities makes more sense to me.)

Now, if I try the second approach, I get results very different from what I originally got (values in [0.0008, 0.0495]): it gives me values in [0.0022, 0.1207], which is a massive difference.

To summarise:

1. How can I corroborate mathematically the values I got initially from the theoretical formula, marginal effect = p*(1-p)*B_j, which were in [0.0008, 0.0495]? There should be a method that arrives at the same numbers (I am using AME at the moment).
2. How can I interpret these original values in the first place? If I take 0.0495, I am basically saying that if I increase var_1 by one unit (from 0 to 1), the probability of my event increases by 4.95 percentage points. The problem is that this doesn't account for the fact that, to make the one-unit increase, I necessarily have to decrease the other dummy (var_2), so I am making a double change in the variables, something like a double marginal effect at the same time.

Best Answer

For connected dummies $d$ and $x$, you might want to calculate this average of finite differences:

$$AME_x =\frac{1}{N} \cdot \sum_{i=1}^N \left[ \hat p(d=1,x=0,z=z_i)-\hat p(d=0,x=1,z=z_i) \right],$$

where $\hat p(.)$ is the predicted probability from the logit model. I don't know what encoding you are using, so substitute your own coding for the ones and zeros above.
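
A minimal Python sketch of this average of finite differences, assuming NumPy and reusing the coefficient vector and the eight rows printed in the question. The helper names (`predict_proba`, `ame_switch`, `ame_vs_reference`) are hypothetical, and columns 1 and 2 are assumed to hold the two dummies, as the question describes. Probabilities are recomputed from the rows as printed, so they need not match the values listed in the question, and with only eight rows the result is illustrative rather than the asker's actual AME.

```python
import numpy as np

# Coefficients and the eight design-matrix rows as printed in the question.
beta = np.array([7.56986405, 0.75703164, 0.27158741, -0.37447474,
                 -2.79926022, 1.43890492, -2.95286947])
X = np.array([
    [1., 0., 0., 4.6, 3.1, 1.5, 0.2],
    [1., 0., 1., 5.0, 3.6, 1.4, 0.2],
    [1., 1., 0., 5.4, 3.9, 1.7, 0.4],
    [1., 0., 1., 4.6, 3.4, 1.4, 0.3],
    [1., 1., 0., 5.0, 3.4, 1.5, 0.2],
    [1., 0., 0., 4.4, 2.9, 1.4, 0.2],
    [1., 0., 1., 4.9, 3.1, 1.5, 0.1],
    [1., 1., 0., 5.4, 3.7, 1.5, 0.2],
])

def predict_proba(X, beta):
    """Logistic predicted probability for each row of X."""
    return 1.0 / (1.0 + np.exp(-(X @ beta)))

def ame_switch(X, beta, on_col, off_col):
    """Mean of p(on=1, off=0, z_i) - p(on=0, off=1, z_i) over all rows."""
    X_on, X_off = X.copy(), X.copy()
    X_on[:, on_col], X_on[:, off_col] = 1.0, 0.0
    X_off[:, on_col], X_off[:, off_col] = 0.0, 1.0
    return np.mean(predict_proba(X_on, beta) - predict_proba(X_off, beta))

def ame_vs_reference(X, beta, col, other_col):
    """Mean of p(col=1, other=0, z_i) - p(col=0, other=0, z_i),
    i.e. the dummy compared with the dropped (reference) category."""
    X_on, X_ref = X.copy(), X.copy()
    X_on[:, col], X_on[:, other_col] = 1.0, 0.0
    X_ref[:, col], X_ref[:, other_col] = 0.0, 0.0
    return np.mean(predict_proba(X_on, beta) - predict_proba(X_ref, beta))

ame_12 = ame_switch(X, beta, on_col=1, off_col=2)
ame_1_ref = ame_vs_reference(X, beta, col=1, other_col=2)
print("var_1 vs var_2:", ame_12)
print("var_1 vs dropped category:", ame_1_ref)
```

Which contrast is the right one depends on the coding: with one category dropped, setting both dummies to zero is a valid state (the reference category), so switching var_1 on does not by itself force var_2 to change.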

Note that std(x)/1000 for continuous variables is not quite right. If you recall the definition of derivatives, the limit of the change goes to zero. You are considering a tiny perturbation, not one of particular size that depends on the SD of x.
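
To see the limit at work, here is a small Python sketch with hypothetical coefficients and a single hypothetical observation (none of these numbers come from the question). It compares the finite difference $[p(x+h)-p(x)]/h$ with the analytic derivative $\beta \cdot p \cdot (1-p)$ as $h$ shrinks. The step size is only a numerical device; the derivative itself does not depend on it.

```python
import math

def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical values, for illustration only.
alpha, beta, x = -1.0, 0.8, 0.5

p = sigmoid(alpha + beta * x)
analytic = beta * p * (1.0 - p)  # derivative of p with respect to x

errors = []
for h in (1.0, 0.1, 0.001, 1e-6):
    finite_diff = (sigmoid(alpha + beta * (x + h)) - p) / h
    errors.append(abs(finite_diff - analytic))
    print(f"h={h:g}  finite difference={finite_diff:.8f}  analytic={analytic:.8f}")
```

A step of 1 (the discrete change used for a dummy) gives a noticeably different number from the derivative, which is one reason the two calculations in the question do not agree.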

Related Solutions

Logit Model Marginal Effects – Interpretation with Logarithmic Variables

You know that in a logit:

$$Pr[y = 1 \vert x,z] = p = \frac{\exp (\alpha + \beta \cdot \ln x + \gamma z)}{1+\exp (\alpha + \beta \cdot \ln x + \gamma z )}. $$

After some tedious calculus and simplification, the partial of that with respect to $x$ becomes:

$$ \frac{\partial Pr[y=1 \vert x,z]}{\partial x} = \frac{\beta}{x} \cdot p \cdot (1-p). $$

This is (sort of) equivalent to

$$\frac{\Delta p}{\Delta x}=\frac{\beta}{x} \cdot p \cdot (1-p),$$

which can be re-written as

$$\frac{\Delta p}{100 \cdot \frac{ \Delta x}{x}}= \frac{\beta \cdot p \cdot (1-p)}{100}.$$

This is the definition of semi-elasticity, and can be interpreted as the change in probability for a 1% change in $x$.
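
As a rough numerical check of this interpretation, the Python sketch below uses hypothetical parameter and covariate values (not taken from any data set) and compares the semi-elasticity $\beta \cdot p \cdot (1-p)/100$ with the exact change in probability when $x$ rises by 1%.

```python
import math

def prob(x, z, alpha, beta, gamma):
    """Pr[y = 1 | x, z] for a logit with ln(x) as a regressor."""
    eta = alpha + beta * math.log(x) + gamma * z
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical parameter and covariate values, for illustration only.
alpha, beta, gamma = -2.0, 0.5, 0.3
x, z = 10.0, 1.0

p = prob(x, z, alpha, beta, gamma)
semi_elasticity = beta * p * (1.0 - p) / 100.0               # approximation
exact_change = prob(1.01 * x, z, alpha, beta, gamma) - p     # actual 1% change
print(f"approximation={semi_elasticity:.7f}  exact={exact_change:.7f}")
```

The two numbers agree closely but not exactly, because the semi-elasticity is a first-order approximation.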

Here's an example in Stata.* Note that I am using margins instead of the out-of-date mfx to get the average marginal effect of $x$, $\frac{1}{N}\Sigma_{i=1}^N\frac{\beta \cdot p_i \cdot (1-p_i)}{100}$:

. sysuse auto, clear
(1978 Automobile Data)

. gen ln_price = ln(price)

. logit foreign ln_price mpg weight, nolog

Logistic regression                             Number of obs     =         74
                                                LR chi2(3)        =      57.69
                                                Prob > chi2       =     0.0000
Log likelihood = -16.185932                     Pseudo R2         =     0.6406

------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ln_price |   6.851215    2.11763     3.24   0.001     2.700737    11.00169
         mpg |  -.0880842   .1031317    -0.85   0.393    -.2902186    .1140503
      weight |  -.0062268   .0017269    -3.61   0.000    -.0096115   -.0028422
       _cons |  -41.32383   16.24003    -2.54   0.011    -73.15371   -9.493947
------------------------------------------------------------------------------

. margins, expression(_b[ln_price]*predict()*(1-predict())/100)

Predictive margins                              Number of obs     =         74
Model VCE    : OIM

Expression   : _b[ln_price]*predict()*(1-predict())/100

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0046371   .0007965     5.82   0.000      .003076    .0061982
------------------------------------------------------------------------------

This means that for a 1% increase in price, the probability that a car is foreign increases by about 0.005 on a [0,1] scale. Equivalently, a 10% increase in price gives you roughly a 0.05 increase. In this data, about 0.3 of the cars are foreign, so these effects are economically and statistically significant.

Edit:

A good way to do this in Stata 10 is to install the user-written command margeff:

. margeff, dydx(ln_price) replace

Average partial effects after margeff
      y  = Pr(foreign) 

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ln_price |   .4637103   .0796514     5.82   0.000     .3075964    .6198241
         mpg |  -.0059616    .006781    -0.88   0.379    -.0192522     .007329
      weight |  -.0004214   .0000417   -10.11   0.000    -.0005031   -.0003398
------------------------------------------------------------------------------

. lincom _b[ln_price]/100

 ( 1)  .01*ln_price = 0

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .0046371   .0007965     5.82   0.000      .003076    .0061982
------------------------------------------------------------------------------

*This is actually not a great empirical example since the relationship in the data has an inverted-U shape.

Solved – How to interpret marginal effects of dummy variable in logit regression

It is easier to think about interpreting your dichotomous predictors by using the concept of the odds ratio.

Let me give you an example: imagine you are trying to predict smoking status, where the smoking variable is 1 if you smoke and 0 if you don't (a dichotomous outcome, so we can use logistic regression). Now, as in your case, imagine that you have a predictor variable called white, which is 1 if you are white and 0 if you are not. In this example, you can fit a logistic regression model that looks something like this:

$$\text{logit}(p)=\beta_0+\beta_1\times \text{white}$$

And now, let's assume that you get an estimate of $\beta_1=-0.5108256$. Converting the estimate to the odds ratio scale is as simple as exponentiating the parameter estimate, i.e., on the odds ratio scale we have $e^{\beta_1}=e^{-0.5108256}=0.6$. What this tells us is that if you are white, your odds of being a smoker are expected to be 0.6 times the odds of someone who is not white, i.e., 40% lower. Note that this is a statement about odds, not about probabilities.
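
The arithmetic can be sketched in Python. The coefficient is the one quoted above; the baseline smoking probability of 0.3 for non-whites is a hypothetical value chosen only to show how an odds ratio translates into probabilities.

```python
import math

beta_1 = -0.5108256               # coefficient on `white` from the example
odds_ratio = math.exp(beta_1)     # approximately 0.6

# Hypothetical baseline: probability of smoking for non-whites.
p_nonwhite = 0.3
odds_nonwhite = p_nonwhite / (1.0 - p_nonwhite)

# Apply the odds ratio, then convert the odds back to a probability.
odds_white = odds_ratio * odds_nonwhite
p_white = odds_white / (1.0 + odds_white)

print(f"odds ratio        = {odds_ratio:.4f}")
print(f"p(smoke | white)  = {p_white:.4f}")
print(f"probability ratio = {p_white / p_nonwhite:.4f}")
```

Under this assumed baseline the ratio of probabilities is not 0.6, which is why an odds ratio should be read as a multiplier on the odds rather than on the probability.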

And so to answer your direct question, you wouldn't say that "a 1% increase in being white affects your probability of the dependent variable by x amount", but rather that the odds of observing the dependent variable are "y" times as large for someone who is white as for someone who is not.
