Logistic – How to Calculate Marginal Effect by Hand with Logit and Dummy Variables

logistic, marginal-effect

I have the following dilemma:

I roughly understand what marginal effects are, how they are calculated, how the sigmoid function is derived, and how to interpret them (as the change in probability from increasing the variable of interest by "a little bit", this little bit being 1 for discrete variables or std(x)/1000 for continuous ones). The part I find tricky is corroborating the marginal effects by hand: recalculating the probabilities for x=0 and then x=1 (for example) and getting a difference in probability equal to the marginal effect I got earlier. I am particularly stuck with dummy variables, since if I increase one I have to decrease the other, so I am not sure how to work around that or how to interpret it. (This question also applies to highly correlated variables.)

To make this clearer, let's say I have the following dataset:

#Python 

[1. , 0. , 0. , 4.6, 3.1, 1.5, 0.2],
[1. , 0. , 1. , 5. , 3.6, 1.4, 0.2],
[1. , 1. , 0. , 5.4, 3.9, 1.7, 0.4],
[1. , 0. , 1. , 4.6, 3.4, 1.4, 0.3],
[1. , 1. , 0. , 5. , 3.4, 1.5, 0.2],
[1. , 0. , 0. , 4.4, 2.9, 1.4, 0.2],
[1. , 0. , 1. , 4.9, 3.1, 1.5, 0.1],
[1. , 1. , 0. , 5.4, 3.7, 1.5, 0.2],
...

Var_0 = the constant column that gives the intercept.

Var_1, Var_2 = binary dummies (two of three categories; the third is dropped to avoid collinearity).

Var_3+ = ordinary continuous variables.

Coefficients:

[ 7.56986405,  0.75703164,  0.27158741, -0.37447474, -2.79926022, 1.43890492, -2.95286947]

logit

[-3.34739217,
 -2.27001103, 
-1.49517926, 
-0.77178644, 
-0.808111, 
-2.48474722, 
-1.76183804, 
-0.90621541
...]

Probabilities

[0.03398066, 
0.09363728, 
0.18314562,
0.31609279, 
0.30829318,
0.0769344 , 
0.14656029, 
0.28777491,
...]
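As a quick hand check of this step, here is a minimal sketch in plain Python (standard library only) that passes the eight logits listed above through the logistic function. The names `sigmoid`, `logits` and `probs` are just illustrative.

```python
import math

# The eight logits listed above (the "..." rows are not available).
logits = [-3.34739217, -2.27001103, -1.49517926, -0.77178644,
          -0.808111, -2.48474722, -1.76183804, -0.90621541]


def sigmoid(z):
    """Logistic function: maps a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


probs = [sigmoid(z) for z in logits]

for z, p in zip(logits, probs):
    print(f"logit = {z: .8f}   p = {p:.8f}")
```

The printed probabilities should agree with the list above up to rounding.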

Marginal effect = p*(1-p) * B_j

Now let's say that I am interested in the marginal effect of var_1 (one of the dummies). I simply compute: p*(1-p) * 0.7570

This gives an array of length n (the number of observations) with a different marginal effect for each observation (which is fine, because I understand that the effects are non-constant and non-linear). Let's say the values in this array range from 0.0008 to 0.0495.
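For concreteness, a minimal sketch of that calculation in plain Python, applied only to the eight probabilities listed above. The variable names are illustrative, and because only eight rows are available the values it produces are not meant to reproduce the full-sample range quoted here.

```python
# Probabilities listed above for the first eight observations.
probs = [0.03398066, 0.09363728, 0.18314562, 0.31609279,
         0.30829318, 0.0769344, 0.14656029, 0.28777491]

# Coefficient on var_1, taken from the coefficient vector above.
beta_var1 = 0.75703164

# Derivative-based marginal effect, one value per observation:
# dp/dx_j = p * (1 - p) * beta_j
marginal_effects = [p * (1.0 - p) * beta_var1 for p in probs]

for p, me in zip(probs, marginal_effects):
    print(f"p = {p:.8f}   marginal effect = {me:.6f}")

print("smallest:", min(marginal_effects))
print("largest: ", max(marginal_effects))
```

Since p*(1-p) can never exceed 0.25, each value is bounded above by beta_j / 4, which is a handy sanity check when doing this by hand.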

Now the problem is: how can I verify these results? How can I measure the marginal effect when the dummy goes from 0 to 1?

You could argue that I could try two methods, MEM and AME:

  1. MEM: Set all the other variables to their means and then recalculate the probability for var_1 = 0 and for var_1 = 1.

    (You can't really do this, because you would be assuming that some
    observations can have var_1 and var_2 equal to 1 at the same time,
    which is incorrect: the mean of a dummy is just the proportion of "1s"
    in that column.)

  2. AME: Leave the other variables as observed, but set var_1 to 0 for every observation (and var_2 to 1), then do the opposite (var_1 = 1, var_2 = 0; you have to do this because an observation can't belong to two categories at the same time), and then average the results. (Side comment: I am not sure whether this should be the average of the difference in marginal effects between var_1 = 0 and var_1 = 1, or the average of the difference in probabilities between var_1 = 0 and var_1 = 1. I tried both, but the difference in probabilities makes more sense to me.)

Now, if I try the second approach I get results very different from what I originally got (values between 0.0008 and 0.0495): it gives me values between 0.0022 and 0.1207, which is a massive difference.

To summarise:

  1. How can I mathematically corroborate the values I got initially from the theoretical formula marginal effect = p*(1-p)*B_j, which were between 0.0008 and 0.0495? There should be a method that arrives at the same numbers (I am using AME at the moment).

  2. How should I interpret these original values in the first place? If I take 0.0495, I am basically saying that if I increase var_1 by one unit (from 0 to 1), the probability of my event happening increases by 4.95 percentage points. The problem is that this doesn't account for the fact that, to make the one-unit increase, I necessarily have to decrease the other dummy variable (var_2), so I would be making a double change in the variables, something like two marginal effects at the same time.

Best Answer

For connected dummies $d$ and $x$, you might want to calculate this average of finite differences:

$$AME_x =\frac{1}{N} \cdot \sum_{i=1}^N \left[ \hat p(d=1,x=0,z=z_i)-\hat p(d=0,x=1,z=z_i) \right],$$

where $\hat p(.)$ is the predicted probability from the logit model and $z_i$ are the remaining covariates of observation $i$, left as observed. I don't know what encoding you are using, so substitute your own coding for the ones and zeros above.
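A minimal sketch of that average in plain Python, using the coefficients and the eight rows shown in the question. The mapping of $d$ to var_1 (column 1) and $x$ to var_2 (column 2) is my assumption about the encoding, and the helper names `predict` and `set_dummies` are hypothetical.

```python
import math

# Coefficients from the question: intercept, var_1, var_2, then continuous vars.
beta = [7.56986405, 0.75703164, 0.27158741, -0.37447474,
        -2.79926022, 1.43890492, -2.95286947]

# The eight rows shown in the question (the rest of the data is not available).
X = [
    [1.0, 0.0, 0.0, 4.6, 3.1, 1.5, 0.2],
    [1.0, 0.0, 1.0, 5.0, 3.6, 1.4, 0.2],
    [1.0, 1.0, 0.0, 5.4, 3.9, 1.7, 0.4],
    [1.0, 0.0, 1.0, 4.6, 3.4, 1.4, 0.3],
    [1.0, 1.0, 0.0, 5.0, 3.4, 1.5, 0.2],
    [1.0, 0.0, 0.0, 4.4, 2.9, 1.4, 0.2],
    [1.0, 0.0, 1.0, 4.9, 3.1, 1.5, 0.1],
    [1.0, 1.0, 0.0, 5.4, 3.7, 1.5, 0.2],
]


def predict(row):
    """Predicted probability from the logit model for one row."""
    z = sum(b * v for b, v in zip(beta, row))
    return 1.0 / (1.0 + math.exp(-z))


def set_dummies(row, d, x):
    """Copy of row with var_1 = d and var_2 = x (assumed encoding)."""
    new_row = list(row)
    new_row[1] = d
    new_row[2] = x
    return new_row


# Finite difference per observation: p(d=1, x=0, z_i) - p(d=0, x=1, z_i),
# with the continuous covariates z_i left as observed.
diffs = [predict(set_dummies(r, 1.0, 0.0)) - predict(set_dummies(r, 0.0, 1.0))
         for r in X]

ame = sum(diffs) / len(diffs)
print("per-observation differences:", diffs)
print("average finite difference:  ", ame)
```

This is a difference of two predicted probabilities, so it will not in general equal the derivative-based p*(1-p)*B_j; the two only agree approximately when the change in the logit is small.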

Note that std(x)/1000 for continuous variables is not quite right. If you recall the definition of a derivative, it is the limit as the change goes to zero. You are considering an infinitesimally small perturbation, not one of a particular size that depends on the SD of x.
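To illustrate the limit, here is a small sketch that compares forward differences of shrinking step size with the analytic derivative $\hat p(1-\hat p)\beta_j$. It uses the first row and the coefficients from the question; the choice of var_3 (column index 3), the step sizes, and the helper names are my own assumptions for illustration.

```python
import math

beta = [7.56986405, 0.75703164, 0.27158741, -0.37447474,
        -2.79926022, 1.43890492, -2.95286947]
row = [1.0, 0.0, 0.0, 4.6, 3.1, 1.5, 0.2]  # first row shown in the question
j = 3  # column index of var_3, the first continuous variable


def predict(r):
    """Predicted probability from the logit model for one row."""
    z = sum(b * v for b, v in zip(beta, r))
    return 1.0 / (1.0 + math.exp(-z))


p = predict(row)
analytic = p * (1.0 - p) * beta[j]  # derivative-based marginal effect


def forward_difference(h):
    """Change in probability per unit of x_j for a step of size h."""
    bumped = list(row)
    bumped[j] += h
    return (predict(bumped) - p) / h


steps = [1.0, 1e-1, 1e-3, 1e-5]
errors = [abs(forward_difference(h) - analytic) for h in steps]

for h, err in zip(steps, errors):
    print(f"h = {h:<8g} finite difference = {forward_difference(h): .8f}   "
          f"|error| = {err:.2e}")
print(f"analytic derivative      = {analytic: .8f}")
```

The gap between the finite difference and the analytic value shrinks with the step, which is the sense in which the marginal effect is a limit and not a change of any fixed size.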