Here's how you might do this. The key step is to make four predictions, keeping the demographics the same, but with all four combinations of the treatment and policy indicators. Then you difference the means of the adjusted predictions to get the DID effect. Stata's margins command makes this easy, but it could also be done by hand.
Here is an example using the famous Card and Krueger minimum wage data, where we adjust for the chain of the fast food restaurant. NJ restaurants make up the treated group, and we have two periods.
Using data that everyone has access to is good. Adjusting your standard errors to reflect that you have panel data is also good (the cluster(id) option). Using factor-variable notation rather than hardcoding interactions also makes this easier. The problem with your approach is that Stata is not aware that the variable interaction is related to effdate and aidetype in any way, so margins does not alter the interaction when you change its components. I will do all three in what follows.
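In equation form, the quantity computed below is just a difference of four adjusted means, each averaged over the covariates $z_i$ (here the chain dummies):
$$\widehat{DID} = \left[\bar{\hat y}(1,1) - \bar{\hat y}(1,0)\right] - \left[\bar{\hat y}(0,1) - \bar{\hat y}(0,0)\right], \quad \text{where } \bar{\hat y}(a,b) = \frac{1}{N}\sum_{i=1}^N \hat y(\text{treated}=a,\, t=b,\, z_i).$$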
Here's the output:
. /* fix sample data */
. use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta, clear
(Dataset from Card&Krueger (1994))
. drop if id == 407 // duplicate restaurant
(4 observations deleted)
. xtset id t
panel variable: id (strongly balanced)
time variable: t, 0 to 1
delta: 1 unit
. drop if missing(fte)
(19 observations deleted)
. bysort id: keep if _N==2
(19 observations deleted)
.
. /* DID */
. reg fte i.treated##i.t bk kfc roys, cluster(id)
Linear regression Number of obs = 778
F(6, 388) = 42.68
Prob > F = 0.0000
R-squared = 0.1888
Root MSE = 8.2224
(Std. Err. adjusted for 389 clusters in id)
------------------------------------------------------------------------------
| Robust
fte | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
treated |
NJ | -2.395587 1.297017 -1.85 0.066 -4.945647 .1544733
1.t | -2.523333 1.25305 -2.01 0.045 -4.98695 -.0597162
|
treated#t |
NJ#1 | 2.972378 1.337205 2.22 0.027 .343304 5.601452
|
bk | .8513832 1.117792 0.76 0.447 -1.346304 3.04907
kfc | -9.291772 1.075389 -8.64 0.000 -11.40609 -7.177453
roys | -1.051149 1.307334 -0.80 0.422 -3.621495 1.519197
_cons | 21.38843 1.43011 14.96 0.000 18.57669 24.20016
------------------------------------------------------------------------------
. margins, at(t = (0 1) treated = (0 1))
Predictive margins Number of obs = 778
Model VCE : Robust
Expression : Linear prediction, predict()
1._at : treated = 0
t = 0
2._at : treated = 0
t = 1
3._at : treated = 1
t = 0
4._at : treated = 1
t = 1
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at |
1 | 19.60145 1.211968 16.17 0.000 17.2186 21.9843
2 | 17.07812 .7967745 21.43 0.000 15.51158 18.64465
3 | 17.20586 .4570743 37.64 0.000 16.30721 18.10452
4 | 17.65491 .4561423 38.70 0.000 16.75809 18.55173
------------------------------------------------------------------------------
. margins t#treated, nopvalues // opaque syntax, but better labeling of output
Predictive margins Number of obs = 778
Model VCE : Robust
Expression : Linear prediction, predict()
--------------------------------------------------------------
| Delta-method
| Margin Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
t#treated |
0#PA | 19.60145 1.211968 17.2186 21.9843
0#NJ | 17.20586 .4570743 16.30721 18.10452
1#PA | 17.07812 .7967745 15.51158 18.64465
1#NJ | 17.65491 .4561423 16.75809 18.55173
--------------------------------------------------------------
. marginsplot // graph the effect
Variables that uniquely identify margins: t treated
. margins r.treated#r.t // calculate DID effect
Contrasts of predictive margins
Model VCE : Robust
Expression : Linear prediction, predict()
------------------------------------------------
| df F P>F
-------------+----------------------------------
treated#t | 1 4.94 0.0268
|
Denominator | 388
------------------------------------------------
----------------------------------------------------------------------
| Delta-method
| Contrast Std. Err. [95% Conf. Interval]
---------------------+------------------------------------------------
treated#t |
(NJ vs PA) (1 vs 0) | 2.972378 1.337205 .343304 5.601452
----------------------------------------------------------------------
.
. /* Replicate adjusted mean for PA at t = 0 */
. gen fte_PA_t0 = _b[_cons] + _b[bk]*bk + _b[kfc]*kfc + _b[roys]*roys
. sum fte_PA_t0
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
fte_PA_t0 | 778 19.60145 3.864352 12.09665 22.23981
.
. /* Check by Hand Using Adjusted Means From Above */
. di "DID is " (17.65491- 17.20586) - (17.07812 - 19.60145)
DID is 2.97238
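The same trick replicates the other three cells. For example, the adjusted mean for NJ at t = 1 can be rebuilt from the coefficients (a minimal sketch, assuming the regression above is still in memory; the mean should match the fourth margin, 17.65491):
/* Replicate adjusted mean for NJ at t = 1 */
gen fte_NJ_t1 = _b[_cons] + _b[1.treated] + _b[1.t] + _b[1.treated#1.t] ///
    + _b[bk]*bk + _b[kfc]*kfc + _b[roys]*roys
sum fte_NJ_t1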
Stata is smart enough to ignore the at()
assignment for x when you calculate the AME for x (since otherwise you would get a zero). In the end, you have asked Stata to calculate this average of finite differences:
$$AME_x = \frac{1}{N}\sum_{i=1}^N \left[ \hat p(x=1,y=1,z=z_i)-\hat p(x=0,y=1,z=z_i) \right],$$
where $\hat p(.)$ is the predicted probability from the logit model. Stata used differences here rather than derivatives since all your regressors are binary/categorical.
This is probably not a very sensible AME, but perhaps you have your reasons for doing it this way. I am calling this an AME, but it is actually a hybrid of AME and MER (marginal effect at representative values).
Here's a toy example showing the margins calculation by hand:
. sysuse auto, clear
(1978 Automobile Data)
. gen high_mpg = mpg>20
. gen high_rep = rep78>3
. gen heavy = weight>3000
.
. /* AME using margins */
. logit foreign i.(high_mpg heavy high_rep), nolog
Logistic regression Number of obs = 74
LR chi2(3) = 37.57
Prob > chi2 = 0.0000
Log likelihood = -26.246142 Pseudo R2 = 0.4172
------------------------------------------------------------------------------
foreign | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.high_mpg | -1.118024 1.307539 -0.86 0.393 -3.680754 1.444706
1.heavy | -3.673601 1.417986 -2.59 0.010 -6.452802 -.8944001
1.high_rep | 2.245017 .7705583 2.91 0.004 .7347502 3.755283
_cons | -.2405401 1.332215 -0.18 0.857 -2.851634 2.370554
------------------------------------------------------------------------------
. margins, dydx(high_mpg) at(high_mpg = 1 heavy = 1)
Average marginal effects Number of obs = 74
Model VCE : OIM
Expression : Pr(foreign), predict()
dy/dx w.r.t. : 1.high_mpg
at : high_mpg = 1
heavy = 1
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.high_mpg | -.053257 .0519245 -1.03 0.305 -.155027 .0485131
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
.
. /* Calculate the same average marginal effect in-sample for high_mpg as above */
. /* (a) ME = phat(high_mpg=1, heavy=1, high_rep at own value) */
. /* - phat(high_mpg=0, heavy=1, high_rep at own value) */
. gen double high_mpg_me = ///
> [ exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
> (1+exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep))] ///
> -[ exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
> (1+exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep))]
.
. /* (b) Calculate the average marginal effect (AME) */
. sum high_mpg_me, meanonly
. display "High MPG AME = " %9.6f r(mean)
High MPG AME = -0.053257
According to this model, when all cars are assumed to be heavy but have their actual in-sample values of high repair record, the probability of the car being foreign falls by 5.3 percentage points when it is high MPG (relative to low MPG).
Stata Code:
cls
sysuse auto, clear
gen high_mpg = mpg>20
gen high_rep = rep78>3
gen heavy = weight>3000
/* AME using margins */
logit foreign i.(high_mpg heavy high_rep), nolog
margins, dydx(high_mpg) at(high_mpg = 1 heavy = 1)
/* Calculate the same average marginal effect in-sample for high_mpg as above */
/* (a) ME = phat(high_mpg=1, heavy=1, high_rep at own value) */
/* - phat(high_mpg=0, heavy=1, high_rep at own value) */
gen double high_mpg_me = ///
[ exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
(1+exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep))] ///
-[ exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
(1+exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep))]
/* (b) Calculate the average marginal effect (AME) */
sum high_mpg_me, meanonly
di "High MPG AME = " %9.6f r(mean)
Best Answer
Stack the data from the two time periods, as you have done, but don't run separate models for each period. Use a dummy for time, and interaction terms as appropriate. Try this:
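Something along these lines, where y, x, period, and id are placeholders for your outcome, predictor, time dummy, and panel identifier (substitute your actual variable names):
/* Pooled logit with a time dummy interacted with the predictor */
logit y i.period##c.x, vce(cluster id)   // use i.x instead of c.x if x is categorical
margins, dydx(x) at(period = (0 1))      // marginal effect of x in each period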
This will tell you whether period is significant and whether it moderates the effect of x on y. You can then run your margins statement appropriately. You also have to be careful when interpreting interaction terms in logit models, because of their nonlinearity. See this reference for a detailed explanation:
Norton, E. C., Wang, H., & Ai, C. (2004). Computing interaction effects and standard errors in logit and probit models. Stata Journal, 4, 154-167.
This is kind of a contentious area, and a fair bit has been written since 2004, so you should do more digging. I do believe the current implementation of margins in Stata takes care of this for you, but it would be good to be aware of the issues.

One other comment: for nonlinear models, it can be dangerous to compare coefficients across separate samples. Logit models are sensitive to differences in the dispersion of the underlying latent variable, so if the dispersion or variance differs across the datasets, you may not get valid comparisons of coefficients. This isn't typically a concern with linear regression, but it is in a logit model. See this paper if you have access to Sage journals; if not, reading the abstract may be sufficient to understand that it's a problem: Karlson, K. B., Holm, A., & Breen, R. (2012). Comparing regression coefficients between same-sample nested models using logit and probit: A new method. Sociological Methodology, 42(1), 286-313.