I ran a logit regression with 10 explanatory variables, all of which are categorical. After fitting the model, I want to check the results for robustness. How do I go about it in Stata?
Solved – how to check for robustness for categorical variables in Stata
logit stata
Related Solutions
Given your comments I will assume that you do not want an estimate of the size of the effect but instead a statistical test whether the expected (possibly adjusted) count for each of the categories are the same. This may or may not be wise depending on your circumstances, but this is an example of how you do it in Stata:
webuse dollhill3
poisson deaths smokes i.agecat, exposure(pyears)
testparm i.agecat
If you want something like a single effect size you could look into sheaf coefficients. In case of interaction terms this generalizes to a model with parametrically weighted covariates. A brief discussion on how to do those in Stata can be found here.
Stata is smart enough to ignore the at() assignment for x when you calculate the AME for x (since otherwise you would get a zero). In the end, you have asked Stata to calculate this average of finite differences:
$$AME_x = \frac{1}{N}\sum_{i=1}^N \left[ \hat p(x=1,y=1,z=z_i)-\hat p(x=0,y=1,z=z_i) \right],$$
where $\hat p(.)$ is the predicted probability from the logit model. Stata used differences here rather than derivatives since all your regressors are binary/categorical.
This is probably not a very sensible AME, but perhaps you have your reasons for doing it this way. I am calling this an AME, but it is actually a hybrid of AME and MER (marginal effect at representative values).
Here's a toy example showing the margins calculation by hand:
. sysuse auto, clear
(1978 Automobile Data)
. gen high_mpg = mpg>20
. gen high_rep = rep78>3
. gen heavy = weight>3000
.
. /* AME using margins */
. logit foreign i.(high_mpg heavy high_rep), nolog
Logistic regression Number of obs = 74
LR chi2(3) = 37.57
Prob > chi2 = 0.0000
Log likelihood = -26.246142 Pseudo R2 = 0.4172
------------------------------------------------------------------------------
foreign | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.high_mpg | -1.118024 1.307539 -0.86 0.393 -3.680754 1.444706
1.heavy | -3.673601 1.417986 -2.59 0.010 -6.452802 -.8944001
1.high_rep | 2.245017 .7705583 2.91 0.004 .7347502 3.755283
_cons | -.2405401 1.332215 -0.18 0.857 -2.851634 2.370554
------------------------------------------------------------------------------
. margins, dydx(high_mpg) at(high_mpg = 1 heavy = 1)
Average marginal effects Number of obs = 74
Model VCE : OIM
Expression : Pr(foreign), predict()
dy/dx w.r.t. : 1.high_mpg
at : high_mpg = 1
heavy = 1
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.high_mpg | -.053257 .0519245 -1.03 0.305 -.155027 .0485131
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
.
. /* Calculate the same average marginal effect in-sample for high_mpg as above */
. /* (a) ME = phat(high_mpg=1, heavy=1, high_rep at own value) */
. /* - phat(high_mpg=0, heavy=1, high_rep at own value) */
. gen double high_mpg_me = ///
> [ exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
> (1+exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep))] ///
> -[ exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
> (1+exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep))]
.
. /* (b) Calculate the average marginal effect (AME) */
. sum high_mpg_me, meanonly
. display "High MPG AME = " %9.6f r(mean)
High MPG AME = -0.053257
According to this model, when all cars are assumed to be heavy but keep their observed in-sample values of the high repair record indicator, the probability of a car being foreign falls by 5.3 percentage points when it is high MPG (relative to low MPG).
Stata Code:
cls
sysuse auto, clear
gen high_mpg = mpg>20
gen high_rep = rep78>3
gen heavy = weight>3000
/* AME using margins */
logit foreign i.(high_mpg heavy high_rep), nolog
margins, dydx(high_mpg) at(high_mpg = 1 heavy = 1)
/* Calculate the same average marginal effect in-sample for high_mpg as above */
/* (a) ME = phat(high_mpg=1, heavy=1, high_rep at own value) */
/* - phat(high_mpg=0, heavy=1, high_rep at own value) */
gen double high_mpg_me = ///
[ exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
(1+exp(_b[_cons]+_b[1.high_mpg]+_b[1.heavy]+_b[1.high_rep]*high_rep))] ///
-[ exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep)/ ///
(1+exp(_b[_cons] +_b[1.heavy]+_b[1.high_rep]*high_rep))]
/* (b) Calculate the average marginal effect (AME) */
sum high_mpg_me, meanonly
di "High MPG AME = " %9.6f r(mean)
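The hand calculation can also be cross-checked without writing out the logistic formula, by letting predict evaluate the fitted model on modified data. This is a minimal sketch using the same toy variables; the names p1, p0, diff and the scalar ame are made up for illustration:

```stata
sysuse auto, clear
gen high_mpg = mpg>20
gen high_rep = rep78>3
gen heavy = weight>3000
logit foreign i.(high_mpg heavy high_rep), nolog

preserve
    * Fix heavy at 1 for every car; high_rep keeps its observed value
    replace heavy = 1
    * Predicted probability with high_mpg switched on ...
    replace high_mpg = 1
    predict double p1, pr
    * ... and switched off
    replace high_mpg = 0
    predict double p0, pr
    * Average the per-car discrete change
    gen double diff = p1 - p0
    summarize diff, meanonly
    scalar ame = r(mean)
restore
display "High MPG AME = " %9.6f scalar(ame)
```

If the two approaches agree, the displayed value should match the margins result reported above.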
Best Answer
In general, what econometricians refer to as a "robustness check" is a check on the change of some coefficients when we add or drop covariates. In linear regression models, this is pretty easy.
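For the linear case, such a check can be as simple as fitting the model with and without the extra covariates and putting the coefficient of interest side by side. A minimal sketch on Stata's auto data; the choice of variables and the names m_base and m_full are purely illustrative:

```stata
sysuse auto, clear

* Baseline model: coefficient of interest is on mpg
regress price mpg
estimates store m_base

* Same model with additional covariates
regress price mpg weight i.foreign
estimates store m_full

* Compare the mpg coefficient and its standard error across models
estimates table m_base m_full, keep(mpg) b(%9.3f) se(%9.3f)
```

In a linear model the two mpg coefficients are on the same scale, so a large change between the columns can be read directly as sensitivity to the added covariates.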
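However, in a logit (or any other non-linear probability model), it is actually quite hard, because the coefficients are identified only relative to the residual variance: they change in size with the total amount of variation explained by the model, even when the added covariates are unrelated to the variable of interest.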
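A sketch of why this happens, using the usual latent-variable formulation of the logit. The symbols $y^*$, $\sigma_F$, $\sigma_R$, $b_F$ and $b_R$ are notation introduced here for illustration, and treating the error of the reduced model as logistic is an approximation:

$$y^* = \alpha_F + \beta_F x + \gamma z + \sigma_F u, \qquad y = 1[y^* > 0] \quad \text{(full model)},$$
$$y^* = \alpha_R + \beta_R x + \sigma_R v \quad \text{(reduced model, } z \text{ omitted)},$$

where $u$ and $v$ are standard logistic errors. The logit does not estimate $\beta_F$ and $\beta_R$ themselves but the rescaled quantities $b_F = \beta_F/\sigma_F$ and $b_R = \beta_R/\sigma_R$. Dropping $z$ moves its variation into the error term, so $\sigma_R \ge \sigma_F$, and

$$b_R - b_F = \frac{\beta_R}{\sigma_R} - \frac{\beta_F}{\sigma_F}$$

mixes a genuine change in the coefficient with a pure change of scale.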
A solution for this was proposed by the sociologists Holm, Karlson & Breen (Sociological Methodology, 2012; Sociological Methods & Research, 2013). It is implemented in Stata via the user-written khb command.
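A minimal sketch of what a call might look like, assuming khb is installed from SSC and using made-up binary indicators from the auto data. Here high_mpg is the key variable, heavy is the covariate whose inclusion is being checked, and high_rep is a control kept in both models; treat the exact syntax as an assumption and check help khb:

```stata
* Install khb from SSC if it is not already available
capture which khb
if _rc ssc install khb

sysuse auto, clear
gen high_mpg = mpg>20
gen heavy = weight>3000
* Restrict to observed rep78 so missing values are not coded as 1
gen high_rep = rep78>3 if !missing(rep78)

* Compare the high_mpg coefficient with and without heavy,
* holding the scale of the logit constant across models
khb logit foreign high_mpg || heavy, concomitant(high_rep)
```

The output reports the coefficient of the key variable in the reduced and full models on a common scale, together with their difference, which is the quantity a robustness check is after.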