Logit – comparison of predicted probabilities

logistic, logit, marginal-effect, regression-coefficients, stata

I am analyzing, for two different time periods, the probability that an individual will have outcome Y (=1 or 0) given that an event X has occurred (=1 or 0). A number of demographic variables are also included, such as age, gender, etc.

I am interested in looking at very specific cases (and interpreting the results): I look at differences in Y given X, holding all other variables at specific values, and do this for time period=0 and then for time period=1.

I would like to know if there is a way to compare the results across these two time periods. The models have the same variables.

I am using Stata 13 and my code looks similar to this:

svy: logit y x##c.age x##female if period==0
margins, level(90) at(age=50 female=1 x=(0 1))

Then I do the same estimation for period == 1:

svy: logit y x##c.age x##female if period==1
margins, level(90) at(age=26 female=0 x=(0 1))

Best Answer

Stack the data from the two time periods, as you have done, but don't run them separately for the time periods. Use a dummy for time, and interaction terms as appropriate. Try this:

svy: logit y x##c.age x##female x##period 

This will tell you whether period is significant and whether it moderates x's effect on y. You can then run your margins statement appropriately. You also have to be careful in interpreting interaction terms in logit models because of their nonlinearity. See this reference for a detailed explanation:

Norton, E. C., Wang, H., & Ai, C. (2004). Computing interaction effects and standard errors in logit and probit models. Stata Journal, 4, 154-167.

This is kind of a contentious area and a bit has been written since 2004, though, so you should do more digging. I do believe the current implementation of margins in Stata takes care of this for you, but it would be good to be aware of the issues.
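As a rough sketch of what the margins step after the pooled model might look like (not necessarily the only or best way to do it): the commands below assume the data are already svyset, that x, female and period are 0/1 variables, and they reuse the illustrative profile age=50, female=1 from the question. The contrast on the last line is one possible way to test whether the effect of x on the predicted probability differs between periods.

```stata
* Pooled model: period enters as a dummy and is interacted with x
svy: logit y i.x##c.age i.x##i.female i.x##i.period

* Predicted Pr(y=1) for the chosen profile, for each x-by-period cell
margins x#period, at(age=50 female=1) level(90)

* Discrete change in Pr(y=1) when x goes from 0 to 1, within each period
margins, dydx(x) at(age=50 female=1 period=(0 1)) level(90)

* Difference between periods in that discrete change (period 1 vs period 0)
margins r.period, dydx(x) at(age=50 female=1) level(90)
```

The last command works on the probability scale, so it answers the question "does the X effect on Pr(Y=1) differ by period for this profile?" rather than "is the interaction coefficient nonzero?", which need not have the same answer in a logit model.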

One other comment: for nonlinear models it can be dangerous to compare coefficients across separate samples. Logit models are sensitive to differences in the dispersion of the underlying latent variable, so if the dispersion or variance differs across the datasets, you may not get valid comparisons of coefficients. This isn't typically a concern with linear regression, but it is in a logit model. See this paper if you have access to Sage journals; if not, reading the abstract may be sufficient to understand it's a problem: Karlson, K. B., Holm, A., & Breen, R. (2012). Comparing Regression Coefficients Between Same-sample Nested Models Using Logit and Probit: A New Method. Sociological Methodology, 42(1), 286-313.
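To see why the dispersion matters, here is a short sketch of the usual latent-variable argument. The notation is an assumption made for illustration and is not taken from the cited paper: p indexes the period, and \sigma_p is a period-specific scale of the error term.

```latex
% Latent-variable form of the logit model in period p (p = 0, 1)
y_i^{*} = \alpha_p + \beta_p x_i + \sigma_p u_i, \qquad
u_i \sim \text{standard logistic}, \qquad
y_i = \mathbf{1}\left[\, y_i^{*} > 0 \,\right]

% Implied response probability
\Pr(y_i = 1 \mid x_i) = \Lambda\!\left( \frac{\alpha_p + \beta_p x_i}{\sigma_p} \right), \qquad
\Lambda(z) = \frac{e^{z}}{1 + e^{z}}

% What the fitted logit coefficient on x actually estimates
b_p = \frac{\beta_p}{\sigma_p}
```

Only the ratio \beta_p / \sigma_p is identified, so b_0 and b_1 can differ even when \beta_0 = \beta_1, simply because \sigma_0 \neq \sigma_1. The predicted probabilities depend only on that ratio, which is one reason comparisons on the probability scale (as margins produces) are often considered less exposed to this problem than raw coefficient comparisons.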
