I have two research questions. The dependent variable in the first is binary, and I estimate it with logit regression in Stata. The dependent variable in the second is a proportion (between 0 and 1, inclusive), and I estimate it with fractional response regression in Stata. In both research questions, my key explanatory variable is expressed as a percentage. I am a bit confused and need help interpreting the coefficient.
I have read that if both the dependent and independent variables are expressed as percentages, then we can interpret the coefficient in percentage points. I estimated the margins after the fractional regression using the margins dydx command in Stata. If the coefficient is -0.5, do we interpret it as follows: with a one percentage point increase in X, Y decreases by 0.5 percentage points? Am I correct?
Secondly, in the case of logit regression, the dependent variable is binary (1 if the event occurs and 0 otherwise), so I am predicting the likelihood of an event. I am interested in interpreting the margins in percentage points. I estimated the margins after the logit regression using the margins dydx command in Stata. The coefficient is -0.88. How do I interpret this coefficient? Is it correct to say that with a one percentage point increase in X, the likelihood of the event falls by 0.88 percentage points? I am unsure whether it should be read as 88 percentage points or 0.88 percentage points.
Can someone explain this in simple terms?
Best Answer
In both of these models, your outcome and explanatory variable of interest lie in [0,1]. When you calculate the average marginal effect, you are getting the average change in the outcome associated with a 1-unit increase in the explanatory variable. A one-unit change in X is very large if X is between 0 and 1 and may not even make sense: what does going from 0.8 to 1.8 even mean when 1 is the max?
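To make "average marginal effect" concrete, here is a minimal sketch in plain Python (not Stata) of what that average looks like when the conditional mean is logistic, as it is in a logit or a logit-link fractional regression. The coefficients `b0` and `b1` and the data `xs` are made-up values for illustration, not estimates from any model in this post.

```python
import math

def logistic(z):
    """Inverse logit: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def average_marginal_effect(xs, b0, b1):
    """Average over observations of dE[y|x]/dx for a logistic mean.

    For E[y|x] = logistic(b0 + b1*x), the derivative is b1 * p * (1 - p),
    where p is the predicted value at that observation.
    """
    effects = []
    for x in xs:
        p = logistic(b0 + b1 * x)
        effects.append(b1 * p * (1.0 - p))
    return sum(effects) / len(effects)

# Hypothetical data and coefficients (assumptions for illustration only).
xs = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
b0, b1 = 1.0, 0.8

ame = average_marginal_effect(xs, b0, b1)
print(f"AME per 1-unit change in x: {ame:.4f}")
# With x confined to [0, 1], a 1-unit change spans the entire range of x.
print(f"Range of x in the sample: {max(xs) - min(xs):.2f} units")
```

The number printed is a slope per whole unit of x, which is why it needs rescaling before it says anything about a one percentage point change.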
This means you might want to either
Here is an example where the outcome is the participation rate in the 401(k) plan at 4,075 firms in the US. A 401(k) is a type of retirement savings account where employees have to opt in, or can opt out if enrollment is automatic.
The explanatory variable of interest is the employer match rate per dollar saved by the employee: 0.5 means the match is 50 cents for every dollar saved, 0 means the employer does not match anything, and 1 means 1:1 match, a doubling. Values above 1 are possible here.
You would expect to see a higher enrollment rate in firms where the employers match more generously. That is exactly what we see:
The interpretation of the first margins result is that the participation rate increases by 0.145 (14.5 percentage points) when the match rate increases by 1. Since the baseline is 0.84, that basically means everyone is expected to participate after the change. But that is a very large change, since the mean match rate is 0.46 and the max is 2. Less than 5% of firms match at that new rate.
I next implement the two approaches I suggested. If we consider a one-penny increase in the match rate, then participation only increases by 0.00145, or about 0.145 percentage points. Equivalently, an increase of 10 cents is associated with a 1.45 percentage point higher participation rate. Generally, you would expect all of these to be close, but that is not always the case when things are very nonlinear.
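The code for the two approaches is not reproduced here, so the following is only a sketch of one plausible reading: (A) multiply the average derivative by a small change, and (B) compute the average change in the prediction directly (a finite difference). It is plain Python rather than Stata. The 0.145 figure comes from the text above, while `b0`, `b1`, and `xs` are hypothetical values chosen for illustration.

```python
import math

def logistic(z):
    """Inverse logit: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def rescaled_derivative(xs, b0, b1, delta):
    """Approach A: average derivative of the logistic mean, times delta."""
    derivs = []
    for x in xs:
        p = logistic(b0 + b1 * x)
        derivs.append(b1 * p * (1.0 - p))
    return delta * sum(derivs) / len(derivs)

def finite_difference(xs, b0, b1, delta):
    """Approach B: average change in the prediction when x rises by delta."""
    diffs = [logistic(b0 + b1 * (x + delta)) - logistic(b0 + b1 * x) for x in xs]
    return sum(diffs) / len(diffs)

# Arithmetic from the text: an effect of 0.145 per unit of match rate.
ame = 0.145
print(f"1 cent:   {ame * 0.01:.5f} = {ame * 0.01 * 100:.3f} percentage points")
print(f"10 cents: {ame * 0.10:.5f} = {ame * 0.10 * 100:.3f} percentage points")

# Hypothetical model and data (assumptions, not the 401(k) estimates).
xs = [0.0, 0.25, 0.5, 1.0, 2.0]
b0, b1 = 1.0, 0.8
for delta in (0.01, 0.10, 1.0):
    a = rescaled_derivative(xs, b0, b1, delta)
    b = finite_difference(xs, b0, b1, delta)
    print(f"delta={delta:<4} rescaled={a:.5f} finite_diff={b:.5f}")
```

For small changes the two numbers agree closely. For a full 1-unit change they drift apart because the logistic curve bends over that distance, which is the nonlinearity caveat mentioned above.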
The logit case is identical to the fractional regression, so I will omit that.
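Applied to the numbers in the question, the same rescaling can be sketched as a small unit-conversion helper in plain Python. This is a linear approximation, and the function name and the `x_max` argument are my own hypothetical choices. The key assumption is how X is stored: as a proportion in [0, 1], which is what this answer assumes, or as a percent in [0, 100].

```python
def effect_in_percentage_points(dydx, x_change_pp, x_max=1.0):
    """Linear approximation of the change in the outcome, in percentage points.

    dydx        : marginal effect on an outcome measured on the 0-1 scale
    x_change_pp : change in x, expressed in percentage points
    x_max       : 1.0 if x is stored as a proportion, 100.0 if stored as a percent
    """
    # Convert the change in x into the units x is actually stored in.
    x_change_units = x_change_pp * x_max / 100.0
    # Multiply by 100 to move the outcome from a proportion to percentage points.
    return dydx * x_change_units * 100.0

for dydx in (-0.5, -0.88):
    as_proportion = effect_in_percentage_points(dydx, 1.0, x_max=1.0)
    as_percent = effect_in_percentage_points(dydx, 1.0, x_max=100.0)
    print(f"dydx={dydx}: x in [0,1] -> {as_proportion:.2f} pp; "
          f"x in [0,100] -> {as_percent:.0f} pp")
```

Under the [0, 1] assumption, a one percentage point increase in X goes with roughly a 0.88 percentage point lower probability. The 88 percentage point reading would only apply if X were stored on a 0 to 100 scale, so that one unit is one percentage point.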