In R, the effects package makes it easy to interpret such coefficients by producing the appropriate graphs. From CRAN:
effects: Effect Displays for Linear, Generalized Linear, and Other Models
Graphical and tabular effect displays, e.g., of interactions, for various statistical models with linear predictors.
The package is accompanied by two papers by John Fox describing its use (with examples) and the underlying implementation, and it takes care of the subtle details and implementation difficulties for the supported models.
As an example:
require(car)      # for Anova() and the Cowles data
require(effects)
data(Cowles)
cowles.mod <- glm(volunteer ~ sex + neuroticism*extraversion,
                  data = Cowles, family = binomial)
Anova(cowles.mod)
plot(effect("neuroticism*extraversion", cowles.mod))
plot(effect("sex", cowles.mod))  # for a dummy; works similarly for factors
This will yield:
> Anova(cowles.mod)
Analysis of Deviance Table (Type II tests)

Response: volunteer
                         LR Chisq Df Pr(>Chisq)
sex                        4.9184  1   0.026572 *
neuroticism                0.3139  1   0.575316
extraversion              22.1372  1  2.538e-06 ***
neuroticism:extraversion   8.6213  1   0.003323 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The interaction plot displays the predicted values of Y on the vertical axis and the effect of X ("neuroticism") on Y at various levels of Z ("extraversion"). The effect plot for sex shows the same thing for a dummy, and it works similarly for factors with more levels.
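If you want the numbers behind these plots rather than the pictures, the effect object can be tabulated directly; a minimal sketch, reusing cowles.mod from above:

eff.sex <- effect("sex", cowles.mod)
summary(eff.sex)        # fitted probabilities with pointwise confidence intervals
as.data.frame(eff.sex)  # the same values as a plain data frame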
Lastly, there seems to be some confusion surrounding partial effects (holding other predictors constant) vs. marginal effects (ignoring other predictors). As I understand it, the effects package is concerned with displaying partial effects.
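To make the distinction concrete, here is a rough sketch using predict() on the model above. The grid values, and the choice to hold sex at "female" and extraversion at its mean, are illustrative assumptions, not exactly what effect() does internally:

# "partial"-style: vary neuroticism, holding the other predictors fixed
grid <- data.frame(neuroticism  = 0:24,
                   extraversion = mean(Cowles$extraversion),
                   sex          = factor("female", levels = levels(Cowles$sex)))
partial <- predict(cowles.mod, newdata = grid, type = "response")

# "marginal"-style: for each neuroticism value, average the predictions
# over the observed values of all the other predictors
marginal <- sapply(0:24, function(x) {
  d <- Cowles
  d$neuroticism <- x
  mean(predict(cowles.mod, newdata = d, type = "response"))
})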
The question is "minimum number of observations to do what?". If the objective is to find the minimum number of observations needed to detect a significant effect of a dummy (when the effect truly exists), then you need to know what the effect size might be and then perform a standard power analysis. If you just want to know how many observations you need to run the model at all, then it will very likely run with a small number of observations per variable (say, fewer than 10) as long as the predictors are not too correlated.
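For the power-analysis route, a simulation sketch is often the simplest option for logistic regression; the effect size (0.5 on the logit scale), the baseline, and the sample size below are illustrative assumptions:

set.seed(1)
power.sim <- function(n, beta1, beta0 = 0, nsim = 1000) {
  # proportion of simulated datasets in which the dummy's Wald test
  # rejects at the 5% level
  mean(replicate(nsim, {
    x <- rep(0:1, each = n / 2)
    y <- rbinom(n, 1, plogis(beta0 + beta1 * x))
    summary(glm(y ~ x, family = binomial))$coefficients["x", 4] < 0.05
  }))
}
power.sim(n = 200, beta1 = 0.5)  # estimated power at n = 200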
Best Answer
It is easier to think about interpreting dichotomous predictors using the concept of the odds ratio.
Let me give you an example: imagine you are trying to predict smoking status, where the smoking variable is 1 if you smoke and 0 if you don't (a dichotomous outcome, so we can use logistic regression). Now, as in your case, imagine you have a predictor variable called white, which is 1 if you are white and 0 if you are not. In this example, you can fit a logistic regression model that looks like this:
$$\text{logit}(p)=\beta_0+\beta_1\times \text{white}$$
Now, let's assume that you get an estimate of $\beta_1=-0.5108256$. Converting the estimate to the odds ratio scale is as simple as exponentiating it: $$e^{\beta_1}=e^{-0.5108256}=0.6.$$ This tells us that the odds of being a smoker for someone who is white are 0.6 times the odds for someone who is not white; that is, the odds are 40% lower. (An odds ratio of 0.6 means 40% lower odds, not that whites are "60% less likely" to smoke; odds ratios are not probabilities.)
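In R the conversion is a one-liner; here is a sketch with simulated data (the data are made up purely to show the mechanics, with the true coefficient set near -0.51):

set.seed(1)
white   <- rbinom(500, 1, 0.5)
smoking <- rbinom(500, 1, plogis(-0.2 - 0.51 * white))
fit <- glm(smoking ~ white, family = binomial)
exp(coef(fit)["white"])  # estimated odds ratio, close to exp(-0.51) = 0.6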
So, to answer your direct question: you wouldn't say that "a 1% increase in being white affects the probability of the dependent variable by x amount" (a dummy cannot increase by 1%); rather, you would say that the odds of observing the dependent variable are y times as large when you are white as when you are not.