Solved – StatsModel Logistic Regression

logit, python, statsmodels

I am running a fairly simple logistic regression model with a binary outcome y (1 = positive savings, 0 = otherwise) and a binary predictor X (1 = treated group, 0 = otherwise).
I got a coefficient of -.64 for Treated and an OR of .52.
My question is: how do I interpret the coefficient?

Is the base level 1 for y and 0 for X?

My thinking is that, relative to X = 0, the treated group is 47% less likely to show positive savings. Is that right?

Is 0 always the base level for a binary or categorical variable?

Can I get statsmodels to give the odds ratios for 0 vs. 2 or 0 vs. 3 as well?

Best Answer

In your model:

$$ y \sim \mathrm{Binomial}(n, p) $$

$$ \mathrm{logit}(p) = \beta_0 + \beta_1 x $$

you get:

$$ \log \frac{p}{1-p} = \beta_0 + \beta_1 x $$

$$ \log O_{y|x} = \beta_0 + \beta_1 x $$

and solving for $\beta_1$ gives you:

$$ \beta_1 = (\beta_0 + \beta_1) - \beta_0 $$

$$ \beta_1 = \log O_{y|x=1} - \log O_{y|x=0} $$

$$ \beta_1 = \log \frac{O_{y|x=1}}{O_{y|x=0}} $$

and finally:

$$ \exp(\beta_1) = \frac{O_{treatment}}{O_{control}} $$

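Since your OR is in fact $\exp(-.64) = 0.53$, you can convert this to a percentage change in the odds via $(\exp(\beta_1)-1) \times 100 = -47$% and conclude that:

The odds of getting positive savings are 47% lower at level "treatment" than at level "control".

Note that this is a statement about odds, not probabilities: the corresponding change in probability depends on the baseline probability in the control group.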
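To make the arithmetic concrete, here is a minimal sketch in plain Python. The coefficient -0.64 is taken from the question; the 60% baseline probability for the control group is a made-up assumption, used only to illustrate that a 47% drop in odds does not translate into a 47% drop in probability.

```python
import math

beta_1 = -0.64  # coefficient on Treated, as reported in the question

# exp(beta_1) is the odds ratio: odds(treatment) / odds(control).
odds_ratio = math.exp(beta_1)
pct_change_in_odds = (odds_ratio - 1) * 100

# Hypothetical baseline, NOT from the question: assume 60% of the
# control group shows positive savings.
p_control = 0.60
odds_control = p_control / (1 - p_control)

# Apply the odds ratio, then convert the treatment odds back to a probability.
odds_treatment = odds_control * odds_ratio
p_treatment = odds_treatment / (1 + odds_treatment)
pct_change_in_prob = (p_treatment / p_control - 1) * 100

print(f"odds ratio: {odds_ratio:.2f}")
print(f"change in odds: {pct_change_in_odds:.0f}%")
print(f"p(control) = {p_control:.2f}, p(treatment) = {p_treatment:.2f}")
print(f"relative change in probability: {pct_change_in_prob:.0f}%")
```

Under this assumed baseline, the relative drop in probability comes out noticeably smaller than the drop in odds, and a different baseline would give a different figure. The odds ratio itself does not depend on the baseline.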

If the independent variable $x$ were continuous, you would say:

The odds of getting positive savings decrease by 47% for every one-unit increase in $x$.
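On the base-level and multi-level parts of the question, the following is a rough sketch rather than a definitive recipe. It assumes the statsmodels formula interface with patsy's `C(...)` and `Treatment(reference=...)` coding; the data are synthetic counts invented for illustration, and the column names `group` and `y` are hypothetical. With treatment coding, each exponentiated coefficient is the odds ratio of that level against the reference level, so a three-level predictor yields the 1 vs. 0 and 2 vs. 0 odds ratios in one fit.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (made up): number of positive outcomes out of 100 per group.
positives_per_group = {0: 60, 1: 44, 2: 30}
rows = []
for group, n_pos in positives_per_group.items():
    rows += [{"group": group, "y": 1}] * n_pos
    rows += [{"group": group, "y": 0}] * (100 - n_pos)
df = pd.DataFrame(rows)

# Treat `group` as categorical and state the reference level explicitly.
# Level 0 would also be the default here, since it sorts first.
model = smf.logit("y ~ C(group, Treatment(reference=0))", data=df)
res = model.fit(disp=0)

# exp(Intercept) is the odds in the reference group; the remaining entries
# are odds ratios of groups 1 and 2 relative to group 0.
odds_ratios = np.exp(res.params)
print(odds_ratios)

# Exponentiated confidence bounds give intervals on the odds-ratio scale.
print(np.exp(res.conf_int()))
```

Changing `reference=0` to another level (for example `reference=2`) re-expresses the same fit relative to that level, which is one way to obtain comparisons against a base other than 0.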