I'm the creator of `pylogit`. Thanks for using my package!
To answer your question, the differences in estimation results come from differences in the way choice data are represented in `statsmodels` versus `mlogit`. The TravelMode dataset is natively in long format (i.e., as it was when you wrote it to csv): it has one row per alternative per observation. However, `statsmodels` assumes one's data is in wide format, with one row per observation. Thus there are 4 times as many "observations" in the `statsmodels` model (840) as there are in your `mlogit` model. Try calling `length(unique(TravelMode$individual))`.
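To see the distinction concretely, here is a minimal pandas sketch with toy data (the column names `individual`, `mode`, and `choice` mirror TravelMode's layout, but the values are made up):

```python
import pandas as pd

# Toy long-format choice data: one row per alternative per observation,
# mirroring TravelMode's structure (210 individuals x 4 modes = 840 rows).
long_df = pd.DataFrame({
    "individual": [1, 1, 2, 2],          # just 2 individuals for brevity
    "mode":       ["car", "bus"] * 2,    # 2 alternatives each
    "choice":     [1, 0, 0, 1],          # exactly one chosen row per individual
    "cost":       [10.0, 4.0, 12.0, 3.5],
})

# Long format: number of rows = individuals * alternatives.
print(len(long_df))                      # 4

# The actual number of observations -- the analogue of
# length(unique(TravelMode$individual)) in R:
print(long_df["individual"].nunique())   # 2

# Wide format, which is what statsmodels treats each row as: one row per
# observation, with alternative-varying covariates spread across columns.
wide_df = long_df.pivot(index="individual", columns="mode", values="cost")
print(len(wide_df))                      # 2
```

Fitting a wide-format estimator on `long_df` silently quadruples the sample, which is exactly the 840-vs-210 discrepancy above.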
Moreover, the variables you created with `pd.get_dummies(TM['mode'])` in your `statsmodels` model do not represent alternative-specific constants; they are simply alternative identifiers for each row of the long-format data. This is in contrast to the intercepts that were estimated in the `mlogit` model.
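As an illustration, here is a small pandas sketch (alternative names and data are hypothetical). In a long-format estimator, alternative-specific constants amount to J - 1 alternative dummies with one base alternative dropped for identification, whereas `pd.get_dummies` alone just produces a full set of row labels:

```python
import pandas as pd

# Toy long-format data: 2 observations x 3 alternatives.
df = pd.DataFrame({
    "obs_id": [1, 1, 1, 2, 2, 2],
    "mode":   ["air", "train", "bus"] * 2,
})

# What get_dummies gives you: one column per alternative that merely flags
# which alternative each long-format row belongs to -- including what
# should be the base alternative.
row_flags = pd.get_dummies(df["mode"])
print(list(row_flags.columns))   # ['air', 'bus', 'train']

# Alternative-specific constants: J - 1 intercepts, with one alternative
# (here 'air', chosen arbitrarily) normalized to zero as the base.
ascs = pd.get_dummies(df["mode"]).drop(columns="air")
print(list(ascs.columns))        # ['bus', 'train']
```

With all J dummies and no base normalization, the constants are not identified the way `mlogit`'s intercepts are, which is another reason the two sets of estimates diverge.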
Stata has the `margins` command, which makes it easy as pie to get elasticities for continuous variables (% change in the probability of each outcome for a % change in x) and semi-elasticities for dummy variables (% change in the probability of each outcome when x goes from 0 to 1). `mfx` has been superseded by `margins`.
Having said that, I find these elasticities hard to interpret and prefer to look at the change in probability for a % change in x or for a unit change in x. I will do that below as well, and try to connect the dots between the two approaches.
Here's an example of Stata code, with interpretation, for what you want:
. webuse sysdsn1
(Health insurance data)
. mlogit insure age i.(male nonwhite site), nolog
Multinomial logistic regression Number of obs = 615
LR chi2(10) = 42.99
Prob > chi2 = 0.0000
Log likelihood = -534.36165 Pseudo R2 = 0.0387
------------------------------------------------------------------------------
insure | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Indemnity | (base outcome)
-------------+----------------------------------------------------------------
Prepaid |
age | -.011745 .0061946 -1.90 0.058 -.0238862 .0003962
1.male | .5616934 .2027465 2.77 0.006 .1643175 .9590693
1.nonwhite | .9747768 .2363213 4.12 0.000 .5115955 1.437958
|
site |
2 | .1130359 .2101903 0.54 0.591 -.2989296 .5250013
3 | -.5879879 .2279351 -2.58 0.010 -1.034733 -.1412433
|
_cons | .2697127 .3284422 0.82 0.412 -.3740222 .9134476
-------------+----------------------------------------------------------------
Uninsure |
age | -.0077961 .0114418 -0.68 0.496 -.0302217 .0146294
1.male | .4518496 .3674867 1.23 0.219 -.268411 1.17211
1.nonwhite | .2170589 .4256361 0.51 0.610 -.6171725 1.05129
|
site |
2 | -1.211563 .4705127 -2.57 0.010 -2.133751 -.2893747
3 | -.2078123 .3662926 -0.57 0.570 -.9257327 .510108
|
_cons | -1.286943 .5923219 -2.17 0.030 -2.447872 -.1260134
------------------------------------------------------------------------------
. margins, eyex(age) predict(outcome(1))
Average marginal effects Number of obs = 615
Model VCE : OIM
Expression : Pr(insure==Indemnity), predict(outcome(1))
ey/ex w.r.t. : age
------------------------------------------------------------------------------
| Delta-method
| ey/ex Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .2560951 .1330162 1.93 0.054 -.0046119 .5168021
------------------------------------------------------------------------------
. margins, eyex(age)
Average marginal effects Number of obs = 615
Model VCE : OIM
ey/ex w.r.t. : age
1._predict : Pr(insure==Indemnity), predict(pr outcome(1))
2._predict : Pr(insure==Prepaid), predict(pr outcome(2))
3._predict : Pr(insure==Uninsure), predict(pr outcome(3))
------------------------------------------------------------------------------
| Delta-method
| ey/ex Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age |
_predict |
1 | .2560951 .1330162 1.93 0.054 -.0046119 .5168021
2 | -.2661848 .152332 -1.75 0.081 -.5647501 .0323805
3 | -.090586 .4576158 -0.20 0.843 -.9874964 .8063244
------------------------------------------------------------------------------
. margins, eydx(male)
Average marginal effects Number of obs = 615
Model VCE : OIM
ey/dx w.r.t. : 1.male
1._predict : Pr(insure==Indemnity), predict(pr outcome(1))
2._predict : Pr(insure==Prepaid), predict(pr outcome(2))
3._predict : Pr(insure==Uninsure), predict(pr outcome(3))
------------------------------------------------------------------------------
| Delta-method
| ey/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.male |
_predict |
1 | -.3040742 .115607 -2.63 0.009 -.5306598 -.0774886
2 | .2576192 .0958708 2.69 0.007 .0697159 .4455226
3 | .1477754 .3243217 0.46 0.649 -.4878834 .7834342
------------------------------------------------------------------------------
Note: ey/dx for factor levels is the discrete change from the base level.
For interpretation, a 1% increase in age gives you a .25% increase in the probability of choosing indemnity insurance (inelastic). Being male lowers that probability by almost a third.
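For readers working in Python, here is a rough numpy sketch of what `margins, eyex` computes for a case-specific multinomial logit: each observation's elasticity (x / P) * dP/dx, averaged over the sample. The coefficients below are borrowed loosely from the output above, but the simulated ages are made up, so this is a sketch of the mechanics, not a reproduction of the Stata numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Case-specific MNL with 3 outcomes; outcome 1 (Indemnity) is the base,
# so its coefficients are normalized to zero.
beta = np.array([0.0, -0.0117, -0.0078])   # age coefficients (approximate)
intercept = np.array([0.0, 0.27, -1.29])   # _cons (approximate)
x = rng.uniform(20, 80, size=500)          # hypothetical ages

# Predicted probabilities for each outcome.
util = intercept + np.outer(x, beta)                      # (500, 3)
p = np.exp(util) / np.exp(util).sum(axis=1, keepdims=True)

# Marginal effect of x on P_j: dP_j/dx = P_j * (beta_j - sum_k P_k beta_k).
beta_bar = p @ beta                                       # (500,)
dpdx = p * (beta[None, :] - beta_bar[:, None])            # rows sum to 0

# ey/ex, as margins reports it: average over observations of (x / P_j) * dP_j/dx.
eyex = (dpdx * (x[:, None] / p)).mean(axis=0)
print(eyex.round(3))   # one elasticity per outcome
```

Because the base outcome's coefficient (zero) exceeds the others, its elasticity comes out positive, matching the sign pattern in the `margins, eyex` table.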
Personally, I find this much easier to interpret:
. margins, dyex(age) predict(outcome(1))
Average marginal effects Number of obs = 615
Model VCE : OIM
Expression : Pr(insure==Indemnity), predict(outcome(1))
dy/ex w.r.t. : age
------------------------------------------------------------------------------
| Delta-method
| dy/ex Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .118398 .0622018 1.90 0.057 -.0035153 .2403113
------------------------------------------------------------------------------
. margins, dydx(age) predict(outcome(1))
Average marginal effects Number of obs = 615
Model VCE : OIM
Expression : Pr(insure==Indemnity), predict(outcome(1))
dy/dx w.r.t. : age
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .0026655 .001399 1.91 0.057 -.0000765 .0054074
------------------------------------------------------------------------------
This says that a 1% increase in age boosts the probability of indemnity insurance by 11.8 percentage points, and being a year older would give you roughly a quarter of a percentage point boost (.0027).
Let's make sure the former agrees with what we did before. The baseline probability for indemnity is:
. margins, predict(outcome(1))
Predictive margins Number of obs = 615
Model VCE : OIM
Expression : Pr(insure==Indemnity), predict(outcome(1))
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_cons | .4764228 .0197054 24.18 0.000 .437801 .5150446
------------------------------------------------------------------------------
An 11.8 percentage point increase expressed as percentage change over the baseline is .118/.476=.24789916, which is pretty close to 0.25.
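That check can be written out directly. The relations are dy/ex = x * dy/dx and ey/ex = (x / y) * dy/dx, so dividing dy/ex by the baseline probability approximates ey/ex:

```python
# Back-of-the-envelope check: dy/ex and ey/ex from margins differ roughly
# by the baseline probability, since ey/ex = dy/ex / y.
dyex = 0.118398       # margins, dyex(age) predict(outcome(1))
baseline = 0.4764228  # margins, predict(outcome(1))

approx_eyex = dyex / baseline
print(round(approx_eyex, 4))   # 0.2485, close to the 0.2561 from margins, eyex
```

The match is not exact because `margins` averages the unit-level ratios over observations rather than taking the ratio of the averages.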
Best Answer
You might consider using a dominance analysis instead (see Luchman, 2014). This approach decomposes McFadden's R2 and might, as a consequence, be easier to interpret than the utility-based approach discussed here.
That would avoid the 4th-categorical-value issue and would scale to 100 if standardized based on the McFadden R2.
The dominance method is implemented in the R package relaimpo, but it has not been extended to accommodate multinomial logit models. The only version I know of that does is in Stata. That said, dominance analysis is not a terribly difficult method to implement.
To do so, collect the McFadden's R2 from every possible subset of predictors (in your case, 2^5 - 1 different models) and average each predictor's marginal contributions as in the linked article (note: the article gives some examples to follow).
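As a sketch of that averaging step, here is a small Python implementation of general dominance statistics. It takes the McFadden R2 for every subset as given (in practice you would obtain those by fitting each of the candidate models); the function name and the toy R2 values are my own for illustration, not from Luchman (2014):

```python
from itertools import combinations

def general_dominance(predictors, r2):
    """General dominance statistics from a table of fit statistics.

    r2 maps each frozenset of predictors (including the empty set, whose
    McFadden R^2 is 0) to its fit. The statistic for predictor p averages
    p's marginal contribution r2(S + p) - r2(S) within each subset size,
    then averages those means across sizes.
    """
    k = len(predictors)
    dom = {}
    for p in predictors:
        others = [q for q in predictors if q != p]
        size_means = []
        for size in range(k):                 # subsets S not containing p
            contribs = [
                r2[frozenset(S) | {p}] - r2[frozenset(S)]
                for S in combinations(others, size)
            ]
            size_means.append(sum(contribs) / len(contribs))
        dom[p] = sum(size_means) / k          # average across sizes
    return dom

# Tiny hypothetical example with 2 predictors (2^2 subsets, counting the null model).
r2 = {
    frozenset(): 0.00,
    frozenset({"a"}): 0.20,
    frozenset({"b"}): 0.10,
    frozenset({"a", "b"}): 0.25,
}
dom = general_dominance(["a", "b"], r2)
print({p: round(v, 3) for p, v in dom.items()})  # {'a': 0.175, 'b': 0.075}
```

A useful sanity check: the general dominance statistics sum to the full-model R2, which is what lets you rescale them to 100.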