It is my understanding that in a mixed logit model there can be two types of variables, alternative specific and individual specific. For example, in a dataset for choices of fishing modes like this (long format):
id  altern   price    catch   income  choice
1   beach    157.93   0.0678  7083    0
1   boat     157.93   0.2601  7083    0
1   charter  182.93   0.5391  7083    1
1   pier     157.93   0.0503  7083    0
2   beach    15.114   0.1049  1250    0
2   boat     10.534   0.1574  1250    0
2   charter  34.532   0.4671  1250    1
2   pier     15.112   0.0451  1250    0
https://cran.r-project.org/web/packages/mlogit/mlogit.pdf
Here the variables price and catch are alternative specific (they vary across alternatives) and the variable income is individual specific (it does not vary across alternatives). Most of the examples of mixed logit that I have seen use random parameters only for alternative specific variables (R example, Stata example).
Is it possible, and does it make sense, to use random parameters for individual specific variables? For example, could I put a random parameter on a variable such as "income" in the example above?
Based on the very helpful comment of @RobertLong below, it is clear that since there is no variation in income at the subject level, the random effects will not be identified and there will be either convergence problems or singular fit.
However, for income we will have 3 coefficients: boat:income, charter:income and pier:income (assuming beach as the base alternative).
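To make the three income coefficients concrete, the specification can be sketched as a utility function. The symbols below are my own notation (not taken from the mlogit documentation): i indexes individuals, j indexes alternatives, and beach is the base alternative.

```latex
U_{ij} = \alpha_j + \beta_{p}\,\mathrm{price}_{ij} + \beta_{c}\,\mathrm{catch}_{ij}
         + \gamma_j\,\mathrm{income}_i + \varepsilon_{ij},
\qquad \alpha_{\mathrm{beach}} = \gamma_{\mathrm{beach}} = 0

% Making boat:income a normally distributed random parameter replaces the
% fixed \gamma_{\mathrm{boat}} with an individual-level draw:
\gamma_{\mathrm{boat},\,i} \sim N\!\left(\gamma_{\mathrm{boat}},\ \sigma_{\mathrm{boat}}^{2}\right)
```

Under this reading, sd.boat:income in the output would be the estimate of the standard deviation written here as sigma_boat.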
In the following R code, I show how it is possible to estimate a random parameter for boat:income, for example. However, I am not sure whether this random parameter can be interpreted as the heterogeneous effect of people's income on the choice of boat. In other words, is such a random parameter meaningful at all?
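library(mlogit)
data("Fishing", package = "mlogit")
# Reshape wide to long: columns 2 to 9 hold the alternative-specific price and catch
Fish <- mlogit.data(Fishing, varying = 2:9, shape = "wide", choice = "mode")
# price and catch get generic coefficients; income gets one coefficient per non-base alternative
m <- mlogit(mode ~ price + catch | income, data = Fish,
            rpar = c("boat:income" = "n", price = "n", catch = "n"))  # "n" = normal distribution
summary(m)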
Output:
Call:
mlogit(formula = mode ~ price + catch | income, data = Fish,
rpar = c(price = "n", catch = "n", `boat:income` = "n"))
Frequencies of alternatives:
beach boat charter pier
0.11337 0.35364 0.38240 0.15059
bfgs method
7 iterations, 0h:0m:18s
g'(-H)^-1g = 5.67E+04
last step couldn't find higher value
Coefficients :
                       Estimate  Std. Error   z-value   Pr(>|z|)
boat:(intercept)     1.0546e+01  3.0995e-01   34.0263  < 2.2e-16 ***
charter:(intercept)  3.5411e+00  2.7998e-01   12.6478  < 2.2e-16 ***
pier:(intercept)     1.0369e+00  2.1322e-01    4.8631  1.155e-06 ***
price               -5.3412e-02  9.2995e-04  -57.4354  < 2.2e-16 ***
catch                4.3868e+00  2.4274e-01   18.0721  < 2.2e-16 ***
boat:income          3.1719e-04  2.5933e-05   12.2309  < 2.2e-16 ***
charter:income      -1.6993e-04  6.0929e-05   -2.7891   0.005286 **
pier:income         -1.1019e-04  4.7761e-05   -2.3070   0.021054 *
sd.price             1.8463e-02  1.8222e-03   10.1325  < 2.2e-16 ***
sd.catch            -3.6970e+00  4.8660e-01   -7.5975  3.020e-14 ***
sd.boat:income       1.0028e-01  1.1416e-04  878.4387  < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log-Likelihood: -1506.7
McFadden R^2: -0.0060168
Likelihood ratio test : chisq = -18.023 (p.value = 1)
random coefficients
             Min.      1st Qu.         Median           Mean      3rd Qu.  Max.
price        -Inf  -0.06586500  -0.0534118074  -0.0534118074  -0.04095861   Inf
catch        -Inf   1.89328020   4.3868433213   4.3868433213   6.88040644   Inf
boat:income  -Inf  -0.06732334   0.0003171871   0.0003171871   0.06795772   Inf
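The output above reports "last step couldn't find higher value", a negative McFadden R^2 and a negative likelihood ratio statistic, so the reported log-likelihood (-1506.7) is below that of an intercept-only model. This suggests the optimizer may not have reached a maximum. One possible sanity check, sketched below, is to fit the same specification with no random parameters. The mixed logit nests this model, so a converged mixed logit would not be expected to have a clearly lower log-likelihood. The object name m0 is my own choice.

```r
library(mlogit)
data("Fishing", package = "mlogit")

# Same reshaping as in the question: columns 2 to 9 vary across alternatives
Fish <- mlogit.data(Fishing, varying = 2:9, shape = "wide", choice = "mode")

# Baseline: identical utility specification, all coefficients fixed (plain MNL)
m0 <- mlogit(mode ~ price + catch | income, data = Fish)

# Compare this value with the log-likelihood reported for the mixed logit
ll0 <- as.numeric(logLik(m0))
print(ll0)
```

If the baseline log-likelihood is well above the mixed logit value, the mixed logit estimates (including sd.boat:income) are probably not trustworthy as they stand.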
Best Answer
In an MNL model, individual-specific variables (e.g., an individual's income) can enter only through interactions with something that varies across alternatives: either the alternative-specific constants (as with boat:income in the question) or an alternative-specific variable of interest (e.g., price). Such interaction effects can be used, for example, to test whether men are more price-sensitive than women. It is technically possible to specify the interaction effect as a random effect. For instance, you could assume that men differ among themselves in their price sensitivity, and let the coefficient on the price-by-gender interaction follow a normal distribution whose mean and variance are estimated. However, such a model can be difficult to estimate, depending on the amount of data. If you are interested in exploring the effect of a single personal characteristic (e.g., gender), it may be easier to consider a subgroup analysis (split sample).
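A split-sample analysis along these lines could look like the sketch below. The Fishing data have no gender variable, so as an assumption for illustration the sample is split at the median income instead. The helper name fit_mnl and the cut-off are hypothetical choices of mine, not part of mlogit.

```r
library(mlogit)
data("Fishing", package = "mlogit")

# Assumption: split at the median income; any other cut-off could be used
cutoff <- median(Fishing$income)
low  <- Fishing[Fishing$income <= cutoff, ]
high <- Fishing[Fishing$income >  cutoff, ]

# Hypothetical helper: reshape a wide subset to long format and fit a plain MNL
fit_mnl <- function(d) {
  rownames(d) <- NULL  # reset row names after subsetting
  long <- mlogit.data(d, varying = 2:9, shape = "wide", choice = "mode")
  mlogit(mode ~ price + catch, data = long)
}

m_low  <- fit_mnl(low)
m_high <- fit_mnl(high)

# Put the price coefficients of the two subsamples side by side
price_by_group <- c(low = unname(coef(m_low)["price"]),
                    high = unname(coef(m_high)["price"]))
print(price_by_group)
```

Comparing the two price coefficients (and their standard errors from summary()) gives a rough picture of whether price sensitivity differs between the groups, without having to estimate a random interaction.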