Solved – version of multivariate multinomial logit

categorical data, multinomial-distribution, multivariate analysis

I'm working with a data set with 2-3 response variables and 7 predictor variables. All the variables are categorical. If there were just one response variable, I think a multinomial logit would be the right model, but there are 2 or 3. So my question is: is there a multivariate version of the multinomial logit?

I've looked at several books on categorical data, but haven't seen anything like this (mainly using Agresti 2002).

I have about 2000 observations, though I'll probably need to split them into 2 or 3 subsets to really see what's going on. One thing I was thinking about is converting the data to counts and using a model for count data. I could also combine the 2-3 response variables into a single categorical variable with a lot of categories, but I think that would leave too few observations per category for anything to show up. I could also fit 2-3 separate models, one for each response variable, which is obviously not as good because it ignores any association between the responses.
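To make the "combine the responses" and "convert to counts" options concrete, here is a minimal sketch in plain Python. The records, level names, and helper functions (`combine_responses`, `cell_counts`) are hypothetical and only for illustration; they are not from my actual data. It merges two response variables into one composite category and tabulates the count in every cell, including empty ones, which is roughly the table a count-data (log-linear) model would be fitted to.

```python
from collections import Counter
from itertools import product

# Hypothetical toy records: (response_a, response_b, predictor).
records = [
    ("low", "yes", "urban"),
    ("low", "no", "urban"),
    ("high", "yes", "rural"),
    ("high", "yes", "urban"),
    ("low", "yes", "rural"),
    ("high", "no", "rural"),
    ("low", "yes", "urban"),
    ("high", "yes", "urban"),
]


def combine_responses(rows, n_responses):
    """Merge the first n_responses fields of each row into one composite label."""
    return [("|".join(row[:n_responses]), *row[n_responses:]) for row in rows]


def cell_counts(rows):
    """Count observations in every cell of the full cross-classification.

    Cells that never occur in the data are kept with a count of 0, so
    sparsity is visible.
    """
    counts = Counter(rows)
    # Observed levels of each field, in a stable order.
    levels = [sorted({row[i] for row in rows}) for i in range(len(rows[0]))]
    return {cell: counts.get(cell, 0) for cell in product(*levels)}


combined = combine_responses(records, n_responses=2)
table = cell_counts(combined)

for cell, n in sorted(table.items()):
    print(cell, n)

n_empty = sum(1 for n in table.values() if n == 0)
print(f"{len(table)} cells, {n_empty} empty")
```

The number of cells is the product of the numbers of levels of every variable, so with 2-3 responses and 7 predictors the table grows quickly and many cells end up empty or nearly empty. That is the sparsity concern with the single-composite-response approach.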

I might also be able to get rid of some of the predictors (I think 3 of the 7 have the most explanatory power). I'm not opposed to using machine learning methods; I've already found some interesting stuff with decision trees.

thanks,

-paul

Best Answer

Agresti (2007) discusses these models in chapters 9 and 10. The 2002 edition probably covers them too, as @suncoolsu mentioned.

Agresti refers to the group of response variables as a cluster and discusses the corresponding analysis using marginal models, conditional models, and generalized estimating equations.
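As a rough illustration of what a marginal model looks like in this setting (the notation here is my own and not copied from Agresti): let $Y_{ij}$ be response $j$ for subject $i$, with categories $1, \dots, K_j$, and let $x_i$ be the vector of dummy-coded predictors. A baseline-category logit marginal model specifies, for each response separately,

$$
\log \frac{P(Y_{ij} = k \mid x_i)}{P(Y_{ij} = K_j \mid x_i)} = \alpha_{jk} + x_i^{\top} \beta_{jk}, \qquad k = 1, \dots, K_j - 1, \quad j = 1, \dots, J .
$$

Each response gets its own multinomial logit for its marginal distribution. The association among the $J$ responses within a subject (the cluster) is not modelled through these equations; generalized estimating equations handle it through a working correlation structure and robust standard errors, so the full joint distribution never has to be specified.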
