Multilevel-Model – Analyzing Multilevel Models with Responses Only at Level 2

logistic, mixed model, multilevel-analysis

I have hierarchical data of individuals nested within families. For each individual, I have independent variables such as age, gender, education, and familiarity with the product. For each family unit, I also have covariates such as household income, purchase behavior, and distance to retail centers.

The dependent variable, satisfaction, is recorded only at the family level. More specifically, the satisfaction question is asked of a head-of-household respondent, who ideally represents the household. While satisfaction is measured on a 5-point scale, we typically re-express it as dichotomous (top 2 box).

I would like to take into consideration the individual-level effects as well as the family-level effects in modeling product satisfaction propensity. Is it appropriate to explore multilevel modeling when the outcome is only measured at the second level? If not, is there a different approach I should be following?

Best Answer

A nice paper on this is the one by Croon & van Veldhoven.

Basically, the approach they outline involves computing adjusted group means of the predictor variables and then regressing the outcome on those adjusted group means. The adjusted group mean for each group is the best linear unbiased predictor (BLUP) of the predictor variable for that group. You can compute these using the equations given in the paper or, if you're using R, with the lme4 package and its coef() function.
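To make the shrinkage idea concrete, here is a minimal Python sketch for a single predictor. It is not the paper's full procedure and not the lme4 route: it assumes a one-way random-intercept model for the predictor and estimates the variance components with one-way ANOVA (method of moments), so the numbers would differ somewhat from lme4's REML-based estimates. The function name `shrunken_group_means` and the data are made up for illustration.

```python
from collections import defaultdict

def shrunken_group_means(values, groups):
    """Return {group: adjusted mean} for one individual-level predictor.

    Assumes a one-way random-intercept model; needs at least two groups
    and at least one group with more than one member.
    """
    by_group = defaultdict(list)
    for x, g in zip(values, groups):
        by_group[g].append(x)

    n_total = len(values)
    n_groups = len(by_group)
    grand_mean = sum(values) / n_total
    means = {g: sum(xs) / len(xs) for g, xs in by_group.items()}

    # Method-of-moments (one-way ANOVA) variance components.
    ss_within = sum((x - means[g]) ** 2 for g, xs in by_group.items() for x in xs)
    ss_between = sum(len(xs) * (means[g] - grand_mean) ** 2 for g, xs in by_group.items())
    sigma2 = ss_within / (n_total - n_groups)          # within-group variance
    n0 = (n_total - sum(len(xs) ** 2 for xs in by_group.values()) / n_total) / (n_groups - 1)
    tau2 = max(0.0, (ss_between / (n_groups - 1) - sigma2) / n0)  # between-group variance

    adjusted = {}
    for g, xs in by_group.items():
        denom = tau2 + sigma2 / len(xs)
        weight = tau2 / denom if denom > 0 else 0.0    # reliability of this group's mean
        # Shrink the raw group mean toward the grand mean.
        adjusted[g] = grand_mean + weight * (means[g] - grand_mean)
    return adjusted

# Hypothetical data: ages of individuals in four families.
ages = [30, 34, 8, 52, 50, 61, 65, 19, 23, 21]
family = ["a", "a", "a", "b", "b", "c", "c", "d", "d", "d"]
adjusted = shrunken_group_means(ages, family)
```

Smaller families and noisier predictors get pulled harder toward the grand mean; the family-level outcome would then be regressed on these adjusted means.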

Edit 2020-09-19:

Since writing this answer in 2015, I've become convinced that the Croon & van Veldhoven (CvV) procedure that I mentioned above is not actually the best way to address this issue. In fact, the intuitive approach of simply aggregating the predictors up to the group level and then doing OLS of the group outcomes on these (unadjusted) predictor group means seems to work just as well, if not better. These two methods are compared in this simulation paper:

Predicting group-level outcome variables: An empirical comparison of analysis strategies
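The unadjusted approach is simple enough to sketch in a few lines of Python. This is only an illustration with made-up data and hypothetical helper names (`group_means`, `simple_ols`), using a single predictor. Note that with a 0/1 top-2-box outcome, OLS amounts to a linear probability model; a logistic regression on the same group means would be the analogous step.

```python
from collections import defaultdict

def group_means(values, groups):
    """Aggregate an individual-level predictor to raw (unadjusted) group means."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, g in zip(values, groups):
        sums[g] += x
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}

def simple_ols(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: individual ages, and one satisfaction value per family.
ages = [30, 34, 8, 52, 50, 61, 65, 19, 23, 21]
family = ["a", "a", "a", "b", "b", "c", "c", "d", "d", "d"]
satisfied = {"a": 1, "b": 0, "c": 1, "d": 0}   # top-2-box indicator per family

mean_age = group_means(ages, family)
families = sorted(satisfied)                    # fixed order for both vectors
intercept, slope = simple_ols(
    [mean_age[f] for f in families],
    [satisfied[f] for f in families],
)
```

Family-level covariates such as household income would enter the same regression directly, since they are already measured at the group level.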

Summary of the paper: while the CvV method does indeed eliminate bias in the OLS parameter estimates from models with group-level outcomes, this comes at the cost of an enormous amount of variance in the parameter estimates, so that the CvV method actually does worse in terms of mean squared error. Furthermore, the simple unadjusted group means procedure is essentially just as effective as the CvV method in controlling Type I and Type II error rates.
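The trade-off here follows from the standard decomposition of mean squared error for an estimator $\hat\beta$ of a parameter $\beta$ (the notation is mine, not the paper's):

$$
\operatorname{MSE}(\hat\beta) = E\big[(\hat\beta - \beta)^2\big] = \big(E[\hat\beta] - \beta\big)^2 + \operatorname{Var}(\hat\beta)
$$

An estimator with zero bias can therefore still have a larger MSE than a biased one if its variance is sufficiently large.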
