The difference between a hierarchical linear regression and an ordinary least squares (OLS) regression

least squares, mixed model, multilevel-analysis, regression

I am conducting a study in which I have a few independent variables (all of them dummies), two moderators (one a dummy, the other continuous), and a continuous dependent variable.

I was told to use ordinary least squares (OLS) regression, but what is the difference between OLS regression and a hierarchical linear regression analysis?

Best Answer

Building hierarchical models is all about comparing groups. The power of the model is that you can treat the information about a particular group as evidence of how that group compares to the aggregate behavior at a particular level, so if you don't have much information about a single group, that group's estimate gets pulled toward the mean for the level. Here's an example:

Let's say we wanted to build a linear model describing student literacy (perhaps as a function of grade level and socioeconomic status) for a region. What's the best way to go about this? One naive way would be to treat all the students in the region as one big group and fit a single OLS model for literacy rates at each grade level. There's nothing exactly wrong with this, but let's say that for a particular student, we know that they attend an especially good school out in the burbs. Is it really fair to apply the region-wide average literacy for their grade to this student? Of course not: their literacy will probably be higher than average, given what we know about their school. So as an alternative, we could fit a separate model for each school. This is great for big schools, but what about those small private schools? If we only have 15 kids in a class, we're probably not going to get a very accurate model.
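To make the two extremes concrete, here is a minimal sketch that ignores grade level and socioeconomic status and only estimates mean literacy scores. The school names, true means, sample sizes, and the within-school spread are all made-up assumptions for illustration, not data from any real study.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulated scores are reproducible

# Hypothetical schools: (name, assumed true mean score, number of students)
schools = [
    ("large_public", 60.0, 400),
    ("suburban", 75.0, 200),
    ("small_private", 70.0, 15),
]
STUDENT_SD = 10.0  # assumed spread of scores within any one school

# Simulate one literacy score per student
scores = {
    name: [random.gauss(mu, STUDENT_SD) for _ in range(n)]
    for name, mu, n in schools
}

# Option 1 (one big group): a single region-wide mean for everyone
all_scores = [s for school_scores in scores.values() for s in school_scores]
pooled_mean = statistics.fmean(all_scores)

# Option 2 (separate model per school): each school gets its own mean,
# with a standard error that grows as the school gets smaller
school_means = {name: statistics.fmean(v) for name, v in scores.items()}
school_se = {
    name: statistics.stdev(v) / len(v) ** 0.5 for name, v in scores.items()
}

print(f"region-wide mean: {pooled_mean:.1f}")
for name in scores:
    print(f"{name}: mean {school_means[name]:.1f} +/- {school_se[name]:.1f}")
```

The region-wide mean hides the difference between schools, while the per-school estimate for the 15-student school comes with a much larger standard error than the estimates for the big schools.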

Hierarchical models allow us to do both simultaneously. At one level, we estimate the literacy rate for the entire region. At another level, we estimate the school-specific literacy rates. The less information we have about a particular school, the more closely its estimate will approximate the across-school mean. This also allows us to scale up the model to consider other school districts, and maybe even go a level higher to compare literacy between states or even consider differences between countries. Anything going on all the way up at the country level won't have a huge impact all the way down at the county level because there are so many levels in between, but information is information and we should allow it the opportunity to influence our results, especially where we have very little data.
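The pull toward the across-school mean can be sketched with the standard precision-weighted formula for a random-intercept model. This is only an illustration: it assumes the within-school and between-school variances are known, whereas mixed-model software estimates them from the data. The function name `partial_pool` and all of the numbers are hypothetical.

```python
def partial_pool(school_mean, n, grand_mean, within_var, between_var):
    """Blend a school's own mean with the across-school mean.

    The weight on the school's own data is n / (n + within_var / between_var),
    so it approaches 1 for large schools and 0 for schools with little data.
    """
    weight = n / (n + within_var / between_var)
    estimate = weight * school_mean + (1 - weight) * grand_mean
    return estimate, weight


GRAND_MEAN = 65.0    # assumed across-school mean literacy score
WITHIN_VAR = 100.0   # assumed variance of students within a school
BETWEEN_VAR = 25.0   # assumed variance of true school means

# Two hypothetical schools with the same observed mean but different sizes
big_est, big_w = partial_pool(75.0, 400, GRAND_MEAN, WITHIN_VAR, BETWEEN_VAR)
small_est, small_w = partial_pool(75.0, 15, GRAND_MEAN, WITHIN_VAR, BETWEEN_VAR)

print(f"400 students: weight {big_w:.2f}, estimate {big_est:.1f}")
print(f" 15 students: weight {small_w:.2f}, estimate {small_est:.1f}")
```

Under these assumed variances, the 400-student school keeps an estimate of roughly 74.9, almost exactly its own mean, while the 15-student school is pulled down to roughly 72.9, noticeably closer to the across-school mean of 65.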

So if we have very little data on a particular school, but we know how schools in that country, state, and county generally behave, we can make some informed inferences about that school and treat new information as evidence that updates the beliefs we formed from the larger groups (the higher levels in the hierarchy).