Solved – How to account for multiple measurements of same person in either two-group comparison or regression

Tags: assumptions, clustering, mixed model, multilevel-analysis, regression

I am analysing clinical data collected from patients in which the observations are correlated, either over time (longitudinally) or, more commonly, because several measurements are taken on the same person at the same time (e.g. measuring variables in each eye).

My question is about accounting for this correlation when I run statistical analyses, since it violates the independence of my observations.

I have come across mixed effect models which I have now read a great deal about and I think that's what I need. So I will include both eyes in any regression, but add the person as the random effect.

However, reading more about this topic, there also seem to be "cluster-correlated robust estimates of variance" (popular in Stata, I believe), and multiple threads about "clustered standard errors" or "hierarchical modelling", which frankly are a bit out of my depth.

So my questions are:

  1. When analyzing non-independent observations (e.g. two eyes of the same person) in regression, is a mixed-effects model the way to go? I have seen literature using clustered variance estimation. How is that different? (I'm using R, for reference.)

  2. Mixed-effects models are all regression based. How would I go about doing the equivalent of a t-test or Mann-Whitney U test while accounting for the non-independence issue?

Best Answer

  1. When analyzing non-independent observations (e.g. two eyes of same person) in regression, is mixed effect model the way to go?

In short: Yes.

Mixed models are capable of modelling the dependence or structure introduced in the data by the study design. In your example of measuring both eyes, you can use a mixed model with a random effect for individual, since individuals have two eyes and thus cause the dependence by being in the data twice.

However, you still cannot consider pseudoreplications to be true replicates in a mixed model. In many cases you can make more effective use of them in a mixed model, but the number of true replicates hasn't magically increased by changing the type of model.

That being said, the repeated measures you are describing are very common in medical research and can be modelled just fine with a mixed model.
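
The point about pseudoreplication can be illustrated with a small base-R sketch. It assumes simulated data and a grouping variable that is constant within a person (e.g. patient vs. control); the names `dat`, `id`, `group` and `y` are made up for the illustration and do not come from the question. Treating every eye as independent inflates the degrees of freedom, while averaging the two eyes per person gives an analysis based on the true number of replicates:

```r
set.seed(1)
n_person <- 40

# Hypothetical data: one row per eye, two eyes per person,
# with the first 20 persons in "control" and the last 20 in "treated"
dat <- data.frame(
  id    = factor(rep(seq_len(n_person), each = 2)),
  eye   = rep(c("left", "right"), times = n_person),
  group = factor(rep(c("control", "treated"), each = n_person))
)

# A shared person-level effect makes the two eyes of a person correlated
person_effect <- rnorm(n_person, sd = 2)
dat$y <- 10 + 1 * (dat$group == "treated") +
  person_effect[as.integer(dat$id)] + rnorm(nrow(dat), sd = 1)

# Naive analysis: treats all 80 eyes as independent observations
naive <- t.test(y ~ group, data = dat)

# Person-level analysis: average both eyes first, then compare 40 persons
per_person <- aggregate(y ~ id + group, data = dat, FUN = mean)
honest <- t.test(y ~ group, data = per_person)

naive$parameter   # degrees of freedom based on eyes
honest$parameter  # degrees of freedom based on persons
```

With a balanced design like this one, the estimated group means are the same in both analyses; only the uncertainty changes. This shortcut only applies when the comparison is between persons; when the variable of interest varies within a person, a mixed model is the more general tool.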


  2. Mixed effect models are all regression based. How would I go about doing the equivalent of a t-test or Mann-Whitney U test while accounting for the non-independence issue?

You can easily perform the equivalent of a $t$-test using a (mixed) regression model:

library(lme4)
# y: outcome, x: two-level grouping factor, rand: person identifier
fit <- lmer(y ~ x + (1 | rand))
summary(fit)

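Here x is a two-level factor. The first level of x is absorbed into the intercept, so the coefficient for x estimates the difference between the two groups, and a significant coefficient means there is a significant difference between them. Note that lme4 itself reports only t values for lmer fits; the lmerTest package is one common way to obtain p-values.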
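
As a minimal sketch of how this looks end to end, the following simulates two measurements per person and fits the random-intercept model. The data are made up, the effect sizes are arbitrary, and the 1.96 multiplier gives only an approximate (Wald) interval:

```r
library(lme4)

set.seed(42)
n_person <- 30

# Hypothetical data: two rows (eyes) per person, x constant within person
dat <- data.frame(
  rand = factor(rep(seq_len(n_person), each = 2)),
  x    = factor(rep(c("a", "b"), each = n_person))
)

# Person-level random intercept plus independent residual noise
u <- rnorm(n_person, sd = 1.5)
dat$y <- 5 + 2 * (dat$x == "b") + u[as.integer(dat$rand)] +
  rnorm(nrow(dat), sd = 1)

# Random intercept per person accounts for the within-person correlation
fit <- lmer(y ~ x + (1 | rand), data = dat)

# "(Intercept)" is the mean of group "a"; "xb" is the b - a difference
tab <- coef(summary(fit))
tab

# Approximate 95% Wald interval for the group difference
est <- tab["xb", "Estimate"]
se  <- tab["xb", "Std. Error"]
c(lower = est - 1.96 * se, upper = est + 1.96 * se)
```

The row labelled `xb` plays the role of the $t$-test: its estimate is the difference between the two groups and its standard error reflects the clustering by person.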

As for the Mann-Whitney U test, I'm not sure you could do a rank-based test with a mixed model. However, you probably don't need to, since you can use either a generalized linear mixed model (e.g. glmer(..., family = 'poisson')) or a non-linear mixed model (see the nlme package).

Although the nlme package is great, I would recommend not jumping to non-linear models too quickly, because a GLMM is often easier to interpret, and in clinical research there is in many cases a logical choice for the theoretical distribution of the data-generating process.
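
For example, if the outcome were a count per eye (a hypothetical choice here; the question does not say what is measured), a Poisson GLMM with a random intercept per person could be sketched as follows. The data are simulated and the variable names are illustrative only:

```r
library(lme4)

set.seed(7)
n_person <- 50

# Hypothetical data: a count outcome measured in both eyes of each person
dat <- data.frame(
  id = factor(rep(seq_len(n_person), each = 2)),
  x  = factor(rep(c("a", "b"), each = n_person))
)

# Person-level random intercept on the log scale
u <- rnorm(n_person, sd = 0.5)
dat$count <- rpois(
  nrow(dat),
  lambda = exp(0.5 + 0.7 * (dat$x == "b") + u[as.integer(dat$id)])
)

# Poisson GLMM with log link and a random intercept for person
fit <- glmer(count ~ x + (1 | id), data = dat, family = poisson)

# Fixed effects are on the log scale; exponentiate for a rate ratio
exp(fixef(fit)["xb"])

# Coefficient table, including Wald z statistics and p-values
coef(summary(fit))
```

Other families (e.g. `binomial` for a yes/no outcome) follow the same pattern; the choice should come from how the outcome is generated, not from what makes the fit look best.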


Alternatively, you could look into Bayesian hierarchical modelling, which is actually quite similar to mixed models, albeit a bit more difficult if you are not familiar with Bayesian statistics.

There are numerous approaches that try to model dependence or hierarchy. I am not familiar with "Cluster-correlated robust estimates of variance", but a mixed model with a nested structure is essentially a hierarchical model.
