Statistical Test – What Statistical Test for Paired Data at Multiple Time Points?

Tags: anova, hypothesis testing, mixed model, t-test

I have paired data (right and left measurements from the same subject) taken at different times. For the sake of the argument let's say the dependent variable (y) follows a normal distribution and we'll use parametric tests.

I only have data for 10 subjects. I'm interested in knowing if there is a difference between right and left measurements at each time point and also whether this difference changes at other time points.

I can answer the first part of the question with a paired t-test, but how do I address the second? With an ANOVA and post-hoc t-tests of the within-time differences?

Would a linear mixed model also work – or not for such a small sample?

An example of the data setup is:

[image: table of example data with left and right measurements at time points 1-3]

Best Answer

As you already suggested yourself, you can model this problem with a linear mixed model (LMM). I don't think the sample size is small for an LMM at all. In fact, you'd have a much larger sample size than if you were to split the data and perform the tests you proposed on subsets (I advise against doing this in general).

You will want to reshape your data to long format for this.$^\dagger$ Long format also has the nice property that each column represents a variable: left, right, and time points 1-3 in your example are really categories of the variables side and time.

$\dagger$: I have included the $\textsf{R}$ code at the bottom of this answer to retain readability.
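As a rough sketch of that reshaping step: the question does not give the actual column names, so the wide layout below (one row per subject, hypothetical columns named `left_1`, `right_1`, ...) is an assumption. The values are subjects 1-3 of the example data, and only base R is used.

```r
# Wide layout: one row per subject, one column per side/time combination.
# Column names are hypothetical; values are subjects 1-3 of the example data.
wide <- data.frame(
  subject = factor(1:3),
  left_1 = c(52, 103, 169), right_1 = c(79, 97, 159),
  left_2 = c(91, 167, 184), right_2 = c(65, 124, 94),
  left_3 = c(137, 79, 115), right_3 = c(98, 117, 98)
)

value_cols <- setdiff(names(wide), "subject")

# Stack the measurement columns; side and time are parsed from the column names.
long <- data.frame(
  subject = rep(wide$subject, times = length(value_cols)),
  side    = factor(rep(sub("_.*", "", value_cols), each = nrow(wide))),
  time    = as.integer(rep(sub(".*_", "", value_cols), each = nrow(wide))),
  y       = unlist(wide[value_cols], use.names = FALSE)
)

head(long)
```

Each row of `long` is now a single measurement, identified by subject, side and time, which is the shape `lmer()` expects.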


Although you mention these data are just an example, from this example you could already draw most of your conclusions using a single plot, labeled by subject:

[figure: spaghetti plot of y against time, one line per subject and side]

There is no consistency in the line segments going up or down for either side ($\color{blue}{\text{left}}$ or $\color{red}{\text{right}}$) from one time point to the next. Hence it is unlikely you'll find any significant effects.

Using a random intercept for subject,$^\ddagger$ you'll indeed see that there is no effect of either time or side and there does not appear to be an interaction either. You could also try a random slope, or both, depending on what you believe to be more appropriate.

LMM <- lmer(y ~ side * factor(time) + (1 | subject), df)
summary(LMM)

This model compares all other combinations to the left side of subjects at time point $1$ (the reference levels, which is why the side coefficient is labeled sideright) and gives the following output:

Fixed effects:
                        Estimate Std. Error t value
(Intercept)               130.30      11.28  11.553
sideright                   0.90      15.22   0.059
factor(time)2             -10.30      15.22  -0.677
factor(time)3             -10.70      15.22  -0.703
sideright:factor(time)2   -11.10      21.52  -0.516
sideright:factor(time)3     4.40      21.52   0.204

All $t$-values are small, suggesting none of these effects are significant. You can formally test this using the lmerTest package, or by bootstrapping confidence intervals, as I did below:

confint(LMM, method = "boot")

Produces:

                            2.5 %    97.5 %
.sig01                    0.00000  22.36850
.sigma                   26.94602  40.81779
(Intercept)             110.53644 149.45369
sideright               -25.09184  32.50712
factor(time)2           -39.56829  18.37244
factor(time)3           -40.84085  17.97641
sideright:factor(time)2 -51.79007  28.77454
sideright:factor(time)3 -42.22617  43.29726

Apart from the intercept, all of the 95% CIs for the fixed effects include zero, meaning there are no significant deviations from the reference group. You could try to change the reference group, but make sure you correct for the number of comparisons you perform.
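One way to sidestep the multiple-comparisons issue for the question "does the left/right difference change over time?" is to test the interaction as a whole, by comparing the model with and without the `side:factor(time)` terms in a likelihood-ratio test. The following is only a sketch of that idea, assuming `lme4` is installed; the names `m_full`, `m_add` and `lrt` are mine, and with 10 subjects the chi-square approximation may well be optimistic.

```r
library("lme4")

# Example data from this answer, in long format.
df <- data.frame(
  y = c(79, 52, 97, 103, 159, 169, 157, 85, 167, 116, 171, 149, 107, 172, 96,
        132, 168, 163, 111, 162,
        65, 91, 124, 167, 94, 184, 107, 73, 182, 105, 145, 130, 135, 75, 88,
        71, 62, 133, 96, 171,
        98, 137, 117, 79, 98, 115, 164, 126, 151, 91, 168, 134, 118, 87, 108,
        163, 102, 133, 125, 131),
  side = rep(c("right", "left"), 30),
  time = rep(1:3, each = 20),
  subject = factor(rep(1:10, each = 2, times = 3)))

# Models with and without the side-by-time interaction.
m_full <- lmer(y ~ side * factor(time) + (1 | subject), df)
m_add  <- lmer(y ~ side + factor(time) + (1 | subject), df)

# anova() refits both models with maximum likelihood before comparing them.
lrt <- anova(m_add, m_full)
print(lrt)
```

The p-value in the `Pr(>Chisq)` column refers to both interaction coefficients jointly, so it is a single test rather than one per time point.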

Lastly, if you recreate this analysis for the actual data, note that time need not be modeled as a factor if you can assume y changes linearly with time. This eases interpretation and requires fewer parameters.
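To make the parameter count concrete, here is a small base-R sketch that compares the fixed-effects design matrices of the two codings on the example design (two sides, three time points). The object names `design`, `X_factor` and `X_linear` are just for illustration.

```r
# One row per side/time combination, as in the example design.
design <- expand.grid(side = factor(c("left", "right")), time = 1:3)

X_factor <- model.matrix(~ side * factor(time), design)  # time as categories
X_linear <- model.matrix(~ side * time, design)          # time as a linear trend

ncol(X_factor)      # number of fixed effects with time as a factor
ncol(X_linear)      # number of fixed effects with time as numeric
colnames(X_linear)
```

With numeric time the corresponding mixed model would be `lmer(y ~ side * time + (1 | subject), df)`, and the single `sideright:time` coefficient is then the difference in slope between the two sides. The saving grows with the number of time points.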


The $\textsf{R}$ code I used for this:

# your example data in long format
df <- data.frame(
  y = c(79, 52, 97, 103, 159, 169, 157, 85, 167, 116, 171, 149, 107, 172, 96, 
        132, 168, 163, 111, 162, 
        65, 91, 124, 167, 94, 184, 107, 73, 182, 105, 145, 130, 135, 75, 88, 
        71, 62, 133, 96, 171, 
        98, 137, 117, 79, 98, 115, 164, 126, 151, 91, 168, 134, 118, 87, 108, 
        163, 102, 133, 125, 131),
  side = rep(c("right", "left"), 30),
  time = rep(1:3, each = 20),
  subject = factor(rep(1:10, each = 2, times = 3)))

library("car")  # for qqPlot()
library("lme4") # for mixed models

# the model
LMM <- lmer(y ~ side * factor(time) + (1 | subject), df)

# some simple visual diagnostics
plot(LMM)
qqPlot(resid(LMM))

# summary, confidence intervals
summary(LMM)
confint(LMM, method = "boot")

# the spaghetti plot in my answer
# named colors (left = blue, right = red, as in the text); indexing by name
# works whether side is stored as character or as a factor
cols <- c(left = "blue", right = "red")
plot(y ~ time, df, pch = as.character(subject), col = cols[as.character(side)],
    main = "Spaghetti Plot", xaxt = "n", cex = 0.6, cex.axis = 0.8)
axis(1, 1:3, 1:3, cex.axis = 0.8) # to prevent e.g. 0.5, 1.0, 1.5, ...
# rows are ordered identically within each time point, so segments pair up
seg_cols <- cols[as.character(df$side[df$time == 1])]
segments(x0 = 1.1, x1 = 1.9, y0 = df$y[df$time == 1], y1 = df$y[df$time == 2],
         col = seg_cols)
segments(x0 = 2.1, x1 = 2.9, y0 = df$y[df$time == 2], y1 = df$y[df$time == 3],
         col = seg_cols)