Comparing the within-subject variance between two groups of subjects

If each subject has repeated a test multiple times, I am able to compute the mean and variance of their performance. (I am assuming the results to be normally distributed).

1) If the variance is what interests me most, is it OK to summarise the variance of a group of subjects by taking the mean of their individual variances?

2) Would it also be meaningful to look at the variance of the subject variances?

3) I believe this (or a similar) statistic would be required if I were to use a t-test to assess whether the difference in variance between groups is statistically significant. Is that right?

Very simple example (edited to correct the variances as noted in one answer)

Andrew: 5, 2, 7 (var ~~4.22~~ 6.33)

Bob: 3, 3, 2 (var ~~0.22~~ 0.33)

Charles: -2, 1, 0 (var ~~1.55~~ 2.33)

Diane: 4, 2, 3 (var ~~0.66~~ 1.00)

Elsa: 6, -4, -1 (var ~~17.55~~ 26.3)

Fran: 6, 0, 3 (var ~~6.00~~ 9.00)

Male mean variance (Andrew, Bob and Charles): ~~2.00~~ 3.00

Female mean variance (Diane, Elsa and Fran): ~~8.07~~ 12.1

Question: Is there a significant difference between the male and female within-subject variance for this test?
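As a quick check, the per-subject variances and group means quoted in the example can be reproduced in base R. This is only a minimal sketch: the names `scores`, `group`, `subject_var`, `mean_var` and `var_of_var` are chosen here for illustration and do not come from the original post.

```r
# Scores from the example above, one vector per subject
scores <- list(
  Andrew  = c(5, 2, 7),
  Bob     = c(3, 3, 2),
  Charles = c(-2, 1, 0),
  Diane   = c(4, 2, 3),
  Elsa    = c(6, -4, -1),
  Fran    = c(6, 0, 3)
)

# Group membership as stated in the example
group <- c(Andrew = "male", Bob = "male", Charles = "male",
           Diane = "female", Elsa = "female", Fran = "female")

# Sample variance (n - 1 denominator) within each subject
subject_var <- sapply(scores, var)

# Question 1: mean of the within-subject variances in each group
mean_var <- tapply(subject_var, group[names(subject_var)], mean)

# Question 2: variance of the within-subject variances in each group
var_of_var <- tapply(subject_var, group[names(subject_var)], var)

print(round(subject_var, 2))
print(round(mean_var, 2))
print(round(var_of_var, 2))
```

Because every subject here has the same number of tests, the mean of the subject variances equals the pooled within-subject variance for the group, which is one reason it can be a reasonable summary.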

Best Answer

You can test this by fitting a linear mixed model. A linear mixed model is like a multiple regression model, but it can also include random effects. The random effects part is needed because you have multiple tests per subject. You then model score as a function of sex and test, with subject as the random effect. Let's enter your test data in R:

    subject <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6)
    score <- c(5,2,7,3,3,2,-2,1,0,4,2,3,6,-4,-1,6,0,3)
    test <- rep(c(1,2,3),6)
    sex <- c(rep(1,9), rep(0,9))  # 1 = male (subjects 1-3), 0 = female (subjects 4-6)
    testdata <- data.frame(subject, score, test, sex)
    testdata

The data would look like this:

       subject score test sex
    1        1     5    1   1
    2        1     2    2   1
    3        1     7    3   1
    4        2     3    1   1
    5        2     3    2   1
    6        2     2    3   1
    7        3    -2    1   1
    8        3     1    2   1
    9        3     0    3   1
    10       4     4    1   0
    11       4     2    2   0
    12       4     3    3   0
    13       5     6    1   0
    14       5    -4    2   0
    15       5    -1    3   0
    16       6     6    1   0
    17       6     0    2   0
    18       6     3    3   0

Now we fit two models. Both models use sex and test (1-3 in this case) as fixed effects and subject as a random effect. The difference between the models is that in the second model, the residual (within-subject) variance is allowed to differ between women and men. We then compare the models using the anova() command. If there is a significant difference, this indicates that the more complex model (the one with a separate variance per sex) provides a better fit, and we thus have indirect evidence that the difference in variance is statistically significant:

    library(nlme)  # provides lme() and varIdent()

    # m1: one residual variance shared by both sexes
    m1 <- lme(score ~ sex + factor(test), random=~1|subject, data=testdata)
    # m2: a separate residual variance for each sex
    m2 <- lme(score ~ sex + factor(test), random=~1|subject,
              weights=varIdent(form=~1|sex), data=testdata)
    anova(m1, m2)

       Model df      AIC      BIC    logLik   Test    L.Ratio p-value
    m1     1  6 167.8734 176.6679 -77.93673
    m2     2  7 169.8450 180.1051 -77.92247 1 vs 2 0.02850744  0.8659

There was no significant difference in this example. But if we change the last score of the last female subject (from 3 to -17, which increases her variance), refit the models and run the comparison again:

    score <- c(5,2,7,3,3,2,-2,1,0,4,2,3,6,-4,-1,6,0,-17)
    testdata <- data.frame(subject, score, test, sex)
    # refit m1 as well, so that both models are fitted to the modified data
    m1 <- lme(score ~ sex + factor(test), random=~1|subject, data=testdata)
    m2 <- lme(score ~ sex + factor(test), random=~1|subject,
              weights=varIdent(form=~1|sex), data=testdata)
    anova(m1, m2)

       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
    m1     1  6 105.3948 109.2291 -46.69739
    m2     2  7 102.7737 107.2471 -44.38685 1 vs 2 4.621064  0.0316

Now m2 has the lower AIC and the higher log-likelihood, and the likelihood-ratio test gives a low p-value (0.0316), which indicates a difference in variance between the sexes.
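To make the comparison less of a black box, the likelihood-ratio test reported by anova() can be recomputed by hand from the two log-likelihoods. The sketch below copies the logLik values from the second anova() output above; the variable names `loglik_m1`, `loglik_m2`, `lr_stat` and `p_value` are illustrative only. It assumes the usual chi-squared reference distribution with one degree of freedom, since m2 has one more parameter than m1 (df 7 versus 6).

```r
# Log-likelihoods copied from the second anova() table
loglik_m1 <- -46.69739  # m1: common residual variance
loglik_m2 <- -44.38685  # m2: separate residual variance per sex

# Likelihood-ratio statistic: twice the gain in log-likelihood
lr_stat <- 2 * (loglik_m2 - loglik_m1)

# Upper-tail chi-squared probability with 1 degree of freedom
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

print(lr_stat)
print(p_value)
```

The two printed values should agree, up to rounding, with the L.Ratio and p-value columns in the anova() output.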
