Solved – Paired t-test and correlation

paired-data

Say I have $n$ pairs of observations. I can run a paired t-test to check whether the mean difference is significantly different from zero. I can also look at the correlation coefficient between the two sets of observations. What could cause a low correlation coefficient together with a statistically insignificant difference in means? For context, I measured the same units twice and want to see how similar the results of the two measurements are.

Best Answer

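The correlation coefficient and the paired t-test measure different things: the t-test asks whether the mean of the paired differences is zero, while the correlation asks whether the two sets of measurements vary together linearly. The two results don't need to agree in terms of statistical significance. Consider the following four scenarios, coded in R.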

# same mean, no correlation
# t.test Not significant
# cor.test Not significant
options(scipen = 99)
set.seed(1)
s1 <- rnorm(20, 0, 1)
s2 <- rnorm(20, 0, 1)
t.test(s1, s2, paired = TRUE)$p.value
cor.test(s1, s2)$p.value

# different means, no correlation
# t.test Significant
# cor.test Not significant
set.seed(2)
s1 <- rnorm(20, 0, 1)
s2 <- rnorm(20, 2, 1)
t.test(s1, s2, paired = TRUE)$p.value
cor.test(s1, s2)$p.value

# different means, high correlation
# t.test Significant
# cor.test Significant
set.seed(3)
s1 <- rnorm(20)
s2 <- s1 + 2 + rnorm(20, 0, 0.5)
t.test(s1, s2, paired = TRUE)$p.value
cor.test(s1, s2)$p.value

# same means, high correlation
# t.test Not significant
# cor.test Significant
set.seed(4)
s1 <- rnorm(20)
s2 <- s1 + rnorm(20, 0, 0.5)
t.test(s1, s2, paired = TRUE)$p.value
cor.test(s1, s2)$p.value
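
One way to see why the two results can disagree: the paired t-test is a one-sample t-test on the differences, so it reacts to a constant offset between the samples, whereas the correlation does not change when a constant is added to either sample. The sketch below reuses the fourth scenario; `shift` and `s2_shifted` are illustrative names that are not part of the scenarios above.

```r
# Illustrative sketch: a constant offset changes the paired t-test
# but leaves the correlation untouched.
set.seed(4)
s1 <- rnorm(20)
s2 <- s1 + rnorm(20, 0, 0.5)

shift <- 2                 # hypothetical systematic offset between measurements
s2_shifted <- s2 + shift

# The paired t-test is the same as a one-sample t-test on the differences
p_paired <- t.test(s1, s2, paired = TRUE)$p.value
p_diff   <- t.test(s1 - s2)$p.value
print(all.equal(p_paired, p_diff))

# The correlation is unaffected by the offset
r_same    <- cor(s1, s2)
r_shifted <- cor(s1, s2_shifted)
print(all.equal(r_same, r_shifted))

# The paired t-test is affected by it
p_shifted <- t.test(s1, s2_shifted, paired = TRUE)$p.value
print(c(p_paired = p_paired, p_shifted = p_shifted))
```

Both `all.equal` checks should report `TRUE`, while the p-value of the paired t-test drops sharply once the offset is added.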

Not seeing a significant correlation between your two measurements may be a sign that the measurement error is high relative to the true variation between units. The spread of the units in your sample should resemble what you would see in practice, and it needs to be much larger than the measurement error for a correlation to be detectable with only 20 pairs. Consider this final example, where the measurement error is high in sample 2.

# same means, low correlation because
# high measurement error in sample 2
# t.test Not significant
# cor.test Not significant
set.seed(5)
s1 <- rnorm(20)
s2 <- s1 + rnorm(20, 0, 3)
t.test(s1, s2, paired = TRUE)$p.value
cor.test(s1, s2)$p.value
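
The size of this effect can be worked out directly. As a sketch, assume each measurement is a true value $T$ with variance $\sigma_T^2$ plus independent measurement error with variances $\sigma_{e_1}^2$ and $\sigma_{e_2}^2$ (this notation is introduced here for illustration and does not appear in the code above). The population correlation between the two measurements is then

$$\rho = \frac{\sigma_T^2}{\sqrt{(\sigma_T^2 + \sigma_{e_1}^2)(\sigma_T^2 + \sigma_{e_2}^2)}}$$

In the final example $\sigma_T^2 = 1$, $\sigma_{e_1}^2 = 0$ and $\sigma_{e_2}^2 = 9$, so $\rho = 1/\sqrt{10} \approx 0.32$. In the two high-correlation scenarios $\sigma_{e_2}^2 = 0.25$, which gives $\rho = 1/\sqrt{1.25} \approx 0.89$. With 20 pairs, a sample correlation has to exceed roughly 0.44 in absolute value to be significant at the 5% level, so a true correlation of about 0.32 will often go undetected.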