I have a 2×3 within-subjects design, with two different dependent variables (DVs). I would like to know if the two DVs are correlated or not.
Here is an example of what the data look like, e.g. a data frame in R:
# Make some data:
set.seed(1154)
data <- data.frame(id = gl(10, 6),
                   factor1 = gl(2, 3, labels = c("A", "B")),
                   factor2 = gl(3, 1),
                   DV1 = rnorm(60),
                   DV2 = rnorm(60))
head(data)
# Output:
# id factor1 factor2 DV1 DV2
# 1 1 A 1 0.255579320 1.72318604
# 2 1 A 2 0.133878731 -0.32694875
# 3 1 A 3 0.890576655 0.14834580
# 4 1 B 1 -0.007879094 -0.07145311
# 5 1 B 2 0.976311664 -0.40686813
# 6 1 B 3 0.701357069 -0.50813556
In R, I could do something like:
cor.test(data$DV1, data$DV2) # p = 0.048, significant
but there seem to be two problems with that.
First problem: the observations are not independent (each participant contributes six rows, so the first 6 values of each DV come from the same participant in the experiment).
Second problem: we want to generalize from the sample to the population, so each id should appear only once, e.g.:
# We want:
# id factor1 factor2 DV1 DV2
# 1 X X ... ...
# 2 X X ...
# 3 ...
# So:
library(plyr)
data2 <- ddply(data, .(id), summarize, mean.DV1=mean(DV1), mean.DV2=mean(DV2))
head(data2)
# Output:
# id mean.DV1 mean.DV2
# 1 1 0.49163739 0.09302105
# 2 2 0.66030997 -0.09344809
# 3 3 0.38277688 0.20274906
# 4 4 -0.35217913 0.57308528
# 5 5 -0.13470820 0.26663012
# 6 6 -0.04756911 0.60406950
Now I can look for a correlation and the responses are independent, but I have lost the individual factor levels.
cor.test(data2$mean.DV1, data2$mean.DV2) # p = .15, not significant
What is the correct way to check for a correlation between the two dependent variables (using R)?
Best Answer
I think your request for the "overall correlation" may be asking the wrong question. If you already know that you have varied factor1 and factor2, the correlations you want to look for are conditional on each combination of those factors. It is unlikely that the independent variables have exactly zero effect on the dependent variables, so the overall correlation actually carries less information than the correlation within each combination.
R code:
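A minimal sketch of one way to compute the correlation within each factor1 × factor2 cell, assuming the `data` data frame from the question. The names `cells` and `cell_cors` are made up for illustration, and this is not necessarily the exact approach intended above:

```r
# Rebuild the question's example data so the snippet is self-contained
set.seed(1154)
data <- data.frame(id = gl(10, 6),
                   factor1 = gl(2, 3, labels = c("A", "B")),
                   factor2 = gl(3, 1),
                   DV1 = rnorm(60),
                   DV2 = rnorm(60))

# Split the rows into the six factor1 x factor2 cells
cells <- split(data, list(data$factor1, data$factor2))

# Within a cell each participant contributes exactly one row,
# so the 10 observations per cell come from 10 different participants
cell_cors <- do.call(rbind, lapply(names(cells), function(cell) {
  test <- cor.test(cells[[cell]]$DV1, cells[[cell]]$DV2)
  data.frame(cell = cell,
             n = nrow(cells[[cell]]),
             r = unname(test$estimate),
             p = test$p.value)
}))
print(cell_cors)
```

Because every participant appears once per cell, each of these tests avoids both problems raised in the question. With six tests, some correction for multiple comparisons (for example via `p.adjust`) is probably worth considering.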