Solved – R: compute correlation by group

correlation, r

In R, I have a data frame comprising a class label C (a factor) and two measurements, M1 and M2. How do I compute the correlation between M1 and M2 within each class?

Ideally, I'd get back a data frame with one row for each class and two columns: the class label C and the correlation.

Best Answer

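The plyr package handles this kind of split-apply-combine task directly: ddply splits a data frame by a grouping variable, applies a function to each piece, and combines the results into a new data frame.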

Here is a simple solution:

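# Example data: 4 groups of 100 rows each, with two random measurements
xx <- data.frame(group = rep(1:4, 100), a = rnorm(400), b = rnorm(400))
head(xx)

library(plyr)

# Correlation between a and b for one group's subset of rows
func <- function(df) {
  data.frame(COR = cor(df$a, df$b))
}

# Split xx by group, apply func to each piece, and combine the results
ddply(xx, .(group), func)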

The output will look something like this (the exact values change from run to run because the data are random):

  group         COR
1     1  0.05152923
2     2 -0.15066838
3     3 -0.04717481
4     4  0.07899114
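
If you would rather not load an extra package, the same result can probably be had in base R. The following is only a sketch under some assumptions: the data frame name dat and its simulated values are made up for illustration, and the columns are named C, M1 and M2 as in the question. It uses split to break the data frame up by class and vapply to compute one correlation per piece.

```r
# Hypothetical example data using the question's column names (C, M1, M2)
set.seed(1)
dat <- data.frame(
  C  = factor(rep(c("a", "b", "c"), each = 50)),
  M1 = rnorm(150)
)
dat$M2 <- dat$M1 + rnorm(150)

# Split the rows by class; the result is a list with one data frame per level of C
pieces <- split(dat, dat$C)

# One correlation per class, returned as a named numeric vector
cors <- vapply(pieces, function(d) cor(d$M1, d$M2), numeric(1))

# Assemble the requested shape: one row per class, columns C and COR
result <- data.frame(C = names(cors), COR = unname(cors))
result
```

Note that C comes back as a character column here; wrap it in factor() if you need the class label to stay a factor.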