Solved – Obtaining covariance matrix from correlation matrix

correlation, covariance, r, simulation

I am trying to figure out how to convert a correlation matrix (R) to a covariance matrix (S) for input into a random number generator that only accepts S (rmvnorm from the mvtnorm package in R).

library("mvtnorm") 

TRUTH= 0.8 # target correlation value between X1 and X2
R <- as.matrix(data.frame(c(1, TRUTH), c(TRUTH, 1)))
V <- diag(c(sqrt(1), sqrt(1))) # diagonal matrix of sqrt(variances)
S <- V %*% R %*% V
cor(rmvnorm(100, sigma=S) )

# repeat this to get an idea of the variance around Pearson's estimator
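The repetition suggested in the comment above could be sketched as follows. This is only an illustration: `sim_cor` is a hypothetical helper, the seed and the 1000 repetitions are arbitrary, and the draws use a Cholesky factor in base R rather than `rmvnorm` so that the sketch does not depend on mvtnorm being installed (`rmvnorm(n, sigma = S)` could be substituted for the two lines that build `X`).

```r
set.seed(1)  # arbitrary seed, only for reproducibility

TRUTH <- 0.8
R <- matrix(c(1, TRUTH, TRUTH, 1), nrow = 2)
V <- diag(c(sqrt(3), sqrt(2)))   # standard deviations on the diagonal
S <- V %*% R %*% V               # covariance matrix

# Hypothetical helper: draw n bivariate normal rows with covariance S
# and return the sample correlation between the two columns.
sim_cor <- function(n, S) {
  Z <- matrix(rnorm(n * ncol(S)), nrow = n)  # independent N(0, 1) draws
  X <- Z %*% chol(S)                         # rows now have covariance S
  cor(X)[1, 2]
}

r_hat <- replicate(1000, sim_cor(100, S))
mean(r_hat)  # should be close to TRUTH
sd(r_hat)    # spread of Pearson's estimator at n = 100
```

Because correlation is unaffected by rescaling each variable, the distribution of `r_hat` should not depend on which variances are placed in `V`.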

An instance where the variances are not equal to 1:

V <- diag(c(sqrt(3), sqrt(2))) 
S <- V %*% R %*% V
cor(rmvnorm(100, sigma=S) )

This seems to be correct, but I would like expert criticism.

Best Answer

Let $R$ be the correlation matrix and $S$ the vector of standard deviations, so that $S\cdot S$ (where $\cdot$ is the componentwise product) is the vector of variances. Note that $S$ here is not the covariance matrix that the question calls S. Then $$ \operatorname{diag}(S)\, R\, \operatorname{diag}(S) $$ is the covariance matrix. This is fully explained here.
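As a worked instance, take the question's second case, with variances 3 and 2 and correlation 0.8. Entrywise, the product multiplies each correlation $R_{ij}$ by the two standard deviations involved, $S_i S_j R_{ij}$, which gives

$$
\begin{pmatrix}\sqrt{3} & 0\\ 0 & \sqrt{2}\end{pmatrix}
\begin{pmatrix}1 & 0.8\\ 0.8 & 1\end{pmatrix}
\begin{pmatrix}\sqrt{3} & 0\\ 0 & \sqrt{2}\end{pmatrix}
=
\begin{pmatrix}3 & 0.8\sqrt{6}\\ 0.8\sqrt{6} & 2\end{pmatrix}
\approx
\begin{pmatrix}3 & 1.960\\ 1.960 & 2\end{pmatrix}
$$

The diagonal carries the variances, and the off-diagonal entry is the covariance $0.8\sqrt{3}\sqrt{2}$.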

This can be implemented in R as

cor2cov_1 <- function(R,S){
    diag(S) %*% R %*% diag(S)
}

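but this forms two full matrix products, which becomes wasteful as the dimension grows. An implementation that only rescales the rows and columns of R is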

cor2cov <- function(R, S) {
 sweep(sweep(R, 1, S, "*"), 2, S, "*")
 }

You can check for yourself that these, and the two one-line alternatives below, give the same result.

TRUTH= 0.8 
R <- as.matrix(data.frame(c(1, TRUTH), c(TRUTH, 1)))
S = c(sqrt(1), sqrt(1))

cor2cov_1(R,S)

cor2cov(R,S)

outer(S,S) * R 

smat = as.matrix(S)
R * smat %*% t(smat)
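Rather than comparing printed output by eye, the agreement can be checked numerically. The following is a minimal sketch, assuming unequal standard deviations (taken from the question's second example) so that the check is not trivially satisfied; it also round-trips the result through `cov2cor` from the stats package to confirm that the original correlation matrix is recovered.

```r
TRUTH <- 0.8
R <- matrix(c(1, TRUTH, TRUTH, 1), nrow = 2)
S <- c(sqrt(3), sqrt(2))  # standard deviations (unequal, as in the question)

cor2cov_1 <- function(R, S) diag(S) %*% R %*% diag(S)
cor2cov   <- function(R, S) sweep(sweep(R, 1, S, "*"), 2, S, "*")

Sigma <- cor2cov_1(R, S)

# all.equal compares numerically, with a small tolerance for rounding error
all.equal(Sigma, cor2cov(R, S))
all.equal(Sigma, outer(S, S) * R)

# Round trip: cov2cor should recover the correlation matrix
all.equal(cov2cor(Sigma), R)
```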

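Here is a microbenchmark of these variants on the 2×2 example. Note that at this size the sweep-based cor2cov is by far the slowest (median about 118 µs, against 1.4 to 2.3 µs for the others), so any advantage it has would have to come from larger matrices: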

library(microbenchmark)
microbenchmark(outer(S,S) * R, cor2cov_1(R,S), cor2cov(R,S), R * smat %*% t(smat), times = 10000)

Unit: microseconds
                 expr     min      lq       mean  median      uq      max neval cld
      outer(S, S) * R   1.968   2.214   2.724639   2.337   2.460 3611.362 10000  a 
      cor2cov_1(R, S)   1.722   1.886   2.778045   1.968   2.091 3743.259 10000  a 
        cor2cov(R, S) 113.037 116.071 125.844711 118.039 120.663 5462.020 10000   b
 R * smat %*% t(smat)   1.066   1.230   1.422712   1.435   1.517   12.177 10000  a
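The timings above are for a 2×2 matrix, where fixed function-call overhead is likely to dominate. To see how the variants compare at a larger size, something like the following sketch could be run. The dimension `p`, the simulated `R_big` and `S_big`, the seed, and the repetition count are all arbitrary assumptions, and no particular ordering of the timings is claimed here; `system.time` from base R is used in place of microbenchmark to keep the sketch dependency-free.

```r
set.seed(42)  # arbitrary seed

p <- 300                                  # hypothetical dimension
X <- matrix(rnorm(1000 * p), ncol = p)    # 1000 simulated observations
R_big <- cor(X)                           # a valid p x p correlation matrix
S_big <- runif(p, min = 0.5, max = 2)     # arbitrary standard deviations

cor2cov_1 <- function(R, S) diag(S) %*% R %*% diag(S)
cor2cov   <- function(R, S) sweep(sweep(R, 1, S, "*"), 2, S, "*")

# Rough wall-clock timings over 20 repeated calls of each variant
t_matmul <- system.time(for (i in 1:20) cor2cov_1(R_big, S_big))
t_sweep  <- system.time(for (i in 1:20) cor2cov(R_big, S_big))
t_outer  <- system.time(for (i in 1:20) outer(S_big, S_big) * R_big)

c(matmul = t_matmul[["elapsed"]],
  sweep  = t_sweep[["elapsed"]],
  outer  = t_outer[["elapsed"]])
```

Results will vary with the machine and the linear algebra library R is linked against, so it is worth repeating this at the matrix sizes you actually need.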