R Programming – Solving Non-Positive Definite Matrix Problem for Desired Correlation Structure

correlationmultivariate normal distributionrsimulation

I want to derive a correlation matrix such that block1 is 0.1 within itself, block2 is 0.1 within itself and 0.7 with block1, and the remaining variables are 0.01 within itself and with other blocks as follows. But I am facing non positive definite matrix problem. How can I get around this problem by preserving existing relationship structures?

library(mvtnorm)
library(lqmm) 

blck1= 1:3
blck2= 4:7

Sigma <- diag(10)
# Block 1: correlated variables
Sigma[c(blck1, blck2), ] <- 0.1
# Block 2: high correlated variables
Sigma[blck2, blck1] <- 0.7
# Block 3: low correlated variables
block3_p <- 10 - length(c(blck1, blck2))
Sigma[-c(blck1, blck2), ] <- 0.01
# Fix diagonal
diag(Sigma) <- 1
# Make symmetric
Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(Sigma)]

is.positive.definite(Sigma)
[1] FALSE

Z <- rmvnorm(n= 50, 
             mean  = rep(0, 10), 
             sigma = Sigma )


Warning message:
In rmvnorm(n = 50, mean = rep(0, 10), sigma = Sigma) :
  sigma is numerically not positive semidefinite

Best Answer

You have stumbled into the fact that you cannot simply make a correlation matrix by assembling individually valid pairwise correlations. There are many questions on the site related to this, have a look at

It might help with an intuitive example: Three persons are running along a linear road. It is impossible for all three of them to run in opposite directions, as there are only two directions!. So the three running velocities cannot all be negatively correlated!

But this example have negative and positive correlations, your have only positive. So let us extend it: Let A, B and C all run in the same direction. If A and B run with exactly the same speed, the correlation of C with those two must be equal. And, extending by continuity, if A and B have very similar speeds, C's correlations with them cannot be very dissimilar.

So the matrix you have assembled is simply not a valid correlation matrix! Maybe you should rather ask about your real problem, and tell us from where your correlations come? In fact, you must make rather large modifications to your matrix to make it valid. Some code (R):

make_Sigma <- function(rho1=0.1,  rho2=0.7,  rho3=0.01) {
blck1= 1:3
blck2= 4:7

Sigma <- diag(10)
Sigma[c(blck1, blck2), ] <- rho1
Sigma[blck2, blck1] <- rho2
Sigma[-c(blck1, blck2), ] <- rho3
diag(Sigma) <- 1
Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(Sigma)]
Sigma
} 

(your code from the Q assembled as a function, to facilitate experimentation). Then use that a valid correlation matrix cannot have negative eigenvalues:

min(eigen(make_Sigma(rho2=0.7), only.values=TRUE)$values)
[1] -1.17539

min(eigen(make_Sigma(rho2=0.35), only.values=TRUE)$values)
[1] 0.03652832

But maybe some smaller modifications on the smaller correlations is enough?

Related Question