R Programming – Solving Non-Positive Definite Matrix Problem for Desired Correlation Structure

Tags: correlation, multivariate normal distribution, r, simulation

I want to build a correlation matrix in which the variables in block 1 have correlation 0.1 with each other, the variables in block 2 have correlation 0.1 with each other and 0.7 with block 1, and the remaining variables have correlation 0.01 with each other and with both blocks, as in the code below. However, the resulting matrix is not positive definite. How can I get around this problem while preserving the existing correlation structure?

library(mvtnorm)
library(lqmm) 

blck1= 1:3
blck2= 4:7

Sigma <- diag(10)
# Block 1: correlated variables
Sigma[c(blck1, blck2), ] <- 0.1
# Block 2: high correlated variables
Sigma[blck2, blck1] <- 0.7
# Block 3: low correlated variables
block3_p <- 10 - length(c(blck1, blck2))
Sigma[-c(blck1, blck2), ] <- 0.01
# Fix diagonal
diag(Sigma) <- 1
# Make symmetric
Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(Sigma)]

is.positive.definite(Sigma)
[1] FALSE

Z <- rmvnorm(n= 50, 
             mean  = rep(0, 10), 
             sigma = Sigma )


Warning message:
In rmvnorm(n = 50, mean = rep(0, 10), sigma = Sigma) :
  sigma is numerically not positive semidefinite

Best Answer

You have stumbled into the fact that you cannot build a correlation matrix simply by assembling individually valid pairwise correlations. There are many questions on the site related to this.

An intuitive example might help: three people are running along a straight road. It is impossible for each of the three to run in the opposite direction to both of the others, as there are only two directions! So the three running velocities cannot all be negatively correlated with each other.

But this example mixes negative and positive correlations, while yours has only positive ones. So let us extend it: let A, B and C all run in the same direction. If A and B run at exactly the same speed, C's correlations with the two of them must be equal. And, extending by continuity, if A and B have very similar speeds, C's correlations with them cannot be very dissimilar.
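The runner argument can be made quantitative. As a minimal sketch (the helper name `cor_range` is made up for this illustration): a 3×3 correlation matrix is valid only if its determinant $1 + 2 r_{AB} r_{AC} r_{BC} - r_{AB}^2 - r_{AC}^2 - r_{BC}^2$ is non-negative, and solving that inequality for $r_{BC}$ gives the interval computed below.

```r
# Admissible range for cor(B, C) once cor(A, B) and cor(A, C) are fixed,
# derived from requiring a non-negative determinant of the 3x3 correlation matrix.
# `cor_range` is a hypothetical helper, not part of any package.
cor_range <- function(r_ab, r_ac) {
  half_width <- sqrt((1 - r_ab^2) * (1 - r_ac^2))
  c(lower = r_ab * r_ac - half_width,
    upper = r_ab * r_ac + half_width)
}

cor_range(r_ab = 1,    r_ac = 0.3)  # interval collapses to the single value 0.3
cor_range(r_ab = 0.99, r_ac = 0.3)  # an interval centred at 0.99 * 0.3
cor_range(r_ab = 0,    r_ac = 0.3)  # much wider: A and B are unrelated
```

The closer `r_ab` is to 1, the less freedom remains for the third correlation, which is the continuity argument in numbers.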

So the matrix you have assembled is simply not a valid correlation matrix! Maybe you should rather ask about your real problem, and tell us where your correlations come from. In fact, you must make rather large modifications to your matrix to make it valid. Some code (R):

make_Sigma <- function(rho1 = 0.1, rho2 = 0.7, rho3 = 0.01) {
    blck1 <- 1:3
    blck2 <- 4:7

    Sigma <- diag(10)
    Sigma[c(blck1, blck2), ] <- rho1                        # within-block correlations
    Sigma[blck2, blck1] <- rho2                             # block 2 with block 1
    Sigma[-c(blck1, blck2), ] <- rho3                       # remaining variables
    diag(Sigma) <- 1                                        # fix diagonal
    Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(Sigma)]   # make symmetric
    Sigma
}

(This is your code from the question wrapped in a function, to facilitate experimentation.) Then use the fact that a valid correlation matrix cannot have negative eigenvalues:

min(eigen(make_Sigma(rho2=0.7), only.values=TRUE)$values)
[1] -1.17539

min(eigen(make_Sigma(rho2=0.35), only.values=TRUE)$values)
[1] 0.03652832

But maybe some smaller modifications to the smaller correlations would be enough?
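One way to explore how far each correlation would have to move is to search for the point where the smallest eigenvalue crosses zero. The following is only a sketch under the block layout used above: `min_eig` is a helper introduced here for illustration, and the search intervals passed to `uniroot()` are assumptions chosen so that the smallest eigenvalue changes sign inside them.

```r
# Same construction as make_Sigma() above, repeated so this block runs on its own.
make_Sigma <- function(rho1 = 0.1, rho2 = 0.7, rho3 = 0.01) {
  blck1 <- 1:3
  blck2 <- 4:7
  Sigma <- diag(10)
  Sigma[c(blck1, blck2), ] <- rho1
  Sigma[blck2, blck1] <- rho2
  Sigma[-c(blck1, blck2), ] <- rho3
  diag(Sigma) <- 1
  Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(Sigma)]
  Sigma
}

# Smallest eigenvalue; a valid correlation matrix needs this to be >= 0.
# (`min_eig` is a hypothetical helper.)
min_eig <- function(...) {
  min(eigen(make_Sigma(...), symmetric = TRUE, only.values = TRUE)$values)
}

# Largest between-block correlation that still works with rho1 = 0.1.
rho2_max <- uniroot(function(r) min_eig(rho2 = r), interval = c(0.1, 0.7))$root

# Smallest within-block correlation that still works with rho2 = 0.7.
rho1_min <- uniroot(function(r) min_eig(rho1 = r), interval = c(0.1, 0.9))$root

c(rho2_max = rho2_max, rho1_min = rho1_min)
```

With these settings the first boundary comes out at roughly 0.36 and the second at roughly 0.58, so keeping the 0.7 between the blocks would mean raising the within-block correlations from 0.1 to well over 0.5, which is hardly a small modification.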

Related Solutions

Solved – Create positive-definite 3×3 covariance matrix given specified correlation values

To follow up on @cardinal's comment: your $x$, $y$, and $z$ define a $(3 \times 3)$ correlation matrix $\boldsymbol{R}$. Since a correlation matrix is also a possible covariance matrix (of standardized variables), it has to be positive semidefinite, and it is positive definite exactly when all of its eigenvalues are $> 0$. If $\boldsymbol{R}$ is indeed positive definite, then every vector $\boldsymbol{s}$ of variances (i.e., numbers $> 0$) turns $\boldsymbol{R}$ into a positive definite covariance matrix $\boldsymbol{\Sigma} = \boldsymbol{D}_{s}^{1/2} \boldsymbol{R} \boldsymbol{D}_{s}^{1/2}$, where $\boldsymbol{D}_{s}^{1/2}$ is the square root of the diagonal matrix made from $\boldsymbol{s}$.

So just construct $\boldsymbol{R}$ from $x, y, z$, and check whether the eigenvalues are all $> 0$. If so, you're good, and you can transform any set of data to have a corresponding covariance matrix with arbitrary variances:

x <- 0.5
y <- 0.3                            # changing this to -0.6 makes it not pos.def.
z <- 0.4
R <- matrix(numeric(3*3), nrow=3)   # will be the correlation matrix
diag(R) <- 1                        # set diagonal to 1
R[upper.tri(R)] <- c(x, y, z)       # fill in x, y, z to upper right
R[lower.tri(R)] <- c(x, y, z)       # fill in x, y, z to lower left
eigen(R)$values                     # get eigenvalues to check if pos.def.

gives

[1] 1.8055810 0.7124457 0.4819732

So our $\boldsymbol{R}$ here is positive definite. Now construct the corresponding covariance matrix from arbitrary variances.

vars  <- c(4, 16, 9)                # the variances
Sigma <- diag(sqrt(vars)) %*% R %*% diag(sqrt(vars))

Generate some data matrix $\boldsymbol{X}$ that we will later transform to have exactly that covariance matrix.

library(mvtnorm)                    # for rmvnorm()
N  <- 100                           # number of simulated observations
mu <- c(1, 2, 3)                    # some arbitrary centroid
X  <- round(rmvnorm(n=N, mean=mu, sigma=Sigma))

To do that, we first orthonormalize the centered matrix $\boldsymbol{X}$, giving a matrix $\boldsymbol{Y}$ with orthonormal columns, whose covariance matrix is therefore proportional to $\boldsymbol{I}$ (identity).

orthGS <- function(X) {             # implement Gram-Schmidt algorithm
    Id <- diag(nrow(X))
    for(i in 2:ncol(X)) {
        A <- X[ , 1:(i-1), drop=FALSE]
        Q <- qr.Q(qr(A))
        P <- tcrossprod(Q)
        X[ , i] <- (Id-P) %*% X[ , i]
    }
    scale(X, center=FALSE, scale=sqrt(colSums(X^2)))
}

Xctr <- scale(X, center=TRUE, scale=FALSE)  # centered version of X
Y    <- orthGS(Xctr)                        # Y is orthonormal

Transform matrix $\boldsymbol{Y}$ to have covariance matrix $\boldsymbol{\Sigma}$ and centroid $\boldsymbol{\mu}$.

Edit: what's going on here: do a spectral decomposition $\boldsymbol{\Sigma} = \boldsymbol{G} \boldsymbol{D} \boldsymbol{G}^{t}$, where $\boldsymbol{G}$ is the matrix of normalized eigenvectors of $\boldsymbol{\Sigma}$, and $\boldsymbol{D}$ is the corresponding matrix of eigenvalues. Now matrix $\boldsymbol{G} \boldsymbol{D}^{1/2} \boldsymbol{Y}$ has covariance matrix $\boldsymbol{G} \boldsymbol{D}^{1/2} Cov(\boldsymbol{Y}) \boldsymbol{D}^{1/2} \boldsymbol{G}^{t} = \boldsymbol{G} \boldsymbol{D} \boldsymbol{G}^{t} = \boldsymbol{\Sigma}$, as $Cov(\boldsymbol{Y}) = \boldsymbol{I}$ once $\boldsymbol{Y}$ has been rescaled accordingly (orthonormal columns alone give a covariance matrix proportional to $\boldsymbol{I}$).

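eig    <- eigen(Sigma)
A      <- eig$vectors %*% sqrt(diag(eig$values))
# Y has orthonormal columns, so cov(Y) = I / (N - 1); rescale by sqrt(N - 1)
# so that cov(XX1) matches Sigma under R's (N - 1) convention
XX1ctr <- t(A %*% t(Y)) * sqrt(nrow(Y) - 1)
XX1    <- sweep(XX1ctr, 2, mu, "+")         # move centroid to mu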

Check that the correlation matrix is really $\boldsymbol{R}$.

> all.equal(cor(XX1), R)
[1] TRUE
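As an alternative to the Gram-Schmidt route, the same goal (a sample whose covariance matrix is exactly a given $\boldsymbol{\Sigma}$) can be sketched with two Cholesky factors: one to whiten the centered data, one to colour it. This is a different technique from the one used above, shown only for comparison; the seed and the use of `rnorm()` instead of `rmvnorm()` are arbitrary choices made to keep the block free of extra packages.

```r
set.seed(1)                                    # arbitrary seed, for reproducibility only

# Same R, variances and centroid as in the example above.
R     <- matrix(c(1.0, 0.5, 0.3,
                  0.5, 1.0, 0.4,
                  0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)
vars  <- c(4, 16, 9)
Sigma <- diag(sqrt(vars)) %*% R %*% diag(sqrt(vars))
mu    <- c(1, 2, 3)
N     <- 100

Z  <- matrix(rnorm(N * 3), nrow = N)           # any full-rank starting data
Zc <- scale(Z, center = TRUE, scale = FALSE)   # center the columns

# Whiten: chol() returns upper-triangular U with t(U) %*% U = cov(Zc),
# so Zc %*% solve(U) has sample covariance exactly I.
W <- Zc %*% solve(chol(cov(Zc)))

# Colour: with V = chol(Sigma), W %*% V has sample covariance t(V) %*% V = Sigma.
X2 <- sweep(W %*% chol(Sigma), 2, mu, "+")     # then move the centroid to mu

all.equal(cov(X2), Sigma)
all.equal(cor(X2), R)
```

Both checks compare against R's `cov()` and `cor()`, which use the $N - 1$ denominator.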

For other purposes, the question might now be: how do I find a positive definite matrix that is "very similar" to a pre-specified one that is not positive definite? That I don't know.
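One option that may help here is `nearPD()` from the Matrix package (a recommended package that ships with R), which searches for a nearby positive definite matrix. The sketch below applies it to the 3×3 example with `y <- -0.6`; the variable names are made up, and how "similar" the result is depends on the norm the function uses, so treat it as a starting point rather than a definitive answer.

```r
library(Matrix)  # provides nearPD()

# The 3x3 example from above with y = -0.6, which is not positive definite.
R_bad <- matrix(c( 1.0,  0.5, -0.6,
                   0.5,  1.0,  0.4,
                  -0.6,  0.4,  1.0), nrow = 3, byrow = TRUE)
min(eigen(R_bad, only.values = TRUE)$values)   # negative

# corr = TRUE asks for a result with unit diagonal, i.e. a correlation matrix.
fit    <- nearPD(R_bad, corr = TRUE)
R_near <- as.matrix(fit$mat)

min(eigen(R_near, only.values = TRUE)$values)  # no longer negative (up to tolerance)
max(abs(R_near - R_bad))                       # largest change to any single entry
```

Inspecting `R_near - R_bad` shows which of the specified correlations had to give way, which is worth checking against what the application can tolerate.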

Edit: corrected some square roots

Solved – Generate normally distributed random numbers with non positive-definite covariance matrix

Solution Method A:

If C is not symmetric, then symmetrize it: $D \leftarrow 0.5 (C + C^T)$.
Add a multiple of the identity matrix to the symmetrized matrix, sufficient to make it positive definite with whatever margin $m$ is desired, i.e., such that the smallest eigenvalue of the new matrix equals $m$. Specifically, $D \leftarrow D + (m - \lambda_{\min}(D)) I$, where $I$ is the identity matrix and $\lambda_{\min}(D)$ is the smallest eigenvalue of $D$. $D$ then contains the desired positive definite covariance matrix.

In MATLAB, the code would be

D = 0.5 * (C + C');                          % symmetrize
D = D + (m - min(eig(D))) * eye(size(D));    % shift so the smallest eigenvalue is m
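For readers working in R, Method A might be sketched as follows. The function name `method_a`, the default margin, and the test matrix `C` (the 3×3 case with `y <- -0.6` from the related answer above) are illustrative choices, not part of the original answer.

```r
# Method A: symmetrize, then shift the diagonal so the smallest eigenvalue equals m.
# `method_a` and its default margin are hypothetical, chosen for illustration.
method_a <- function(C, m = 1e-6) {
  D <- 0.5 * (C + t(C))
  lambda_min <- min(eigen(D, symmetric = TRUE, only.values = TRUE)$values)
  D + (m - lambda_min) * diag(nrow(D))
}

C <- matrix(c( 1.0,  0.5, -0.6,
               0.5,  1.0,  0.4,
              -0.6,  0.4,  1.0), nrow = 3, byrow = TRUE)

D <- method_a(C, m = 0.01)
min(eigen(D, only.values = TRUE)$values)  # equals m up to rounding
diag(D)                                   # no longer 1: the diagonal was inflated
cov2cor(D)                                # rescale to unit diagonal if a correlation matrix is needed
```

Note that `cov2cor()` restores the unit diagonal by shrinking every off-diagonal entry by the same factor, so the result stays positive definite but the correlations are no longer the ones originally specified.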

Solution Method B: Formulate and solve a convex SDP (semidefinite program) to find the matrix D nearest to C, in the Frobenius norm of their difference, such that D is positive definite with a specified minimum eigenvalue m.

Using CVX under MATLAB, the code would be:

n = size(C,1);
cvx_begin
    variable D(n,n)
    minimize(norm(D - C, 'fro'))
    D - m * eye(n) == semidefinite(n)
cvx_end
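Without an SDP solver, the same optimization problem has a closed-form solution: symmetrize C, take its eigendecomposition, and raise every eigenvalue below m up to m. The R sketch below uses that eigenvalue clipping in place of the CVX solve; `method_b_clip` and the test matrix are made-up names and data for illustration.

```r
# Frobenius-nearest symmetric matrix with all eigenvalues >= m, obtained by
# clipping the eigenvalues of the symmetrized input. This replaces the CVX
# solve with a closed form; `method_b_clip` is a hypothetical helper name.
method_b_clip <- function(C, m = 1e-6) {
  S <- 0.5 * (C + t(C))
  e <- eigen(S, symmetric = TRUE)
  # V %*% diag(clipped eigenvalues) %*% t(V), written without forming diag()
  D <- e$vectors %*% (pmax(e$values, m) * t(e$vectors))
  0.5 * (D + t(D))                      # remove rounding asymmetry
}

C <- matrix(c( 1.0,  0.5, -0.6,
               0.5,  1.0,  0.4,
              -0.6,  0.4,  1.0), nrow = 3, byrow = TRUE)

D_b <- method_b_clip(C, m = 0.01)
min(eigen(D_b, only.values = TRUE)$values)  # equals m up to rounding
norm(D_b - C, type = "F")                   # Frobenius distance to the original
D_b - C                                     # both diagonal and off-diagonal entries move
```

As with the CVX formulation, nothing here forces the diagonal to stay at 1, so a correlation matrix would need an additional constraint or a rescaling step afterwards.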

Comparison of Solution Methods: Apart from symmetrizing the initial matrix, solution method A adjusts (increases) only the diagonal elements, all by the same amount, and leaves the off-diagonal elements unchanged. Solution method B finds the positive definite matrix D with the specified minimum eigenvalue that is nearest to the original matrix C, in the sense of the minimum Frobenius norm of the difference D - C. That norm is based on the sum of squared differences over all elements of D - C, including the off-diagonal elements. So by adjusting off-diagonal elements, method B may reduce the amount by which diagonal elements need to be increased, and diagonal elements are not necessarily all increased by the same amount.
