To follow up on @cardinal's comment: your $x$, $y$, and $z$ define a $(3 \times 3)$ correlation matrix $R$. Since a correlation matrix is also a possible covariance matrix (of standardized variables), it has to be positive definite. This is the case exactly when all eigenvalues are $> 0$. If $R$ is indeed positive definite, then every vector $\boldsymbol{s}$ of variances (i.e., numbers $> 0$) will turn $\boldsymbol{R}$ into a positive definite covariance matrix $\boldsymbol{\Sigma} = \boldsymbol{D}_{s}^{1/2} \boldsymbol{R} \boldsymbol{D}_{s}^{1/2}$, where $\boldsymbol{D}_{s}^{1/2}$ is the square root of the diagonal matrix made from $\boldsymbol{s}$.
So just construct $R$ from $x, y, z$, and check if the eigenvalues are all $> 0$. If so, you're good, and you can transform any set of data to have a corresponding covariance matrix with arbitrary variances:
x <- 0.5
y <- 0.3 # changing this to -0.6 makes it not pos.def.
z <- 0.4
R <- matrix(numeric(3*3), nrow=3) # will be the correlation matrix
diag(R) <- 1 # set diagonal to 1
R[upper.tri(R)] <- c(x, y, z) # fill in x, y, z to upper right
R[lower.tri(R)] <- c(x, y, z) # fill in x, y, z to lower left
eigen(R)$values # get eigenvalues to check if pos.def.
gives
[1] 1.8055810 0.7124457 0.4819732
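As an aside, the comment on y above can be verified directly: replacing y by -0.6 produces a negative eigenvalue, so that R is no longer positive definite.
R2 <- R
R2[1, 3] <- R2[3, 1] <- -0.6   # y sits in positions [1,3] and [3,1]
eigen(R2)$values               # smallest eigenvalue is negative -> not pos.def.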
So our $\boldsymbol{R}$ here is positive definite. Now construct the corresponding covariance matrix from arbitrary variances.
vars <- c(4, 16, 9) # the variances
Sigma <- diag(sqrt(vars)) %*% R %*% diag(sqrt(vars))
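As a quick sanity check (this just restates the claim above), rescaling Sigma back to a correlation matrix recovers R, and positive definiteness is preserved:
all.equal(cov2cor(Sigma), R)   # cov2cor() converts a covariance to a correlation matrix
eigen(Sigma)$values            # all still > 0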
Generate some data matrix $\boldsymbol{X}$ that we will later transform to have exactly that covariance matrix.
library(mvtnorm) # for rmvnorm()
N <- 100 # number of simulated observations
mu <- c(1, 2, 3) # some arbitrary centroid
X <- round(rmvnorm(n=N, mean=mu, sigma=Sigma))
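Because of sampling variability (and the rounding above), the empirical covariance of this X is only approximately Sigma; that is exactly why the transformation below is needed.
round(cov(X), 2)   # close to Sigma, but not exactly equal
round(cor(X), 2)   # close to R, but not exactly equal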
To do that, we first orthonormalize matrix $\boldsymbol{X}$, giving matrix $\boldsymbol{Y}$ with covariance matrix $\boldsymbol{I}$ (identity).
orthGS <- function(X) {    # Gram-Schmidt orthogonalization via successive QR projections
    Id <- diag(nrow(X))
    for(i in 2:ncol(X)) {
        A <- X[ , 1:(i-1), drop=FALSE]      # columns already processed
        Q <- qr.Q(qr(A))
        P <- tcrossprod(Q)                  # projection onto their column space
        X[ , i] <- (Id - P) %*% X[ , i]     # remove that component from column i
    }
    scale(X, center=FALSE, scale=sqrt(colSums(X^2)))   # normalize columns to unit length
}
Xctr <- scale(X, center=TRUE, scale=FALSE) # centered version of X
Y <- orthGS(Xctr) # Y is orthonormal
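You can check that Y really has orthonormal, centered columns (so that its covariance matrix is proportional to the identity):
zapsmall(crossprod(Y))   # t(Y) %*% Y is (numerically) the identity matrix
round(colMeans(Y), 10)   # columns are still centered at 0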
Transform matrix $\boldsymbol{Y}$ to have covariance matrix $\boldsymbol{\Sigma}$ and centroid $\boldsymbol{\mu}$.
Edit: what's going on here. Do a spectral decomposition $\boldsymbol{\Sigma} = \boldsymbol{G} \boldsymbol{D} \boldsymbol{G}^{t}$, where $\boldsymbol{G}$ is the matrix of normalized eigenvectors of $\boldsymbol{\Sigma}$, and $\boldsymbol{D}$ is the corresponding diagonal matrix of eigenvalues. Now the matrix $\boldsymbol{G} \boldsymbol{D}^{1/2} \boldsymbol{Y}$ has covariance matrix $\boldsymbol{G} \boldsymbol{D}^{1/2} Cov(\boldsymbol{Y}) \boldsymbol{D}^{1/2} \boldsymbol{G}^{t} = \boldsymbol{G} \boldsymbol{D} \boldsymbol{G}^{t} = \boldsymbol{\Sigma}$, since $Cov(\boldsymbol{Y}) = \boldsymbol{I}$. (In the code, $\boldsymbol{Y}$ is an $N \times 3$ data matrix, so the transformation is applied as $\boldsymbol{Y} (\boldsymbol{G} \boldsymbol{D}^{1/2})^{t}$.)
eig <- eigen(Sigma)                           # spectral decomposition Sigma = G D G^t
A <- eig$vectors %*% sqrt(diag(eig$values))   # A = G D^(1/2)
XX1ctr <- t(A %*% t(Y)) * sqrt(nrow(Y))       # transformed, still centered data
XX1 <- sweep(XX1ctr, 2, mu, "+") # move centroid to mu
Check that the correlation matrix is really $\boldsymbol{R}$.
> all.equal(cor(XX1), R)
[1] TRUE
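A side note, assuming orthGS() returned exactly orthonormal (and centered) columns: with the scaling by sqrt(nrow(Y)) used above, the covariance matrix of XX1 matches Sigma only up to the factor N/(N-1); the correlation matrix is unaffected by this. A quick check:
all.equal(cov(XX1) * (N-1)/N, Sigma)   # TRUE: cov(XX1) = Sigma * N/(N-1)
# scaling Y by sqrt(nrow(Y) - 1) instead would make cov(XX1) equal Sigma exactly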
For other purposes, the question might now be: How do I find a positive definite matrix that is "very similar" to a pre-specified one that is not positive definite. That I don't know.
Edit: corrected some square roots
Solution Method A:
- If C is not symmetric, then symmetrize it: D <-- $0.5(C + C^T)$.
- Add a multiple of the identity matrix to the symmetrized matrix, sufficient to make it positive definite with whatever margin m is desired, i.e., such that the smallest eigenvalue of the new matrix equals m. Specifically, D <-- $D + (m - \min(\mathrm{eig}(D)))I$, where $I$ is the identity matrix. D then contains the desired positive definite covariance matrix.
In MATLAB, the code would be
D = 0.5 * (C + C');
D = D + (m - min(eig(D))) * eye(size(D));
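For readers who prefer R over MATLAB, here is a minimal sketch of Method A (the function name is mine, not from the original):
makePosDefA <- function(C, m) {
    D <- 0.5 * (C + t(C))                               # symmetrize
    lambdaMin <- min(eigen(D, symmetric = TRUE)$values) # smallest eigenvalue
    D + (m - lambdaMin) * diag(nrow(D))                 # shift diagonal so min eigenvalue = m
}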
Solution Method B:
Formulate and solve a convex SDP (semidefinite program) to find the matrix D nearest to C in the Frobenius norm of their difference, such that D is positive definite with the specified minimum eigenvalue m.
Using CVX under MATLAB, the code would be:
n = size(C,1);
cvx_begin
    variable D(n,n)
    minimize(norm(D - C, 'fro'))
    D - m*eye(n) == semidefinite(n)
cvx_end
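If CVX is not at hand, the R package Matrix provides nearPD(), which computes a nearest positive semidefinite matrix in the Frobenius norm (Higham's algorithm). It uses a relative eigenvalue tolerance rather than an absolute minimum eigenvalue m, so it is only roughly analogous to Method B:
library(Matrix)                                        # for nearPD()
D <- as.matrix(nearPD(C, ensureSymmetry = TRUE)$mat)   # nearest PSD matrix to C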
Comparison of Solution Methods: Apart from symmetrizing the initial matrix, solution method A adjusts (increases) only the diagonal elements, all by the same amount, and leaves the off-diagonal elements unchanged. Solution method B finds the positive definite matrix with the specified minimum eigenvalue that is nearest to the original matrix, in the sense of minimum Frobenius norm of the difference, i.e., it minimizes the sum of squared differences over all elements of D - C, including the off-diagonal elements. By also adjusting off-diagonal elements, it may reduce the amount by which the diagonal elements need to be increased, and the diagonal elements are not necessarily all increased by the same amount.
Best Answer
You have stumbled upon the fact that you cannot simply make a correlation matrix by assembling individually valid pairwise correlations. There are many questions on the site related to this; have a look at:
Non-transitivity of correlation: correlations between gender and brain size and between brain size and IQ, but no correlation between gender and IQ
Why does correlation matrix need to be positive semi-definite and what does it mean to be or not to be positive semi-definite?
It might help with an intuitive example: three persons are running along a straight road. It is impossible for every pair of them to run in opposite directions, as there are only two directions! So the three running velocities cannot all be mutually negatively correlated!
But this example has both negative and positive correlations, while yours has only positive ones. So let us extend it: let A, B, and C all run in the same direction. If A and B run at exactly the same speed, the correlations of C with each of them must be equal. And, extending by continuity, if A and B have very similar speeds, C's correlations with them cannot be very dissimilar.
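A quick numerical illustration of that continuity argument (the numbers are made up, not taken from the question): if A and B are almost perfectly correlated but their correlations with C differ a lot, the assembled matrix fails the eigenvalue test, even though all pairwise correlations are positive and individually valid.
Rbad <- matrix(c(1   , 0.99, 0.9,
                 0.99, 1   , 0.1,
                 0.9 , 0.1 , 1  ), nrow = 3, byrow = TRUE)
eigen(Rbad)$values   # smallest eigenvalue is negative -> not a valid correlation matrix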
So the matrix you have assembled is simply not a valid correlation matrix! Maybe you should rather ask about your real problem, and tell us where your correlations come from? In fact, you must make rather large modifications to your matrix to make it valid. Some code (R):
(Your code from the Q, assembled as a function to facilitate experimentation.) Then use the fact that a valid correlation matrix cannot have negative eigenvalues:
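The code from the question is not reproduced here, so the following is only a hypothetical sketch of such a helper (the name and argument names are mine): build the $3 \times 3$ matrix from the three pairwise correlations and inspect its eigenvalues.
checkCorMat <- function(ab, ac, bc) {   # hypothetical stand-in for the OP's function
    R <- matrix(c(1 , ab, ac,
                  ab, 1 , bc,
                  ac, bc, 1 ), nrow = 3, byrow = TRUE)
    eigen(R)$values                     # any negative value => not a valid correlation matrix
}
checkCorMat(0.9, 0.8, 0.1)              # example: smallest eigenvalue is negative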
But maybe smaller modifications to the smaller correlations are enough?