Generate two random correlation matrices which share equal correlations

correlation matrix, covariance-matrix, linear algebra, matrix, random-generation

I want to simulate a data set under two conditions, e.g. control and disease. The two conditions should share mostly the same correlations, except that some should differ to simulate a "signal" between them.

So I'm trying to simulate two correlation matrices (i.e. symmetric, positive semidefinite, with ones on the diagonal) that share some correlations. For example, the first two rows (and, by symmetry, the first two columns) should be identical in both matrices, while the rest is random/shuffled, like this:

[Figure: example of the two correlation matrices, A and B]

Note how A and B have the same correlation values in the first two rows and columns, while the rest differs.

Is this possible? I can generate a random correlation matrix using the last method described here, but I can't figure out how to change/shuffle a subset of the matrix and still keep it positive semidefinite. I tried shuffling the elements of the rows (except the diagonal values) and then making the matrix symmetric again with

cor.mat[lower.tri(cor.mat)] <- t(cor.mat)[lower.tri(cor.mat)]

but then the matrix isn't positive semidefinite anymore.
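
To make the problem concrete, here is a minimal sketch of that shuffling attempt. The helper `is_valid_cor`, the dimension `p = 6`, and building the starting matrix with `cor()` on random data are assumptions for illustration only, not the method from the link above.

```r
# Hypothetical helper: symmetric, unit diagonal, and no eigenvalue below -tol
is_valid_cor <- function(m, tol = 1e-8) {
  isSymmetric(m) &&
    all(abs(diag(m) - 1) < tol) &&
    min(eigen(m, symmetric = TRUE, only.values = TRUE)$values) > -tol
}

set.seed(1)
p <- 6

# Assumed starting point: the sample correlation of random data is a valid
# correlation matrix
cor.mat <- cor(matrix(rnorm(p * 50), ncol = p))
is_valid_cor(cor.mat)

# Shuffle the upper-triangle entries that lie outside the first two rows
shuffled <- cor.mat
idx <- which(upper.tri(shuffled) & row(shuffled) > 2)
shuffled[idx] <- sample(shuffled[idx])

# Mirror the upper triangle into the lower triangle to restore symmetry
shuffled[lower.tri(shuffled)] <- t(shuffled)[lower.tri(shuffled)]

# Smallest eigenvalue; a negative value means the matrix is no longer
# positive semidefinite
min(eigen(shuffled, symmetric = TRUE, only.values = TRUE)$values)
```

Whether a particular shuffle breaks positive semidefiniteness depends on the draw, so the last line only reports the smallest eigenvalue rather than assuming a result.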

Any suggestions would be greatly appreciated!

EDIT:

I found this post, where the question was how to complete a partial matrix so that it becomes positive definite. Could this be used in my case? E.g. generate one matrix, delete the rows I want to randomize, and fill them up with random values?

Unfortunately I'm no mathematician, I don't really understand the explanations, and I can't translate the answers into R code. Could someone help with this?
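
For reference, one standard way to fill in the remaining block while keeping the shared rows exactly equal uses the Schur complement. This is a sketch under my own assumptions and not necessarily what the linked post describes: if the shared block `A[keep, keep]` is positive definite, the completed matrix is positive definite whenever the new lower-right block minus `A[rest, keep] %*% solve(A[keep, keep]) %*% A[keep, rest]` is positive definite. The names `keep`, `rest`, `M`, `d` and `R`, and the use of `cor()` on random data as the source of random correlation matrices, are illustrative choices.

```r
set.seed(7)
p <- 6
k <- 2                      # number of leading rows/columns shared by A and B
keep <- 1:k
rest <- (k + 1):p

# Assumed starting matrix A: sample correlation of random data
A <- cor(matrix(rnorm(p * 50), ncol = p))

# Part of the lower-right block that is already implied by the shared rows
M <- A[rest, keep] %*% solve(A[keep, keep]) %*% A[keep, rest]

# Variance left over on the diagonal; positive whenever A is positive definite
d <- 1 - diag(M)

# Any correlation matrix can drive the "free" part; here a fresh random one
R <- cor(matrix(rnorm(length(rest) * 50), ncol = length(rest)))

# New lower-right block: implied part plus a rescaled random part,
# so that the diagonal is again exactly diag(M) + d = 1
B22 <- M + R * outer(sqrt(d), sqrt(d))
B22 <- (B22 + t(B22)) / 2   # remove floating-point asymmetry

B <- A
B[rest, rest] <- B22

# B shares the first k rows and columns with A and stays positive definite
min(eigen(B, symmetric = TRUE, only.values = TRUE)$values)
```

The trade-off is that the entries of the new block are not free: they are the implied part `M` plus a scaled random correlation, so they cannot be set to arbitrary values.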

Best Answer

In case anyone also has this problem, I think I found a solution that works for me:

I simulate one correlation matrix, duplicate it, and then randomize some values in the duplicate, which gives the second "condition" matrix. To make this new matrix positive semidefinite as well, I use the R function nearPD() from the Matrix package. nearPD() computes the nearest positive definite matrix to its input (with corr = TRUE the result is constrained to be a correlation matrix). This doesn't give me exactly what I wanted: because the projection can change every entry of the new matrix, including the ones I left untouched, the shared values no longer match the original matrix exactly. But it's close enough to work in my case.
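
A minimal sketch of this approach follows. The dimension `p = 6`, the use of `cor()` on random data to get the first matrix, and drawing the replacement values with `runif()` are assumptions for illustration; `nearPD()` and its `corr` argument are from the Matrix package.

```r
library(Matrix)

set.seed(42)
p <- 6

# First condition: an assumed starting correlation matrix
A <- cor(matrix(rnorm(p * 50), ncol = p))

# Second condition: copy A and overwrite everything outside the first
# two rows/columns with new random values, then restore symmetry
B_raw <- A
idx <- which(upper.tri(B_raw) & row(B_raw) > 2)
B_raw[idx] <- runif(length(idx), min = -0.9, max = 0.9)
B_raw[lower.tri(B_raw)] <- t(B_raw)[lower.tri(B_raw)]

# Project onto the nearest positive definite correlation matrix
fit <- nearPD(B_raw, corr = TRUE)
B <- as.matrix(fit$mat)

# Smallest eigenvalue of the repaired matrix
min(eigen(B, symmetric = TRUE, only.values = TRUE)$values)

# How far the "shared" rows drifted away from A during the repair
max(abs(B[1:2, ] - A[1:2, ]))
```

The last line shows how much the supposedly shared entries were changed. If `B_raw` already happens to be positive definite, the drift is essentially zero; otherwise it grows with how far `B_raw` is from a valid correlation matrix.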