Given a symmetric, positive definite, covariance matrix $S$,
how can I calculate a matrix $C$ such that
$CSC^T$ is the Identity matrix $I$?
To follow up on @cardinal's comment: your $x$, $y$, and $z$ define a $(3 \times 3)$ correlation matrix $R$. Since a correlation matrix also is a possible covariance matrix (of standardized variables), it has to be positive definite. This is the case if all eigenvalues are $> 0$. If $R$ is indeed positive definite, then all vectors $\boldsymbol{s}$ of variances (i.e., numbers $> 0$) will turn $\boldsymbol{R}$ into a positive definite covariance matrix $\boldsymbol{\Sigma} = \boldsymbol{D}_{s}^{1/2} \boldsymbol{R} \boldsymbol{D}_{s}^{1/2}$, where $\boldsymbol{D}_{s}^{1/2}$ is the square root of the diagonal matrix made from $\boldsymbol{s}$.
So just construct $R$ from $x, y, z$, and check if the eigenvalues are all $> 0$. If so, you're good, and you can transform any set of data to have a corresponding covariance matrix with arbitrary variances:
x <- 0.5
y <- 0.3 # changing this to -0.6 makes it not pos.def.
z <- 0.4
R <- matrix(numeric(3*3), nrow=3) # will be the correlation matrix
diag(R) <- 1 # set diagonal to 1
R[upper.tri(R)] <- c(x, y, z) # fill in x, y, z to upper right
R[lower.tri(R)] <- c(x, y, z) # fill in x, y, z to lower left
eigen(R)$values # get eigenvalues to check if pos.def.
gives
[1] 1.8055810 0.7124457 0.4819732
So our $\boldsymbol{R}$ here is positive definite. Now construct the corresponding covariance matrix from arbitrary variances.
vars <- c(4, 16, 9) # the variances
Sigma <- diag(sqrt(vars)) %*% R %*% diag(sqrt(vars))
Generate some data matrix $\boldsymbol{X}$ that we will transform to later have exactly that covariance matrix.
library(mvtnorm) # for rmvnorm()
N <- 100 # number of simulated observations
mu <- c(1, 2, 3) # some arbitrary centroid
X <- round(rmvnorm(n=N, mean=mu, sigma=Sigma))
To do that, we first orthonormalize matrix $\boldsymbol{X}$, giving matrix $\boldsymbol{Y}$ with covariance matrix $\boldsymbol{I}$ (identity).
orthGS <- function(X) {                    # Gram-Schmidt orthogonalization
    Id <- diag(nrow(X))
    for(i in 2:ncol(X)) {
        A <- X[ , 1:(i-1), drop=FALSE]     # columns already processed
        Q <- qr.Q(qr(A))                   # orthonormal basis for their span
        P <- tcrossprod(Q)                 # projection onto that span
        X[ , i] <- (Id-P) %*% X[ , i]      # remove projection from column i
    }
    scale(X, center=FALSE, scale=sqrt(colSums(X^2))) # normalize columns
}
Xctr <- scale(X, center=TRUE, scale=FALSE) # centered version of X
Y <- orthGS(Xctr) # Y is orthonormal
Transform matrix $\boldsymbol{Y}$ to have covariance matrix $\boldsymbol{\Sigma}$ and centroid $\boldsymbol{\mu}$.
Edit: here is what is going on. Do a spectral decomposition $\boldsymbol{\Sigma} = \boldsymbol{G} \boldsymbol{D} \boldsymbol{G}^{t}$, where $\boldsymbol{G}$ is the matrix of normalized eigenvectors of $\boldsymbol{\Sigma}$, and $\boldsymbol{D}$ is the corresponding diagonal matrix of eigenvalues. Now the matrix $\boldsymbol{G} \boldsymbol{D}^{1/2} \boldsymbol{Y}$ has covariance matrix $\boldsymbol{G} \boldsymbol{D}^{1/2} Cov(\boldsymbol{Y}) \boldsymbol{D}^{1/2} \boldsymbol{G}^{t} = \boldsymbol{G} \boldsymbol{D} \boldsymbol{G}^{t} = \boldsymbol{\Sigma}$, as $Cov(\boldsymbol{Y}) = \boldsymbol{I}$.
eig <- eigen(Sigma)
A <- eig$vectors %*% sqrt(diag(eig$values))
XX1ctr <- t(A %*% t(Y)) * sqrt(nrow(Y)-1) # N-1 matches R's cov() denominator
XX1 <- sweep(XX1ctr, 2, mu, "+") # move centroid to mu
Check that the correlation matrix is really $\boldsymbol{R}$.
> all.equal(cor(XX1), R)
[1] TRUE
For other purposes, the question might now be: How do I find a positive definite matrix that is "very similar" to a pre-specified one that is not positive definite. That I don't know.
Edit: corrected some square roots
For a sample of vectors $x_i=(x_{i1},\dots,x_{ik})^\top$, with $i=1,\dots,n$, the sample mean vector is $$ \bar{x}=\frac{1}{n} \sum_{i=1}^n x_i \, , $$ and the sample covariance matrix is $$ Q = \frac{1}{n} \sum_{i=1}^n (x_i-\bar{x})(x_i-\bar{x})^\top \, . $$ For a nonzero vector $y\in\mathbb{R}^k$, we have $$ y^\top Qy = y^\top\left(\frac{1}{n} \sum_{i=1}^n (x_i-\bar{x})(x_i-\bar{x})^\top\right) y $$ $$ = \frac{1}{n} \sum_{i=1}^n y^\top (x_i-\bar{x})(x_i-\bar{x})^\top y $$ $$ = \frac{1}{n} \sum_{i=1}^n \left( (x_i-\bar{x})^\top y \right)^2 \geq 0 \, . \quad (*) $$ Therefore, $Q$ is always positive semi-definite.
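The inequality $(*)$ can also be checked numerically in R; the small simulated data set below is purely illustrative:

```r
## Numeric check of (*): y'Qy >= 0 for a sample covariance matrix
set.seed(1)
X <- matrix(rnorm(5*3), nrow=5)          # n = 5 observations, k = 3
Q <- cov(X) * (nrow(X)-1)/nrow(X)        # covariance with 1/n denominator
y <- rnorm(3)                            # an arbitrary nonzero vector
drop(t(y) %*% Q %*% y) >= 0              # TRUE
```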
The additional condition for $Q$ to be positive definite was given in whuber's comment below. It goes as follows.
Define $z_i=(x_i-\bar{x})$, for $i=1,\dots,n$. For any nonzero $y\in\mathbb{R}^k$, $(*)$ is zero if and only if $z_i^\top y=0$, for each $i=1,\dots,n$. Suppose the set $\{z_1,\dots,z_n\}$ spans $\mathbb{R}^k$. Then, there are real numbers $\alpha_1,\dots,\alpha_n$ such that $y=\alpha_1 z_1 +\dots+\alpha_n z_n$. But then we have $y^\top y=\alpha_1 z_1^\top y + \dots +\alpha_n z_n^\top y=0$, yielding that $y=0$, a contradiction. Hence, if the $z_i$'s span $\mathbb{R}^k$, then $Q$ is positive definite. This condition is equivalent to $\mathrm{rank} [z_1 \dots z_n] = k$.
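The rank condition is easy to verify in R as well (again with illustrative simulated data):

```r
## Q is positive definite iff the centered observations span R^k,
## i.e., iff rank[z_1 ... z_n] = k
set.seed(2)
X <- matrix(rnorm(10*3), nrow=10)        # n = 10 observations, k = 3
Z <- scale(X, center=TRUE, scale=FALSE)  # rows are the z_i'
qr(Z)$rank == ncol(X)                    # TRUE: Q is positive definite
```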
Best Answer
Since I haven't found a duplicate and nobody else has pointed one out, I have expanded my comment into an answer and added an example to show you how it works.
Any invertible matrix $A$ for which $AA^T = S$ will work, by taking $C = A^{-1}$.
So, for example, the Cholesky decomposition is one common way to do it: it's a simple algorithm for finding a lower triangular $A$ (or equivalently, in many implementations, an upper triangular $A^T$). That is, it finds $S = LL^T$, whence $L^{-1}S(L^{-1})^T = I$.
Example:
Consider the matrix $$S=\left(\begin{matrix} 4 & 1 \\ 1 & 2.5 \end{matrix}\right) $$
We write $S = LL^T$, which implies that
$$s_{11} = l_{11}^2$$
That is, $l_{11}=2$ (we'll take positive square roots each time to make our diagonals on $L$ positive).
It also implies that $s_{21} = l_{21}l_{11}$, so $l_{21} = s_{21}/l_{11}$, i.e. $l_{21} = 1/2$. Then we see that $s_{22} = l_{21}^2+l_{22}^2$, so $l_{22}^2 = s_{22} - l_{21}^2 = 2.5 - 0.5^2 = 2.25$, in which case $l_{22} = 1.5$, so
$$L=\left(\begin{matrix} 2 & 0 \\ 0.5 & 1.5 \end{matrix}\right) $$
You can confirm that
$$C = L^{-1}=\left(\begin{matrix} \frac{1}{2} & 0 \\ -\frac{1}{6} & \frac{2}{3} \end{matrix}\right) $$
and that $CSC^T = I$.
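As a sanity check, the hand computation above can be reproduced with R's built-in chol(), which returns the upper triangular factor, so we transpose to obtain $L$:

```r
S <- matrix(c(4, 1, 1, 2.5), nrow=2)
L <- t(chol(S))                          # chol() gives U with S = U'U; L = U'
C <- solve(L)                            # C = L^{-1}
all.equal(C %*% S %*% t(C), diag(2))     # TRUE: CSC' = I
```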
In practice, however, we don't invert $L$ but instead solve a triangular linear system, which is more numerically stable.
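A minimal sketch of that idea: rather than forming $L^{-1}$, whiten a vector by solving the triangular system $Lw = x$ with forwardsolve() (the observation x below is illustrative):

```r
S <- matrix(c(4, 1, 1, 2.5), nrow=2)
L <- t(chol(S))                          # lower triangular, S = LL'
x <- c(1, 2)                             # one observation (illustrative)
w <- forwardsolve(L, x)                  # solves L w = x; no explicit inverse
all.equal(w, drop(solve(L) %*% x))       # TRUE: same result
```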
Another possibility is to use Singular Value Decomposition to find a matrix to do the diagonalizing, and then scale by the singular values.
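For a symmetric positive definite $S$, the SVD coincides with the spectral decomposition $S = UDU^T$, so $C = D^{-1/2}U^T$ does the job; a short sketch:

```r
S  <- matrix(c(4, 1, 1, 2.5), nrow=2)
sv <- svd(S)                             # for symmetric pd S: S = U D U'
C  <- diag(1/sqrt(sv$d)) %*% t(sv$u)     # C = D^{-1/2} U'
all.equal(C %*% S %*% t(C), diag(2))     # TRUE
```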
But a number of other choices are available. (As whuber points out, a very large number.)
In practice, frequently we know $S$ is of the form $X^TX$ for some $X$. In that case, it's usually better to work on $X$ than $S$, via the QR decomposition.
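A sketch of the QR route, with an illustrative data matrix: if $X = QR$ then $S = X^TX = R^TR$, so $C = (R^T)^{-1}$ whitens without ever forming $S$ explicitly. (Here $R$ denotes the triangular QR factor, not a correlation matrix.)

```r
set.seed(3)
X  <- matrix(rnorm(20), nrow=10)         # 10 x 2 data matrix (illustrative)
S  <- crossprod(X)                       # S = X'X
Rf <- qr.R(qr(X))                        # upper triangular factor of X = QR
C  <- solve(t(Rf))                       # C = (R')^{-1}
all.equal(C %*% S %*% t(C), diag(2))     # TRUE
```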