Solved – Covariance matrix decomposition and coregionalization

Tags: constraint, covariance-matrix, eigenvalues, matrix-decomposition

The original question (which can still be seen at the bottom of this post) was replaced by the first edit below.

EDIT I

I give more details about my problem.

First of all, suppose we have $K$ vectors $\boldsymbol{\omega}_k = \{\omega_{i,k}\}_{i=1}^n$, $k=1,\dots ,K$, each of length $n$ and distributed as a multivariate normal:
$$
\boldsymbol{\omega}_k \sim N_n(\mathbf{0}_n, \mathbf{C}_k), k=1,\dots ,K.
$$
Now suppose that I want to introduce dependence between the $\boldsymbol{\omega}_k$'s by assuming that
$$
\boldsymbol{\omega}_{i,\cdot}=(\omega_{i,1},\omega_{i,2},\dots , \omega_{i,K})' \sim N_K(\mathbf{0}_K, \boldsymbol{\Sigma}), i=1,\dots , n
$$
Notice that these $\omega$'s all share the same index $i$.
Now we can ask what the covariance between $\boldsymbol{\omega}_{i,\cdot}$ and $\boldsymbol{\omega}_{j,\cdot}$ is, with $i\neq j$.

There is no unique answer. We have to define a matrix $\mathbf{A}$ such that $\boldsymbol{\Sigma}=\mathbf{A}\mathbf{A}'$, and then form the matrices
$$
\mathbf{T}_k = [\mathbf{A}]_{\cdot,k} [\mathbf{A}]_{\cdot,k}', k=1,\dots ,K
$$
where $[\mathbf{A}]_{\cdot,k}$ is the $k^{th}$ column of $\mathbf{A}$.
Then the covariance between $\boldsymbol{\omega}_{i,\cdot}$ and $\boldsymbol{\omega}_{j,\cdot}$ is given by
$$
\sum_{k=1}^K \mathbf{T}_k [\mathbf{C}_k]_{ij}
$$
As you can see, the choice of $\mathbf{A}$ influences the covariance structure of my problem, and different $\mathbf{A}$'s give different models.
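Here is a minimal NumPy sketch of this construction (the variable names and the randomly generated $\mathbf{C}_k$ and $\boldsymbol{\Sigma}$ are mine, purely for illustration): it picks one possible factor $\mathbf{A}$ of $\boldsymbol{\Sigma}$, builds the $\mathbf{T}_k$, and evaluates the cross-covariance $\sum_{k=1}^K \mathbf{T}_k [\mathbf{C}_k]_{ij}$.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n = 3, 5

# K covariance matrices C_k for the individual omega_k's (random SPD matrices here).
C = [np.cov(rng.standard_normal((n, 2 * n))) for _ in range(K)]

# Cross-covariance matrix Sigma of omega_{i,.} and one possible factor A with Sigma = A A'.
Sigma = np.cov(rng.standard_normal((K, 2 * K)))
A = np.linalg.cholesky(Sigma)          # one choice of A; others are possible

# T_k = [A]_{.,k} [A]_{.,k}'  (outer product of the k-th column of A).
T = [np.outer(A[:, k], A[:, k]) for k in range(K)]
assert np.allclose(sum(T), Sigma)      # the T_k sum back to Sigma = A A'

def cross_cov(i, j):
    """Cov(omega_{i,.}, omega_{j,.}) = sum_k T_k [C_k]_{ij} (a K x K matrix)."""
    return sum(T[k] * C[k][i, j] for k in range(K))

print(cross_cov(0, 1))                 # covariance block for i=0, j=1
```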

For example, we can take $\mathbf{A}$ to be the Cholesky factor of $\boldsymbol{\Sigma}$, but then it is easy to see that, marginally, the covariance matrix of $\boldsymbol{\omega}_1$ depends only on $\mathbf{C}_1$, that of $\boldsymbol{\omega}_2$ depends on $\mathbf{C}_1$ and $\mathbf{C}_2$, and so on.
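To see why, take the $(m,m)$ entry of $\sum_k \mathbf{T}_k [\mathbf{C}_k]_{ij}$: it equals $\sum_k a_{m,k}^2 [\mathbf{C}_k]_{ij}$, so the marginal covariance matrix of $\boldsymbol{\omega}_m$ is $\sum_k a_{m,k}^2 \mathbf{C}_k$. A short sketch (again with made-up matrices) reads off these weights; with a lower-triangular $\mathbf{A}$ the first margin loads only on $\mathbf{C}_1$.

```python
import numpy as np

rng = np.random.default_rng(1)
K, n = 3, 4
C = [np.cov(rng.standard_normal((n, 2 * n))) for _ in range(K)]
Sigma = np.cov(rng.standard_normal((K, 2 * K)))
A = np.linalg.cholesky(Sigma)          # lower-triangular factor of Sigma

# Marginal covariance of omega_m implied by the coregionalization above:
#   Cov(omega_m) = sum_k a_{m,k}^2 C_k
for m in range(K):
    weights = A[m, :] ** 2
    print(f"omega_{m+1} marginal uses C_1..C_K with weights {np.round(weights, 3)}")
# With A lower triangular: omega_1 loads only on C_1, omega_2 on C_1 and C_2, etc.
```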

What I'm looking for is an $\mathbf{A}$ that gives marginal distributions that do not depend on the ordering of the $\boldsymbol{\omega}$'s, i.e. such that the marginal density of $\boldsymbol{\omega}_1$ can potentially equal that of $\boldsymbol{\omega}_K$; this is not the case for the Cholesky factorization or the spectral decomposition.

Old Question

Suppose we have a covariance matrix $\boldsymbol{\Sigma}$. The spectral decomposition of a positive definite matrix tells us that we can write
$$
\boldsymbol{\Sigma} = \boldsymbol{\Psi}\boldsymbol{\Lambda}\boldsymbol{\Lambda}\boldsymbol{\Psi}'
$$
where the columns of $\boldsymbol{\Psi}$ are the normalized eigenvectors and $\boldsymbol{\Lambda}$ is a diagonal matrix whose $i^{th}$ element is the square root of the eigenvalue associated with the $i^{th}$ normalized eigenvector.

What I am interested in is the matrix
$$
\mathbf{A} = \boldsymbol{\Psi}\boldsymbol{\Lambda}
$$
where of course $\boldsymbol{\Sigma} =\mathbf{A}\mathbf{A}'$, assuming that the diagonal elements of $\boldsymbol{\Lambda}$ are placed in ascending order. It seems obvious to me that the space of $\mathbf{A}$ is constrained by the fact that I am imposing an ascending order on the eigenvalues, but what precisely are the constraints?
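For concreteness, here is a short NumPy sketch of this construction (with an arbitrary randomly generated $\boldsymbol{\Sigma}$); `np.linalg.eigh` returns the eigenvalues in ascending order, so the resulting $\mathbf{A} = \boldsymbol{\Psi}\boldsymbol{\Lambda}$ already satisfies the ordering assumed above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
Sigma = np.cov(rng.standard_normal((n, 2 * n)))    # a positive definite covariance matrix

# eigh returns eigenvalues in ascending order for a symmetric matrix.
eigvals, Psi = np.linalg.eigh(Sigma)
Lam = np.diag(np.sqrt(eigvals))                    # Lambda: square roots of the eigenvalues

A = Psi @ Lam
assert np.allclose(A @ A.T, Sigma)                 # Sigma = A A' = Psi Lambda Lambda Psi'
assert np.allclose(Psi.T @ Psi, np.eye(n))         # eigenvectors are orthonormal
```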

Best Answer

A covariance matrix has ${n \choose 2} + n = \frac{n(n+1)}{2}$ free elements. The constraints for the spectral decomposition are:

  1. The eigenvalues are positive
  2. The eigenvectors are orthogonal
  3. The eigenvectors are unit length.

I recently asked a question related to this. @amoeba had a good comment that helped visualize these constraints and showed why the number of free elements in $\Psi$ is ${n \choose 2}$ and the number of free elements in $\Lambda^2$ is $n$.
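A trivial numeric restatement of this counting (just the arithmetic, nothing more):

```python
from math import comb

n = 5
# A symmetric n x n covariance matrix: n diagonal entries plus C(n, 2) off-diagonal entries.
free_in_Sigma = comb(n, 2) + n
# Spectral decomposition: C(n, 2) free elements in Psi (after orthonormality), n in Lambda^2.
free_in_Psi, free_in_Lambda2 = comb(n, 2), n
assert free_in_Sigma == free_in_Psi + free_in_Lambda2 == n * (n + 1) // 2
```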

Regarding the ordering of the eigenvalues, that may or may not be important to you. The spectral decomposition is only unique up to re-orderings of your $\Lambda^2$ diagonal matrix. If your task is to estimate the parameters of this matrix, each of these re-orderings will have corresponding estimates. In this case the model will not be identifiable.

However, if you are just theorizing about the covariance matrix, people sometimes just restrict their attention to one ordering of the eigenvalues, without losing any generality. Proving something is true for one ordering will usually guarantee truth for the other orderings.
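A small sketch of this re-ordering invariance (random $\Sigma$, names of my own choosing): permuting the eigenvalues together with the corresponding columns of $\Psi$ reproduces the same $\Sigma$, which is exactly the non-identifiability discussed above.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
Sigma = np.cov(rng.standard_normal((n, 2 * n)))
eigvals, Psi = np.linalg.eigh(Sigma)

# Re-order the eigenvalues and the columns of Psi in the same way: Sigma is unchanged,
# so the spectral decomposition is only unique up to such re-orderings.
perm = rng.permutation(n)
Psi_p, eigvals_p = Psi[:, perm], eigvals[perm]
assert np.allclose(Psi_p @ np.diag(eigvals_p) @ Psi_p.T, Sigma)
```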

Edit to address comment:

1, 2 and 3 are constraints that every covariance matrix satisfies, so it is as "free" as possible. Sampling from some distribution of $\Sigma$ is possible as long as the distribution exists, but it is also common to restrict the columns of $\Psi$ further, which is the same as fixing the ordering of your eigenvalues. What I mean is that sometimes they will restrict it to have $\psi_{i,j} = 0$ when $j > i$ and $\psi_{j,j} = 1$. In other words, it will be lower triangular with $1$s on the diagonal. There are other ways to restrict this space, but it would look something like this: $$ \Psi = \left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ \psi_{2,1} & 1 & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \psi_{p,1} & \psi_{p,2} & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ \psi_{n,1} & \cdots & \psi_{n,p-1} & \psi_{n,p} \end{array} \right]. $$

Why is this representation unique? If you re-order the diagonal matrix $\Lambda^2$, then you have to re-order the columns of $\Psi$ accordingly, but then the columns of $\Psi$ no longer follow this pattern. Why do this? Well, now your posterior is not multi-modal.
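For the square, full-rank case, one concrete representation with exactly this pattern (my illustration of the idea, not necessarily the parameterization the answer has in mind) is an $LDL'$-style rescaling of the Cholesky factor, which is unique for a positive definite $\Sigma$ and so leaves no re-ordering freedom.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
Sigma = np.cov(rng.standard_normal((n, 2 * n)))

L = np.linalg.cholesky(Sigma)
d = np.diag(L)
Psi = L / d                      # unit lower triangular: ones on the diagonal, zeros above
Lambda2 = np.diag(d ** 2)        # diagonal "scale" part

assert np.allclose(Psi @ Lambda2 @ Psi.T, Sigma)
# For a positive definite Sigma this factorization is unique, so there is no
# permutation freedom left and hence no permutation-induced multi-modality.
```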
