I have covariance matrix known to be
$$K = \sum_{i=1}^Nx_ix_i^T$$
where the dimension of $x$ is big (like $50000$) so I don't want to really compute any outer-product to expand it as a full matrix. Also, I know this covariance matrix is sparse
Since K is guaranteed to be positive-definite, there is a unique Cholesky decomposition :
$$K = L^TL$$
Two questions:
-
is there a way to update $L$ sequentially (update Cholesky factor after seeing each data point).
-
is there approximation of Cholesky that keeps $L$ sparse or low-rank to save memory ?
Best Answer
This reminds me of approximation for the covariance matrix in Gaussian Process, to keep it small, when we see a lot of data. If you want to find the decomposition, after seeing new data, the naive way is to just do it again. But one can keep a small subset of the data (the sparsity assumption), and use the remaining data to perform the matrix decomposition. See this recent work: http://www.comp.nus.edu.sg/~lowkh/pubs/uai2013.pdf