Here's a geometrical reformulation of the problem that yields an
$O(n \log n)$ solution for $m=2$ and suggests a context that
may yield good answers for arbitrary fixed $m$.
[EDIT but see below for $m=3$ and $m \geq 4$.]
Denote the $i$-th coordinate of $u_k^{\phantom.}$ and $v_k^{\phantom.}$ by
$u_k^{(i)}$ and $v_k^{(i)}$ respectively.
Consider the $n$ vectors in $V := {\bf R}^m$ given by
$$
x_i = (u_1^{(i)}, u_2^{(i)}, u_3^{(i)}, \ldots, u_m^{(i)})
$$
($i=1,2,\ldots,n$), and the $n$ dual vectors
$$
y_j = (v_1^{(j)}, v_2^{(j)}, v_3^{(j)}, \ldots, v_m^{(j)}) \in V^*.
$$
($j=1,2,\ldots,n$). Then the $(i,j)$ entry of $A$ is $y_j(x_i)$.
Thus the problem asks for the minimum or maximum of $y_j(x_i)$
as $i,j$ range independently over $\lbrace 1, 2, \dots, n \rbrace$.
Note that given $A$ there are many choices of $u_k$ and $v_k$, but
the choice is tantamount to a choice of basis on $V$ and of dual
basis on $V^*$, so geometrically our $y_j(x_i)$ problem depends only
on $A$.
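As a quick sanity check of this reformulation, assuming the factorization $A = \sum_{k=1}^m u_k^{\phantom .} v_k^{\mathsf T}$ with $u_k, v_k \in {\bf R}^n$ (i.e. $A_{ij} = \sum_k u_k^{(i)} v_k^{(j)}$), one can verify numerically that the $(i,j)$ entry of $A$ equals $y_j(x_i)$; a small sketch:

```python
import random

rng = random.Random(1)
m, n = 3, 5
# u_k, v_k in R^n for k = 1..m; A = sum_k u_k v_k^T is the rank-m matrix
u = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
v = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
A = [[sum(u[k][i] * v[k][j] for k in range(m)) for j in range(n)]
     for i in range(n)]
# x_i collects the i-th coordinates of the u_k; y_j those of the v_k
x = [[u[k][i] for k in range(m)] for i in range(n)]
y = [[v[k][j] for k in range(m)] for j in range(n)]
# now A[i][j] == <x_i, y_j> for all i, j
```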
Now it's clear that the minimizing and maximizing $x_i, y_j$
must be vertices of the convex hull of $\lbrace x_i \rbrace$ and
$\lbrace y_j \rbrace$ respectively. This recovers the known solution
for $m=1$, when any bounded convex subset of $V$ has (at most) two
vertices, which can be found in $O(n)$ comparisons.
For $m=2$, it is still known how to find the vertices of the convex hull
(in cyclic order) in $O(n \log n)$ steps [see for instance the
"Convex hull algorithms" Wikipedia page for references]. Once we know
the vertices of the convex hull of the $x_i$, we can for each $j$
find the minimal and maximal $y_j(x_i)$ in $O(\log n)$ steps by bisection,
making the overall computational cost still $O(n \log n)$.
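The $m=2$ pipeline can be sketched as follows (a sketch only: for brevity it scans the hull vertices of both point sets directly, which already beats the $n^2$ exhaustive search whenever the hulls are small, rather than implementing the $O(\log n)$ bisection per direction):

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive = left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain, O(n log n); returns hull vertices in ccw order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def extreme_entries(xs, ys):
    """Min and max of <x_i, y_j>: only hull vertices can be extremal,
    since the form is linear in each argument separately."""
    hx, hy = convex_hull(xs), convex_hull(ys)
    prods = [a[0] * b[0] + a[1] * b[1] for a in hx for b in hy]
    return min(prods), max(prods)
```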
Finding the convex hull and its structure for $m=3$, and larger fixed $m$,
is harder, but at least there's some literature on this problem, and
experts who can suggest good ways to proceed.
EDIT Alas the current state of the art for computing convex hulls
apparently limits this approach to $m=2$ and maybe $m=3$.
The Mathworld entry for
"Convex Hull"
reports on an $O(n \log n)$ algorithm also in dimension 3, referring to
Skiena, S. S. "Convex Hull." $\S$8.6.2 in The Algorithm Design Manual. New
York: Springer-Verlag, pp. 351–354, 1997.
But this still leaves open the question of computing in time $\tilde O(n)$
a data structure that will find the extreme point in each direction in
only $n^{o(1)}$ operations, or even in significantly less than
the number of vertices of the convex hull. Still, if at least
one of $\lbrace x_i \rbrace$ and $\lbrace y_j \rbrace$ has a convex hull
with significantly fewer than $n$ vertices then this will improve on the
exhaustive search over $n^2$ entries of the matrix.
Once $m \geq 4$ it seems that it is not even known how to compute
the set of vertices of the convex hull in as few as $n^2$ operations;
unless the number of operations can be brought significantly below $n^2$,
this means that the technique described here is limited to $m=2$ and maybe $m=3$.
All of this analysis assumes that we don't run into difficulties like
$m=1$, $u_1 = (-1, 2.54)$, $v_1 = (1, -0.3937)$, where there are
two or more very close candidates for the minimum (here $-1 \cdot 1 = -1$
and $2.54 \cdot (-0.3937) = -0.999998$). To deal with that,
we might assume that we can do exact arithmetic (perhaps the coordinates
are quantized with fixed denominator), or tolerate an error that can be
brought below $\epsilon$ in $O(\log(1/\epsilon))$ steps.
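For instance, on the example above, exact rational arithmetic (a sketch using Python's standard `fractions` module) separates the two nearby candidates unambiguously:

```python
from fractions import Fraction

# The m = 1 example: u_1 = (-1, 2.54), v_1 = (1, -0.3937),
# with coordinates quantized with fixed denominator.
u1 = [Fraction(-1), Fraction("2.54")]
v1 = [Fraction(1), Fraction("-0.3937")]
entries = [a * b for a in u1 for b in v1]   # the four products u_1^(i) v_1^(j)
smallest = min(entries)
# The candidate minima -1 and 2.54 * (-0.3937) = -0.999998 differ by
# only 2e-6, but exact arithmetic distinguishes them with no tolerance.
```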
First, a meta-comment. Numerical linear algebra is de facto a part of engineering or, more precisely, computer science: a practical science that studies how we compute things, dealing with our practical relationship to mathematical objects rather than with the objects themselves. Sites devoted to computer science, similar to this one, may therefore be more helpful with such questions.
Now, my guess. The problem of finding the eigenvectors of a Toeplitz matrix is not exactly solvable in the sense that it is for the special case of a circulant matrix, which makes me expect that the solution of this problem won't be substantially faster than for general matrices.
But if you can find the eigenvalues by the method you mentioned, I believe that finding the eigenvectors is a special case of solving a Toeplitz system,
https://en.wikipedia.org/wiki/Toeplitz_matrix#Solving_a_Toeplitz_system
although a singular one, and there exists an $O(n^2)$ algorithm for that, the Levinson algorithm
https://en.wikipedia.org/wiki/Levinson_recursion
The Toeplitz system is the problem of finding $x$ obeying $Ax=b$ for a given Toeplitz matrix $A$ and a column vector $b$. If you know an eigenvalue $\lambda$ and you want to find the eigenvector $x$, you are solving $(A-\lambda\cdot {\bf 1})x=0$, which looks like a Toeplitz system with the shifted (still Toeplitz) matrix $A-\lambda\cdot {\bf 1}$, albeit one with $b=0$. Of course, the solution is not unique, since the scaling is arbitrary. My experience here is nonexistent, so I cannot say whether the method breaks down for $b=0$; if it does, you may try $b\to 0$ instead. But it seems an obvious attempt to construct from the pieces that are out there.
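One concrete way to realize the $b \to 0$ idea is inverse iteration with a slightly shifted $\lambda$, so that each step is a nonsingular Toeplitz solve; `scipy.linalg.solve_toeplitz` performs each solve in $O(n^2)$ via Levinson recursion. This is only an illustrative sketch of the suggestion above, not a vetted method:

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

def toeplitz_eigenvector(c, lam, iters=3, shift=1e-8):
    """Inverse iteration for an eigenvector of the symmetric Toeplitz
    matrix with first column c, at a known eigenvalue lam.  Each step
    solves a Toeplitz system by Levinson recursion in O(n^2); the tiny
    shift keeps A - lam*I nonsingular, standing in for b -> 0."""
    c_shifted = np.asarray(c, dtype=float).copy()
    c_shifted[0] -= lam + shift        # first column of A - (lam+shift)*I
    x = np.ones(len(c_shifted))
    for _ in range(iters):
        x = solve_toeplitz(c_shifted, x)
        x /= np.linalg.norm(x)         # rescale: the solution blows up near lam
    return x
```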
Best Answer
Sarlós, and Clarkson and Woodruff, give efficient streaming algorithms for the more general problem of computing an approximate singular value decomposition of an arbitrary matrix. Because of the sublinear-space streaming constraints, they look at "tall" matrices: far fewer columns than rows. But the algorithms should work, and give efficiency gains, for square matrices as well. The idea is to project the columns onto a space of much smaller dimension, similarly to the proof of the Johnson-Lindenstrauss lemma, and then work with this smaller matrix.
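The projection idea can be conveyed by a toy numpy sketch (this only captures the flavor; the actual algorithms use more refined sketching matrices and come with error guarantees, and the function name here is mine):

```python
import numpy as np

def sketched_right_singular_vectors(A, sketch_size, rank, rng):
    """Compress the m rows of a tall m x n matrix A into sketch_size rows
    by a random Gaussian projection, then take the SVD of the small
    sketch -- the flavor of the Sarlos / Clarkson-Woodruff approach."""
    m, n = A.shape
    S = rng.standard_normal((sketch_size, m)) / np.sqrt(sketch_size)
    B = S @ A                      # sketch: accumulable in one streaming pass
    _, _, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:rank]               # approximate top right singular vectors of A
```

For an exactly low-rank $A$ the sketch preserves the row space almost surely, so the recovered singular vectors reproduce $A$; for general $A$ the approximation quality depends on the sketch size.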