Iterative algorithm for computing $\Sigma^{1/2} x$

Tags: algorithms, matrices, numerical linear algebra, positive-semidefinite, symmetric matrices

Say I have a PSD matrix $\Sigma$ and a vector $x$. Is there an iterative algorithm for computing $\Sigma^{1/2} x$ that is faster than computing $\Sigma^{1/2}$ via a Cholesky decomposition?
(In this question, by $\Sigma^{1/2}$ I always just mean some matrix $L$ such that $\Sigma=LL^T$.)

For example, I might use Newton's method:
$$
L_{k+1} = \tfrac{1}{2}\left(L_k + L_k^{-1} \Sigma\right), \qquad L_0 = \Sigma,
$$

which converges to $L_{\infty} = \Sigma^{1/2}$.
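
For concreteness, here is a minimal NumPy sketch of this matrix-level Newton iteration (the function name newton_sqrtm, the tolerance, and the random test matrix are my own; in floating point the plain iteration is only reliable when $\Sigma$ is reasonably well conditioned, which is one reason stabilized variants such as Denman-Beavers exist):

import numpy as np

def newton_sqrtm(Sigma, tol=1e-12, maxiter=50):
    # Newton iteration L_{k+1} = (L_k + L_k^{-1} Sigma) / 2 with L_0 = Sigma;
    # in exact arithmetic it converges to the principal square root of Sigma.
    L = Sigma.copy()
    for _ in range(maxiter):
        L_new = 0.5 * (L + np.linalg.solve(L, Sigma))
        if np.linalg.norm(L_new - L) <= tol * np.linalg.norm(L):
            return L_new
        L = L_new
    return L

# Quick check on a small, well-conditioned PSD matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
Sigma = B @ B.T + 5 * np.eye(5)
L = newton_sqrtm(Sigma)
print(np.max(np.abs(L @ L - Sigma)))  # residual should be near machine precision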
If applied to a single vector:
\begin{align}
L_{k+1} x = \tfrac{1}{2}\left(L_k x + L_k^{-1} \Sigma x\right), \qquad L_0 = \Sigma,
\end{align}

I can compute each step in $O(n^2 \sqrt{\kappa})$ time, using the conjugate gradient method (where $\kappa$ is the condition number of $L_k$) to compute $L_k^{-1} (\Sigma x)$.
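
To make that cost claim concrete, the per-step linear solve can be done with SciPy's conjugate gradient routine; this is just an illustrative SPD solve (the matrix, sizes, and seed below are my own, not anything from the post):

import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(1)
n = 500
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)   # generic SPD matrix standing in for L_k
b = rng.standard_normal(n)    # right-hand side standing in for Sigma @ x

# Each CG iteration costs one matrix-vector product, i.e. O(n^2), and
# roughly sqrt(kappa) iterations give a fixed relative accuracy.
y, info = cg(A, b)
assert info == 0              # info == 0 means CG converged
print(np.linalg.norm(A @ y - b) / np.linalg.norm(b))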

I'm not sure how many Newton iterations I need (or Denman-Beavers iterations, or some other type of iteration), but I assume that also depends on the condition number somehow.
All in all, I can compute $\Sigma^{1/2} x$ in $O(n^2 f(\kappa))$ time for some function $f$ of the condition number.

However, running a conjugate gradient solve inside every step of an outer iterative method seems wasteful. I doubt the above method is ever competitive with just computing $\Sigma^{1/2}$ using Cholesky.
But I do wonder: Is there an iterative method to compute $\Sigma^{1/2} x$ fast?
For example, something like the conjugate gradient method (or even Jacobi iteration), but whose purpose is to compute the square root (or inverse square root) rather than the inverse?

(By the way, I should mention that the reason I want to compute $\Sigma^{1/2} x$ is to sample from a multivariate normal distribution. If there's a different method to do that in $O(n^2)$ time, I'd also be very interested.)
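
For reference, the standard way to do this with an explicit factor is to draw $z \sim N(0, I)$ and return $\mu + L z$ for any $L$ with $L L^T = \Sigma$, e.g. the Cholesky factor; a minimal sketch (the helper name and the use of NumPy's Generator are my own choices):

import numpy as np

def sample_mvn_cholesky(mu, Sigma, rng):
    # Any L with L @ L.T == Sigma works, since Cov(L z) = L I L^T = Sigma.
    L = np.linalg.cholesky(Sigma)       # O(n^3) factorization: the baseline to beat
    z = rng.standard_normal(len(mu))    # z ~ N(0, I)
    return mu + L @ z                   # sample ~ N(mu, Sigma)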

Update: According to Computing $A^\alpha$, $\log(A)$ and Related Matrix Functions by Contour Integrals (Theorem 2.1, eq. 2.9), there is at least one iterative square-root method that takes only $\approx\log(\kappa)$ iterations, where each iteration requires a linear solve.
That means we can find $\Sigma^{1/2}x$ in roughly $n^2 \sqrt{\kappa} \log(\kappa)$ time.
Though I'm still curious whether one could get down to just $n^2 \sqrt{\kappa}$ by exploiting the fact that the later linear solves are very close to the previous ones.

Best Answer

It turns out the "Newton-based" algorithm in my original question doesn't work, since it requires solving a linear system in $L_k$, but we never actually keep $L_k$ around, only $L_k x$. Luckily there is another iterative method that does work: I found the interesting article Computing $A^\alpha$, $\log(A)$ and Related Matrix Functions by Contour Integrals.

Translated from MATLAB to Python, and specialized to computing the square-root-times-vector product, it looks like this:

import numpy as np
import scipy.special

def sqrtm_vp(A, x, m=None, M=None, niter=10):
    # Approximates A^{1/2} @ x via the quadrature of eq. (2.9) in the paper.
    if m is None or M is None:
        # Extreme eigenvalues of the symmetric matrix A. (In practice you would
        # estimate these, e.g. with a few Lanczos iterations, rather than pay
        # for a full eigendecomposition.)
        eigs = np.linalg.eigvalsh(A)
        m, M = eigs[0], eigs[-1]

    # Quadrature nodes and weights built from Jacobi elliptic functions.
    Kp = scipy.special.ellipk(1 - m / M)
    t = (0.5 + np.arange(niter)) * Kp / niter
    sn, cn, dn, _ = scipy.special.ellipj(t, 1 - m / M)
    w2 = m * (sn / cn) ** 2                                      # squared pole shifts
    dzdt = (2 * Kp * np.sqrt(m) / (np.pi * niter)) * dn / cn**2  # quadrature weights

    d = A.shape[0]
    Sx = np.zeros(d)
    # Sx accumulates an approximation of A^{-1/2} @ x.
    # The shifted solves are independent, so this loop could run in parallel.
    for j in range(niter):
        Sx += dzdt[j] * np.linalg.solve(A + w2[j] * np.eye(d), x)
    return A @ Sx   # A @ (A^{-1/2} x) = A^{1/2} x
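
As a quick sanity check (not part of the original answer), the output can be compared against the dense principal square root from scipy.linalg.sqrtm; the test matrix and sizes below are my own and deliberately well conditioned:

import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)
d = 100
B = rng.standard_normal((d, d))
A = B @ B.T + d * np.eye(d)        # PSD test matrix with modest condition number
x = rng.standard_normal(d)

y_iter = sqrtm_vp(A, x, niter=20)
y_ref = scipy.linalg.sqrtm(A) @ x  # dense O(d^3) reference
print(np.linalg.norm(y_iter - y_ref) / np.linalg.norm(y_ref))  # should be tiny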

The paper provides error bounds on the order of $\exp(-O(N/\log\kappa))$, meaning we need only $N=O(\log\kappa)$ iterations for a fixed target accuracy, each of which takes a single linear solve, i.e. $O(d^2 \sqrt{\kappa})$ time with the conjugate gradient method.
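
To actually realize that $O(d^2 \sqrt{\kappa})$ per solve, the dense np.linalg.solve in the loop above can be replaced by conjugate gradient on the shifted system, done matrix-free; a sketch under that assumption (the helper name and default tolerance are mine):

import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def solve_shifted_cg(A, shift, b):
    # Solve (A + shift * I) y = b without forming the shifted matrix;
    # each CG iteration is one O(d^2) matvec, and ~sqrt(kappa) iterations suffice.
    d = A.shape[0]
    op = LinearOperator((d, d), matvec=lambda v: A @ v + shift * v, dtype=A.dtype)
    y, info = cg(op, b)
    assert info == 0
    return y

Inside sqrtm_vp, the call np.linalg.solve(A + w2[j] * np.eye(d), x) would then become solve_shifted_cg(A, w2[j], x); warm-starting each solve from the previous solution is exactly the kind of reuse the update to the question speculates about.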

I still don't know whether it's possible to go from $d^2 \sqrt{\kappa}\log\kappa$ time down to just $d^2 \sqrt{\kappa}$, but at least this is pretty close.