[Math] Sherman–Morrison–Woodbury formula question

Tags: inverse, matrices

I asked a question earlier: Inverse of $A+B$ and $A+BCD$?

I got a very good hint there: use the Sherman–Morrison–Woodbury formula. I coded the idea in R and tested the formula on some arbitrary matrices.

In low dimensions the formula works fine, but once the dimension increases (to about $20\times 20$ or more), the term $(C^{-1} + V A^{-1} U)$ becomes singular, and therefore $(C^{-1} + V A^{-1} U)^{-1}$ does not exist!

Does anyone know how to fix this? Wikipedia says that the Sherman–Morrison–Woodbury formula is exact, which means it is exactly equivalent to $(A + UCV)^{-1}$. If it is equivalent, then whenever the inverse of $A + UCV$ exists, the Sherman–Morrison–Woodbury expression should exist as well, right? So why do I get an error?
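
For reference, the identity I am trying to apply, with $U = P$, $C = \Delta$, $V = P^{\mathsf T}$ taken from the eigendecomposition $B = P\Delta P^{\mathsf T}$, is
$$(A + UCV)^{-1} = A^{-1} - A^{-1}U\left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1}.$$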

For your reference, below is the code I wrote in R:

lambda2 <- 0.5
eta2 <- 1
rho2 <- 1
sigma2 <- 0.25
alpha <- 2 
x <- seq(from = -2, to = 2, by = 0.2) # try by = 0.1, 0.05, 0.01

length.x <- length(x)
# pairwise distances |x_i - x_j|^alpha
distanceMat.alpha <- ( abs(matrix(x, nrow = length(x), ncol = length(x)) - matrix(x,
  nrow = length(x), ncol = length(x), byrow = TRUE)) )^alpha

OneMat <- rep(1, length.x) %*% t(rep(1, length.x))  # length.x x length.x matrix of ones

H <- lambda2*OneMat
G <- sigma2*diag(length(x))
A <- G + H
B <- eta2*exp(-rho2*(distanceMat.alpha))
CovMat.test <- A + B


# Getting Inverse using solve():
CovMat.test.Inv.SLOW <- solve(CovMat.test)


# Getting inverse using Sherman-Morrison-Woodbury formula:
A.inv <- (1/sigma2)*(diag(length.x) - (lambda2/(sigma2 + length.x*lambda2)) * OneMat)  # closed-form inverse of A
B.eigen <- eigen(B, symmetric = TRUE)
Delta <- diag(B.eigen$values)
Delta.inv <- diag(1/B.eigen$values)
P <- B.eigen$vectors
# B can be decomposed as: P%*%Delta%*%t(P)
CovMat.test.Inv.FAST <- A.inv - A.inv%*%P%*%solve(Delta.inv +  
t(P)%*%A.inv%*%P)%*%t(P)%*%A.inv



# Test:
TwoMatAreEq <- sum(round(CovMat.test.Inv.FAST, 10) == round(CovMat.test.Inv.SLOW, 10)) ==
  (dim(CovMat.test)[1])^2
TwoMatAreEq

Best Answer

You don't really fix this; you should just avoid taking inverses of nearly singular matrices like the one you have defined. Even though $B$ in this case has an exact inverse, in the numerically approximate calculations that R performs it really matters whether a matrix is approximately singular or not. R's function rcond will give you the reciprocal condition number of $B$, and of $C^{-1}+V A^{-1} U$; if the answer is a very small number (e.g., $10^{-9}$ or less), then the computation is poorly conditioned and any answer will be rather inaccurate.
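
For instance, with the objects from your script you could check (a quick diagnostic sketch, not a fix):

rcond(B)                                  # reciprocal condition number of B
rcond(Delta.inv + t(P) %*% A.inv %*% P)   # the matrix inverted inside the Woodbury step
rcond(CovMat.test)                        # A + B itself, for comparison

If the middle value comes out tiny while rcond(CovMat.test) does not, that is exactly the problem described above.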

To be honest, I don't understand why you are doing this at all. Finding an eigendecomposition of the (almost singular) matrix $B$ is hard, and you gain nothing by doing that only to insert it into the Woodbury formula. If all you know about $B$ is that it is symmetric, just compute $A+B$ and invert that. In general, it makes very little sense to invert an arbitrary matrix by computing its eigendecomposition. Even if a matrix is symmetric (which guarantees that it can be diagonalized), why not just invert it directly? (R will do that for you easily.) Not to mention that while $B$ is almost singular, and the inverse of its diagonal form is needed to apply the formula, the matrix $A+B$ is far better conditioned and can be inverted easily.

You are not saving any work by using the Woodbury identity on a generic matrix like $A+B$. It usually makes much more sense to use it when the matrix $BCD$ (in $A+BCD$) has some very special form that makes the inverses in the Woodbury identity easy to calculate: often $C$ is $1$, and $B^t$ and $D$ are vectors, or $2\times n$ matrices -- then the only hard inverse in the problem is $A^{-1}$; for example, computing $DA^{-1}B$ just means solving a system of linear equations when $B$ is a vector. With the matrix $B$ that you wrote down this does not hold: $B$ is just a generic matrix, so the Woodbury identity offers no help.
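
To make the contrast concrete, here is a minimal sketch of the rank-one case (my own toy example, not your matrices), where $A$ is diagonal so $A^{-1}$ is trivial and the whole update is vector arithmetic plus one scalar division:

n <- 1000
d <- runif(n, 1, 2)                  # diagonal of A, so "inverting" A is just 1/d
u <- rnorm(n); v <- rnorm(n)         # rank-one update: A + u %*% t(v)
x <- rnorm(n)                        # right-hand side
Ainv.u <- u / d
Ainv.x <- x / d
y <- Ainv.x - Ainv.u * sum(v * Ainv.x) / (1 + sum(v * Ainv.u))  # Sherman-Morrison
max(abs(y - solve(diag(d) + u %*% t(v), x)))                    # sanity check vs. direct solve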

Finally, it is generally not advisable to compute matrix inverses explicitly at all. The usual practice is not to calculate $A^{-1}$ or any other inverse, but instead, whenever you need $A^{-1}X$, to solve the system of linear equations $AY=X$ rather than computing $A^{-1}$ and multiplying by $X$. So whenever your code contains solve(A) %*% X, it should really be replaced with solve(A, X).
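
Applied to your own script (a sketch reusing your variable names), the Woodbury line would become:

M <- Delta.inv + t(P) %*% A.inv %*% P
# instead of:  A.inv - A.inv %*% P %*% solve(M) %*% t(P) %*% A.inv
CovMat.test.Inv.FAST <- A.inv - A.inv %*% P %*% solve(M, t(P) %*% A.inv)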

Are you familiar with the standard techniques for solving systems of linear equations, such as Gaussian elimination and LU factorization, and with how they are used to compute matrix inverses? Your question would probably be better answered by the chapter on linear equations in an introductory numerical analysis textbook.
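
For what it's worth: since your $A+B$ is a symmetric covariance matrix, and assuming it is positive definite, the standard route in R is a Cholesky factorization; a sketch:

R <- chol(CovMat.test)                     # A + B = t(R) %*% R, with R upper triangular
b <- rnorm(nrow(CovMat.test))              # some right-hand side
y <- backsolve(R, forwardsolve(t(R), b))   # solves (A + B) y = b without forming any inverse
CovMat.test.Inv.CHOL <- chol2inv(R)        # explicit inverse, only if you truly need it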
