[Math] Sparse approximation of the inverse of a sparse matrix

approximation-algorithms, matrices, na.numerical-analysis, nonlinear-optimization, numerical-linear-algebra

Is it possible to approximate the inverse of a sparse matrix with a sparse matrix?
The problem comes up in numerical non-linear quasi-Newton optimization: given a sparse Hessian, a good starting point for L-BFGS would be a sparse approximation of the Hessian's inverse, with the emphasis on the sparsity of the matrix rather than on the accuracy of the approximation.
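For concreteness, here is a minimal sketch of the kind of seed I have in mind, assuming the crudest possible sparse approximation (just the reciprocal of the diagonal); the function name `diagonal_inverse_seed` and the test Hessian below are my own illustration, not from any particular library:

```python
import numpy as np
import scipy.sparse as sp

def diagonal_inverse_seed(H, eps=1e-12):
    """Crude sparse 'inverse' of a sparse Hessian H: keep only the
    reciprocals of the diagonal entries.  Meant as an O(N) starting
    guess for a quasi-Newton method, not as an accurate inverse."""
    d = H.diagonal()
    d = np.where(np.abs(d) > eps, d, 1.0)  # guard against (near-)zero pivots
    return sp.diags(1.0 / d)

# Example: a sparse symmetric tridiagonal "Hessian".
N = 1000
H = sp.diags(
    [np.full(N - 1, -1.0), np.full(N, 2.0), np.full(N - 1, -1.0)],
    offsets=[-1, 0, 1],
    format="csr",
)
H0_inv = diagonal_inverse_seed(H)  # sparse, O(N) nonzero entries
```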

EDIT 1: Given Noam's counterexample, an approximation doesn't seem possible in general. However, the practical purpose of an approximate sparse inverse is not so much its sparsity as the ability to represent the inverse matrix in a compact manner that doesn't require $N^2$ elements. There are matrices other than sparse ones that have this property: for example, Toeplitz matrices are representable with $O(N)$ elements. Hence the follow-up question: is it generally possible to find an approximation of the inverse of a sparse matrix that is also representable by $O(N)$ elements?
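As an illustration of the kind of $O(N)$ representation I mean (the particular tridiagonal Toeplitz system below is just my own toy example): the matrix is never formed explicitly, only its first column and first row are stored, and a Levinson-type solver works directly from that data.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

N = 5000
# A Toeplitz matrix is determined by its first column c and first row r:
# O(N) numbers instead of N^2.
c = np.r_[2.0, -1.0, np.zeros(N - 2)]  # first column
r = np.r_[2.0, -1.0, np.zeros(N - 2)]  # first row
b = np.ones(N)

# Solve T x = b directly from the O(N) representation, without ever
# forming the N x N matrix T.
x = solve_toeplitz((c, r), b)
```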

EDIT 2: The problem should have been stated for symmetric matrices: it arose from an attempt to invert a sparse Hessian, and the Hessian is symmetric. Moreover, in many applied problems, with a proper choice of coordinates, the magnitude of the mixed derivatives is known never to exceed the magnitude of the second derivative with respect to the same variable. Thus the question should be augmented with these restrictions on the input matrix:

  1. $A_{ij} = A_{ji}$

  2. $|A_{ij}| \le |A_{ii}|$ for all $i, j$

EDIT 3: Forget the 2nd condition in EDIT 2; it is too specialized to the particular problem and doesn't belong here. Let me rephrase based only on the 1st condition, that the matrix is symmetric (which is enough to disqualify Noam's wonderful non-diagonal JNF example).

Given that the matrix is symmetric, it can be diagonalized by an orthogonal transformation: $A = Q \Lambda Q^{\mathsf{T}}$ with $Q$ orthogonal and $\Lambda$ diagonal. The diagonal form is, of course, invertible in $O(N)$ operations into another diagonal matrix. To get back to the original basis, we undo the diagonalization by applying the transpose of the orthogonal matrix that performed it: $A^{-1} = Q \Lambda^{-1} Q^{\mathsf{T}}$. Intuitively, this inverse orthogonal transformation would then spread the entries of the inverse diagonal matrix back over the same positions from which the non-zeros of the original matrix came when it was diagonalized. Since the original matrix was sparse, it would seem that the inverse should be sparse too.

What's wrong with this argument?

Best Answer

Not in general. An explicit and elementary counterexample is the sparse triangular matrix with $1$'s on the diagonal and $-1$'s just above it: its inverse is the triangular matrix with every entry on or above the diagonal equal to $1$.
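For what it's worth, a quick numerical check of this counterexample (the script below is just an illustration added for concreteness):

```python
import numpy as np

N = 8
# Upper bidiagonal matrix: 1's on the diagonal, -1's just above it.
A = np.eye(N) - np.eye(N, k=1)

A_inv = np.linalg.inv(A)

# The inverse is the dense upper-triangular matrix of all 1's,
# so O(N) nonzeros in A become ~N^2/2 nonzeros in its inverse.
assert np.allclose(A_inv, np.triu(np.ones((N, N))))
print(int(np.count_nonzero(np.abs(A_inv) > 1e-12)))  # 36 = N(N+1)/2
```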