Find diagonal elements of the inverse for large matrices

Tags: inverse, linear algebra, linear regression, numerical linear algebra

Consider the set of invertible matrices $\{\mathbf{C}_{n}\}_{n=1}^N$ defined as
$$\mathbf{C}_n = \begin{bmatrix} 1 & \mathbf{0}^\top_{k_1} & \mathbf{0}^\top_{k_2} &\cdots & \mathbf{0}^\top_{k_n} \\
\mathbf{0}_{k_1} & \mathbf{X}^\top_{1}\mathbf{X}_{1} + \boldsymbol{\Gamma}_1 & \mathbf{X}^\top_{1}\mathbf{X}_{2} & \cdots & \mathbf{X}^\top_{1}\mathbf{X}_{n} \\
\mathbf{0}_{k_2} & \mathbf{X}^\top_{2}\mathbf{X}_{1} & \mathbf{X}^\top_{2}\mathbf{X}_{2} + \boldsymbol{\Gamma}_2 &\cdots & \mathbf{X}^\top_{2}\mathbf{X}_{n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\mathbf{0}_{k_n} & \mathbf{X}^\top_{n}\mathbf{X}_{1} & \mathbf{X}^\top_{n}\mathbf{X}_{2} &\cdots & \mathbf{X}^\top_{n}\mathbf{X}_{n} + \boldsymbol{\Gamma}_n\end{bmatrix}, \quad n=1, \ldots, N,$$

where $\mathbf{0}_{k_1}$ denotes a column vector of $k_1$ zeros, $\mathbf{X}_{n}\in \mathbb{R}^{m \times k_n}$, and the $\boldsymbol{\Gamma}_{n}\in \mathbb{R}^{k_n \times k_n}$ are singular but positive semidefinite matrices. We assume that $\mathbf{X}^\top_{n}\mathbf{X}_{n} + \boldsymbol{\Gamma}_n$ is invertible for each $n=1, \ldots, N$.

$\textbf{Objective}$: Find the diagonal elements of $\mathbf{C}_n^{-1}$, the inverse of $\mathbf{C}_n$ (assume that, for large $n$, the size of $\mathbf{C}_n$ makes computing the entire inverse infeasible due to numerical issues).
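(For concreteness, here is a minimal NumPy sketch, not part of the original problem, that assembles $\mathbf{C}_n$ for a toy instance; the sizes, the random blocks, and the helper name `build_C` are all illustrative assumptions.)

```python
import numpy as np

rng = np.random.default_rng(0)
m, ks = 6, [2, 3, 2]                                   # illustrative sizes m, k_1, ..., k_n

Xs = [rng.standard_normal((m, k)) for k in ks]         # X_j in R^{m x k_j}
Gs = []
for k in ks:
    A = rng.standard_normal((k, max(k - 1, 1)))        # thin factor, so that ...
    Gs.append(A @ A.T)                                 # ... Gamma_j is PSD but singular

def build_C(Xs, Gs):
    """Assemble the bordered matrix C_n from the blocks X_j and Gamma_j."""
    Xi = np.hstack(Xs)                                 # [X_1 ... X_n]
    D = Xi.T @ Xi                                      # Gram blocks X_i^T X_j
    i = 0
    for G in Gs:                                       # add Gamma_j on the block diagonal
        D[i:i + G.shape[0], i:i + G.shape[0]] += G
        i += G.shape[0]
    C = np.zeros((D.shape[0] + 1, D.shape[0] + 1))
    C[0, 0] = 1.0                                      # leading 1 bordered by zeros
    C[1:, 1:] = D
    return C

C3 = build_C(Xs, Gs)                                   # here C_3, of size 8 x 8
```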

I managed to iteratively compute the set of matrices
$$\mathbf{M}_n = \mathbf{B}_n \mathbf{C}_n^{-1} \mathbf{B}_{n}^\top,
\qquad \text{with} \quad \mathbf{B}_n = \begin{bmatrix} \mathbf{0}_{m} & \mathbf{X}_{1} & \mathbf{X}_{2} & \cdots & \mathbf{X}_{n} \end{bmatrix}$$

for each $n=1, \ldots, N$, which we assume are much smaller than $\mathbf{C}_n$ (each $\mathbf{M}_n$ is only $m \times m$).

With them, I was able to compute the elements of the bottom-right block of the inverse (and hence its diagonal elements). Indeed, denoting by $\mathbf{F}_{n}$ the bottom-right block of the inverse of $\mathbf{C}_n$, the block matrix inversion formula gives
\begin{equation*}
\mathbf{F}_n = (\mathbf{X}_n^\top \mathbf{X}_n + \boldsymbol{\Gamma}_n - \mathbf{X}_n^\top \mathbf{B}_{n-1}\mathbf{C}^{-1}_{n-1}\mathbf{B}^\top_{n-1} \mathbf{X}_{n})^{-1} = (\mathbf{X}_n^\top \mathbf{X}_n + \boldsymbol{\Gamma}_n - \mathbf{X}_n^\top \mathbf{M}_{n-1}\mathbf{X}_{n})^{-1}.
\end{equation*}
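(As a quick sanity check of this identity, here is a minimal sketch reusing the illustrative `build_C` helper and the toy blocks `Xs`, `Gs` from the snippet after the problem statement; the toy sizes are small enough that the dense inverses can still be formed for comparison.)

```python
import numpy as np

def check_F_recursion(Xs, Gs):
    m, kn = Xs[0].shape[0], Xs[-1].shape[1]

    # Reference value: bottom-right k_n x k_n block of the dense inverse of C_n.
    F_direct = np.linalg.inv(build_C(Xs, Gs))[-kn:, -kn:]

    # M_{n-1} = B_{n-1} C_{n-1}^{-1} B_{n-1}^T, with B_{n-1} = [0_m, X_1, ..., X_{n-1}].
    B_prev = np.hstack([np.zeros((m, 1))] + Xs[:-1])
    M_prev = B_prev @ np.linalg.inv(build_C(Xs[:-1], Gs[:-1])) @ B_prev.T

    # Schur-complement expression for F_n from the formula above.
    Xn, Gn = Xs[-1], Gs[-1]
    F_schur = np.linalg.inv(Xn.T @ Xn + Gn - Xn.T @ M_prev @ Xn)
    return np.allclose(F_direct, F_schur)

print(check_F_recursion(Xs, Gs))   # expected: True (up to rounding)
```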

In this computation of $\mathbf{F}_n$, only matrices of size at most $\max\{m, k_{n}\}$ are involved, so we assume they can be calculated precisely. I tried to compute the other blocks of the inverse in a similar way, but I was not able to find a pattern, and larger matrices entered the computation. Is there a way to find these diagonal elements precisely?

Any help would be appreciated.

Best Answer

In case it's helpful (e.g. perhaps it can be modified to work using pseudoinverses), here's an approach that works if $\Gamma_j$ for $j = 1,\dots,n$ are invertible.

First of all, note that the block matrix $$ A = \pmatrix{1 & 0_k^T\\0_k & B} $$ is invertible iff $B$ is invertible, in which case $$ A^{-1} = \pmatrix{1&0_k^T\\0_k&B^{-1}}. $$ So, we can safely restrict our attention to the submatrix $$ \mathbf{D}_n = \begin{bmatrix} \mathbf{X}^\top_{1}\mathbf{X}_{1} + \boldsymbol{\Gamma}_1 & \mathbf{X}^\top_{1}\mathbf{X}_{2} & \cdots & \mathbf{X}^\top_{1}\mathbf{X}_{n} \\ \mathbf{X}^\top_{2}\mathbf{X}_{1} & \mathbf{X}^\top_{2}\mathbf{X}_{2} + \boldsymbol{\Gamma}_2 &\cdots & \mathbf{X}^\top_{2}\mathbf{X}_{n} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{X}^\top_{n}\mathbf{X}_{1} & \mathbf{X}^\top_{n}\mathbf{X}_{2} &\cdots & \mathbf{X}^\top_{n}\mathbf{X}_{n} + \boldsymbol{\Gamma}_n\end{bmatrix}, \quad n=1, \ldots, N. $$

It is notable that we can write this matrix in the form $\mathbf D_n = \mathbf G_n + \mathbf \Xi_n^T\mathbf \Xi_n$, where $$ \mathbf G_n = \operatorname{diag}(\mathbf \Gamma_1,\dots,\mathbf \Gamma_n), \quad \mathbf \Xi_n = \pmatrix{\mathbf X_1 & \cdots & \mathbf X_n}. $$ $\mathbf G_n$ is block-diagonal, so its inverse is easily computed as $$ \mathbf G_n^{-1} = \operatorname{diag}(\mathbf \Gamma_1^{-1},\dots,\mathbf \Gamma_n^{-1}). $$ From there, we could apply the Woodbury matrix identity to get $$ \mathbf D_n^{-1} =\mathbf G_n^{-1} - \overbrace{\mathbf G_n^{-1} \mathbf \Xi_n^T}^{\mathbf U_n^T}\overbrace{(\mathbf I_m + \mathbf \Xi_n \mathbf G_n^{-1}\mathbf \Xi_n^T)^{-1}}^{\mathbf Y_n}\overbrace{\mathbf \Xi_n \mathbf G_n^{-1}}^{\mathbf U_n}. $$ The term $\mathbf U_n^T \mathbf Y_n \mathbf U_n$ requires only the computation of an $m \times m$ matrix.

Note that we can rewrite $\mathbf U_n$ and $\mathbf Y_n$ as follows: $$ \mathbf Y_n = \left(\mathbf I_m + \mathbf \Xi_n \mathbf G_n^{-1}\mathbf \Xi_n^T\right)^{-1} = \left(\mathbf I_m + \sum_{j=1}^n \mathbf X_j \mathbf \Gamma_j^{-1} \mathbf X_j^T\right)^{-1}. $$ Note that $\mathbf Y_n$ can itself be updated from $\mathbf Y_{n-1}$ using the Woodbury identity, which is especially useful if $k_n$ is small compared to $m$. We have $$ \mathbf U_n = [\mathbf X_1 \mathbf \Gamma_1^{-1} \ \ \cdots \ \ \mathbf X_n \mathbf \Gamma_n^{-1}]. $$ Thus, we have $$ \mathbf U_n^T \mathbf Y_n \mathbf U_n = \left[\mathbf \Gamma_i^{-T} \mathbf X_i^T\left(\mathbf I_m + \sum_{\ell=1}^n \mathbf X_\ell \mathbf \Gamma_\ell^{-1} \mathbf X_\ell^T\right)^{-1}\mathbf X_j\mathbf \Gamma_j^{-1}\right]_{i,j = 1}^n, $$ so (using that $\mathbf \Gamma_i$ is symmetric) the $i$th diagonal block of $\mathbf D_n^{-1}$ is $\mathbf \Gamma_i^{-1} - \mathbf \Gamma_i^{-1}\mathbf X_i^T \mathbf Y_n \mathbf X_i \mathbf \Gamma_i^{-1}$.
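Here is a minimal NumPy sketch of this route (illustrative random data, invertible $\mathbf \Gamma_j$, not from the original post): only the $m \times m$ matrix $\mathbf Y_n$ and $k_i \times k_i$ inverses are formed, and the resulting diagonal blocks are checked against a dense inverse of $\mathbf D_n$, which is only feasible because the toy sizes are tiny.

```python
import numpy as np

rng = np.random.default_rng(1)
m, ks = 6, [2, 3, 2]
Xs = [rng.standard_normal((m, k)) for k in ks]
Gs = []
for k in ks:
    A = rng.standard_normal((k, k))
    Gs.append(np.eye(k) + A @ A.T)                     # Gamma_j: SPD, hence invertible

# Y_n = (I_m + sum_j X_j Gamma_j^{-1} X_j^T)^{-1}: a single m x m inverse.
Yn = np.linalg.inv(np.eye(m) + sum(X @ np.linalg.solve(G, X.T) for X, G in zip(Xs, Gs)))

# i-th diagonal block of D_n^{-1}: Gamma_i^{-1} - Gamma_i^{-1} X_i^T Y_n X_i Gamma_i^{-1}.
diag_blocks = []
for X, G in zip(Xs, Gs):
    Gi = np.linalg.inv(G)
    diag_blocks.append(Gi - Gi @ X.T @ Yn @ X @ Gi)

# Check against the dense inverse of D_n (possible here because sizes are tiny).
Xi = np.hstack(Xs)
offs = np.concatenate(([0], np.cumsum(ks)))
G_blockdiag = np.zeros((Xi.shape[1], Xi.shape[1]))
for j, G in enumerate(Gs):
    G_blockdiag[offs[j]:offs[j + 1], offs[j]:offs[j + 1]] = G
D_inv = np.linalg.inv(Xi.T @ Xi + G_blockdiag)
print(all(np.allclose(D_inv[offs[j]:offs[j + 1], offs[j]:offs[j + 1]], diag_blocks[j])
          for j in range(len(ks))))                    # expected: True
```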


We can make this work for positive semidefinite $\mathbf \Gamma_j$ as follows. Consider the following definitions:

  • For each $j$, let the matrix $L_j$ be such that $L_jL_j^T = \mathbf \Gamma_j$. In particular, I will take $L_j$ to be the positive semidefinite square root of $\mathbf \Gamma_j$ so that $L_j$ is symmetric.
  • $X = \operatorname{diag}(L_1,\dots,L_n)$.
  • $Y = \mathbf \Xi_n^T$.
  • $Z = (I - XX^+)Y$.
  • $F = I + (I - Z^+Z) Y^\mathrm T (XX^\mathrm T)^+ Y (I - Z^+Z)$.
  • $E = I - X^+Y (I - Z^+Z) F^{-1} (X^+Y)^\mathrm T$.

Since $XX^\mathrm T + YY^\mathrm T = \mathbf G_n + \mathbf \Xi_n^\mathrm T \mathbf \Xi_n = \mathbf D_n$, we can then compute $$ \mathbf D_n^{-1} = \mathbf D_n^+ = (XX^\mathrm T + YY^\mathrm T)^+ = (ZZ^\mathrm T)^+ + (I - YZ^+)^\mathrm T X^{+\mathrm T} E X^+ (I - YZ^+). $$

Some simplifications:

  • $X^+ = \operatorname{diag}(L_1^+,\dots,L_n^+)$
  • $(XX^T)^+ = \mathbf G_n^+ = (X^+)^2 = \operatorname{diag}([L_1^+]^2,\dots,[L_n^+]^2)$
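As a sanity check, here is an illustrative NumPy sketch (random small instance with singular $\mathbf \Gamma_j$, not taken from the answer) that evaluates the definitions above and compares the resulting expression for $(XX^\mathrm T + YY^\mathrm T)^+$ against numpy's `pinv`; it uses the simplification $(XX^\mathrm T)^+ = (X^+)^2$.

```python
import numpy as np

rng = np.random.default_rng(2)
m, ks = 6, [2, 3, 2]
K = sum(ks)
Xs = [rng.standard_normal((m, k)) for k in ks]

# L_j: symmetric PSD square root of a singular PSD Gamma_j.
Ls = []
for k in ks:
    A = rng.standard_normal((k, max(k - 1, 1)))        # thin factor -> Gamma_j singular
    w, V = np.linalg.eigh(A @ A.T)
    Ls.append(V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T)

# X = diag(L_1, ..., L_n), Y = Xi_n^T.
X = np.zeros((K, K))
offs = np.concatenate(([0], np.cumsum(ks)))
for j, L in enumerate(Ls):
    X[offs[j]:offs[j + 1], offs[j]:offs[j + 1]] = L
Y = np.hstack(Xs).T

Xp = np.linalg.pinv(X)                                 # X^+ = diag(L_1^+, ..., L_n^+)
Z = (np.eye(K) - X @ Xp) @ Y
Zp = np.linalg.pinv(Z)
P = np.eye(m) - Zp @ Z                                 # I - Z^+ Z
F = np.eye(m) + P @ Y.T @ (Xp @ Xp) @ Y @ P            # uses (X X^T)^+ = (X^+)^2
E = np.eye(K) - (Xp @ Y) @ P @ np.linalg.inv(F) @ (Xp @ Y).T

lhs = np.linalg.pinv(X @ X.T + Y @ Y.T)                # pinv of D_n = G_n + Xi_n^T Xi_n
rhs = (np.linalg.pinv(Z @ Z.T)
       + (np.eye(K) - Y @ Zp).T @ Xp.T @ E @ Xp @ (np.eye(K) - Y @ Zp))
print(np.max(np.abs(lhs - rhs)))                       # discrepancy on this instance
```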