[Math] “Rank-K Correction” of a matrix and significance

linear-algebra, matrix-rank, optimization

Today my studies led me to read about the matrix inversion lemma, which Wikipedia introduces as follows:

In mathematics (specifically linear algebra), the Woodbury matrix identity, named after Max A. Woodbury,[1][2] says that the inverse of a rank-$k$ correction of some matrix can be computed by doing a rank-$k$ correction to the inverse of the original matrix.

After some searching I couldn't find any explanation of the terminology rank-$k$ correction, but I did find a Wikipedia article about the BFGS algorithm that uses the same term:

Instead, the Hessian matrix is approximated using rank-one updates specified by gradient evaluations… the approximate Hessian at stage $k$ is updated by the addition of two matrices $$B_{k+1} = B_k + U_k + V_k$$ Both $U_k$ and $V_k$ are symmetric rank-one matrices… [and together] construct a rank-two update matrix…
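For reference, in the standard BFGS update (the usual textbook form, not spelled out in the quoted passage) these two terms are scaled outer products. Writing $s_k = x_{k+1} - x_k$ and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$,

$$U_k = \frac{y_k y_k^{T}}{y_k^{T} s_k}, \qquad V_k = -\frac{B_k s_k s_k^{T} B_k}{s_k^{T} B_k s_k}.$$

Each is a multiple of an outer product $v v^{T}$ and hence has rank one, so $B_{k+1} = B_k + U_k + V_k$ differs from $B_k$ by a matrix of rank at most two.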

As far as I can tell from these sources, a rank-$k$ correction of matrix $A$ consists of the addition of a matrix $B$ of rank $k$ to $A$.
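To make this concrete, here is a small NumPy sketch (with made-up random matrices): any rank-$k$ matrix $B$ can be written as $UV^{T}$ with $U, V \in \mathbb{R}^{n \times k}$, so a rank-$k$ correction of $A$ is just $A + UV^{T}$.

```python
import numpy as np

# A rank-k correction of A is A + B where rank(B) = k.
# Any rank-k matrix factors as B = U @ V.T with U, V of shape (n, k).
rng = np.random.default_rng(0)
n, k = 5, 2
A = rng.standard_normal((n, n))
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))

B = U @ V.T        # a rank-k matrix (rank exactly k for generic U, V)
corrected = A + B  # the "rank-k correction" of A

print(np.linalg.matrix_rank(B))  # 2
```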

However, it is unclear to me why matrix addition is described in this way. Can someone help me understand the implications and applications of this terminology at a basic level, as well as its importance in optimization problems?

Best Answer

Rank updates to a matrix are often performed when new data is acquired and needs to be incorporated into a given model.

For example, in the case of regression, the normal equations say that we can compute the coefficient estimates $\hat{\beta}$ of a linear model as $\hat{\beta} = (X^{T}X)^{g}X^{T}y$, where $(\cdot)^{g}$ denotes a generalized inverse. If we assume that $X \in \mathbb{R}^{n \times p}$ has full column rank, then $(X^{T}X)^{g} = (X^{T}X)^{-1}$.

Now suppose we observe some new samples $x_{n+1}, x_{n+2}, x_{n+3}, \ldots \in \mathbb{R}^{p}$ and define $X^{*} = \begin{bmatrix} X \\ x_{n+1}^{T} \\ x_{n+2}^{T} \\ \vdots \end{bmatrix}$. Rather than explicitly recomputing $(X^{*T}X^{*})^{-1}$, which can be costly if $p$ is large, we can use the matrix inversion lemma to update $(X^{T}X)^{-1}$ (which we already have stored from fitting the original model) with the new samples.
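The update described above can be sketched in NumPy (with hypothetical random data). Since $X^{*T}X^{*} = X^{T}X + X_{\text{new}}^{T}X_{\text{new}}$ is a rank-$m$ correction of $X^{T}X$ when $m$ new rows arrive, the Woodbury identity $(A + UV)^{-1} = A^{-1} - A^{-1}U(I + VA^{-1}U)^{-1}VA^{-1}$ with $U = X_{\text{new}}^{T}$, $V = X_{\text{new}}$ lets us update the stored inverse by solving only a small $m \times m$ system:

```python
import numpy as np

# Sketch: updating (X^T X)^{-1} with m new sample rows via Woodbury,
# using made-up random data.
rng = np.random.default_rng(1)
n, p, m = 50, 4, 3
X = rng.standard_normal((n, p))
X_new = rng.standard_normal((m, p))   # new samples stacked as rows

A_inv = np.linalg.inv(X.T @ X)        # stored from the original fit

# Woodbury with U = X_new^T, V = X_new: only an m x m inverse is needed.
K = np.linalg.inv(np.eye(m) + X_new @ A_inv @ X_new.T)
A_inv_updated = A_inv - A_inv @ X_new.T @ K @ X_new @ A_inv

# Check against the direct (costly) recomputation.
X_star = np.vstack([X, X_new])
direct = np.linalg.inv(X_star.T @ X_star)
print(np.allclose(A_inv_updated, direct))  # True
```

When $m \ll p$, the update costs $O(mp^2 + m^3)$ instead of the $O(p^3)$ of refactoring $X^{*T}X^{*}$ from scratch.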
