Conjugate Gradient and Krylov Subspace Method

numerical-linear-algebra, numerical-methods, numerical-optimization, real-analysis

In Hessian-free (HF) optimization, on each iteration the CG algorithm is used to approximately compute
$$
\mathbf{d}=-(\mathbf{H}+\lambda \mathbf{I})^{-1} \mathbf{g}
$$

where $\mathbf{d}$ is the step direction and $\mathbf{g}$ is the gradient. As described in Martens (2010), CG aims to minimize the function $\frac{1}{2} \mathbf{x}^{T}(\mathbf{H}+\lambda \mathbf{I}) \mathbf{x}+\mathbf{x}^{T} \mathbf{g}$ (whose minimizer is exactly $\mathbf{d}$), which is a quadratic
approximation of our objective function. The approximate solution $\mathbf{d}_{C G}$ reached after $K$ iterations of CG will lie in the Krylov subspace of dimension $K$ spanned by $\left\{\mathbf{g},(\mathbf{H}+\lambda \mathbf{I}) \mathbf{g}, \ldots,(\mathbf{H}+\lambda \mathbf{I})^{K-1} \mathbf{g}\right\}$. This is easy to
see by looking at the CG algorithm.
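To make this concrete, here is a minimal NumPy sketch (my own illustration, not code from Martens (2010)): run $K$ steps of plain CG on $(\mathbf{H}+\lambda\mathbf{I})\mathbf{x}=-\mathbf{g}$ starting from $\mathbf{x}_0=\mathbf{0}$, then check that the iterate lies in $\mathrm{span}\{\mathbf{g},(\mathbf{H}+\lambda\mathbf{I})\mathbf{g},\ldots,(\mathbf{H}+\lambda\mathbf{I})^{K-1}\mathbf{g}\}$:

```python
import numpy as np

def cg(A, b, K):
    """K steps of plain conjugate gradient for A x = b (A symmetric
    positive definite), started from x0 = 0."""
    x = np.zeros(len(b))
    r = b - A @ x          # residual (equals b at the zero start)
    p = r.copy()           # first search direction
    for _ in range(K):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x

rng = np.random.default_rng(0)
n, lam, K = 6, 0.1, 3
M = rng.standard_normal((n, n))
H = M @ M.T                          # symmetric stand-in for the Hessian
g = rng.standard_normal(n)
A = H + lam * np.eye(n)

d_cg = cg(A, -g, K)                  # K CG steps toward d = -(H + lam I)^{-1} g

# Krylov basis {g, A g, ..., A^{K-1} g}; d_cg should lie in its span
Kry = np.column_stack([np.linalg.matrix_power(A, j) @ g for j in range(K)])
coeffs, *_ = np.linalg.lstsq(Kry, d_cg, rcond=None)
print(np.allclose(Kry @ coeffs, d_cg))   # True: d_cg is in the Krylov subspace
```

The zero start matters: with $\mathbf{x}_0=\mathbf{0}$ the initial residual is $-\mathbf{g}$ (a scalar multiple of $\mathbf{g}$), and each CG step only adds multiples of $A$ applied to earlier directions, so the iterate can never leave the subspace.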

I am not able to see why $\mathbf{d}_{CG}$ lies in the span of $\left\{\mathbf{g},(\mathbf{H}+\lambda \mathbf{I}) \mathbf{g}, \ldots,(\mathbf{H}+\lambda \mathbf{I})^{K-1} \mathbf{g}\right\}$, i.e., how to show that $\mathbf{d}_{C G}$ lies in the Krylov subspace.

Thank you for any help.

Best Answer

Consider $Ax=b$ with $n\times n$ nonsingular $A$. There is a polynomial $p$ of degree $k$ ($k\leq n$) such that $p(A)=0$, namely the minimal polynomial of $A$, which we can write as $$ p(t)=\alpha_k+\alpha_{k-1}t+\cdots+\alpha_0t^k. $$ Note that $\alpha_k\neq 0$: if the constant term were zero, $t$ would divide $p$, so $0$ would be an eigenvalue of $A$ and $A$ would be singular.

Since $p(A)=0$, we have $$ \alpha_k I+\alpha_{k-1}A+\cdots+\alpha_0A^k=0. $$ Multiplying with $A^{-1}$, dividing by $\alpha_k$, and rearranging gives $$ A^{-1}=-\frac{1}{\alpha_k}\left(\alpha_{k-1}I+\cdots+\alpha_0A^{k-1}\right) $$ and hence $$ x=A^{-1}b=-\frac{1}{\alpha_k}\left(\alpha_{k-1}b+\cdots+\alpha_0A^{k-1}b\right)\in\mathrm{span}\{b,Ab,\ldots,A^{k-1}b\}. $$
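A quick numerical check of this identity (my own sketch, not part of the original answer; it uses the characteristic polynomial via `np.poly`, which also annihilates $A$ by Cayley–Hamilton and, like the minimal polynomial, has a nonzero constant term when $A$ is nonsingular):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))      # almost surely nonsingular
b = rng.standard_normal(n)

# Characteristic polynomial p(t) = alpha_0 t^n + ... + alpha_n, highest
# degree first, so alpha[j] is the coefficient of t^(n-j); p(A) = 0.
alpha = np.poly(A)

# Reconstruct x = A^{-1} b as a combination of b, A b, ..., A^{n-1} b,
# using A^{-1} = -(1/alpha_n) (alpha_{n-1} I + ... + alpha_0 A^{n-1}).
x = -(1 / alpha[n]) * sum(
    alpha[n - 1 - j] * np.linalg.matrix_power(A, j) @ b for j in range(n)
)
print(np.allclose(x, np.linalg.solve(A, b)))   # True
```

So the exact solution is a linear combination of the first $n$ Krylov vectors, which is the answer's point: after enough iterations, a Krylov-subspace method has room to represent $A^{-1}b$ exactly.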
