Shifted power iteration, while theoretically possible, is not very useful, since it converges to the eigenvalue farthest away from the shift $s$. The inverse (or shifted inverse) power iteration, on the other hand, yields the eigenvalue closest to $s$. This means that by picking appropriate
shifts $s$, any one eigenvalue of $A$ can be found.
Initialize $v^{(0)}$ with an arbitrary vector such that $||v^{(0)}||_{2} = 1$
for $k = 1, 2, \ldots$
Solve $Aw = v^{(k-1)}$ for $w$
$v^{(k)} = w/||w||_{2}$
$\lambda^{(k)} = [v^{(k)}]^{T}Av^{(k)}$
end
The inverse of $A$ never needs to be computed explicitly. Instead, in each iteration the next vector is obtained by solving the linear system $Aw = v^{(k-1)}$.
The same applies to shifted inverse power iteration, where one solves $(A - sI)w = v^{(k-1)}$ instead.
So inverse power iteration does not require computing the inverse of the matrix after all.
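The loop above can be sketched in a few lines of NumPy; the function name and the stopping test are my own choices, and a production implementation would factor $A$ once and reuse the factorization rather than call a dense solver in every sweep:

```python
import numpy as np

def inverse_power_iteration(A, tol=1e-10, max_iter=500):
    """Power method applied to A^{-1} without ever forming the inverse:
    each sweep solves A w = v^{(k-1)} for w instead of multiplying by A^{-1}."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)           # ||v^{(0)}||_2 = 1
    lam = np.inf
    for _ in range(max_iter):
        w = np.linalg.solve(A, v)    # solve A w = v^{(k-1)}
        v = w / np.linalg.norm(w)    # v^{(k)} = w / ||w||_2
        lam_new = v @ A @ v          # lambda^{(k)} = [v^{(k)}]^T A v^{(k)}
        if abs(lam_new - lam) < tol:
            return lam_new, v
        lam = lam_new
    return lam, v
```

For the shifted variant one would solve $(A - sI)w = v^{(k-1)}$ in place of the system above.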
We are given the matrix $A$ where
$$A = B^T B$$
and $B = L^{-1} R$ and $R$ is a tall matrix of full column rank. There is little difficulty associated with solving linear systems of the form
$$Ax = f.$$
If $B = US$ is a $QR$ factorization of $B$, then $S$ is nonsingular because $B$ has full rank. It follows that $$A = B^T B = S^T U^T U S = S^T S.$$ The problem of solving $Ax=f$ reduces to solving $$S^T y = f, \quad S x = y.$$
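As a concrete illustration, here is a minimal NumPy/SciPy sketch of that reduction, assuming $B$ is available explicitly (the function name is mine):

```python
import numpy as np
from scipy.linalg import solve_triangular

def solve_via_qr(B, f):
    """Solve (B^T B) x = f through the reduced QR factorization B = U S:
    A = S^T S, so solve S^T y = f, then S x = y (two triangular solves)."""
    _, S = np.linalg.qr(B, mode="reduced")  # S is upper triangular, nonsingular
    y = solve_triangular(S, f, trans="T")   # S^T y = f
    return solve_triangular(S, y)           # S x = y
```

Once the QR factorization of $B$ has been computed, each additional right-hand side costs only two triangular solves.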
In short, we have the ability to do the inverse iteration, i.e., apply the power method to $A^{-1}$.
Is there a mathematical justification for this observation?
The convergence of the power method is controlled by the ratio of the largest to the second largest eigenvalue. Suppose that $A$ is symmetric with eigenvalues $\lambda_j$ that have been ordered such that
$$ |\lambda_1| > |\lambda_2| \geq \cdots \geq |\lambda_{n-1}| \geq |\lambda_n|.$$
Let $v_j$ denote an eigenvector of $A$ with respect to $\lambda_j$ and let $x = \sum_{j=1}^n r_j v_j$ denote any starting vector. Then
$$ A^k x = \sum_{j=1}^n r_j \lambda_j^k v_j = \lambda_1^k \sum_{j=1}^n r_j \left(\frac{\lambda_j}{\lambda_1} \right)^k v_j = \lambda_1^k \left( r_1 v_1 + \sum_{j=2}^n r_j \left(\frac{\lambda_j}{\lambda_1} \right)^k v_j \right).$$
We see that if $r_1 \not = 0$ and if $|\lambda_1| \gg |\lambda_2|$, then $A^k x$ will converge rapidly towards an eigenvector of $A$ corresponding to the eigenvalue $\lambda_1$.
In the cited experiments the matrix $A$ is replaced with $A - \lambda_\max I$. In the "bad" case, $A$ is symmetric positive definite with eigenvalues $1$, $10^{-2}$ and $10^{-k}$ for $k \in \{3,5,7\}$. The power method functions well for the matrix $A$ because $$10^{-2} \ll 1.$$ The shift ruins everything because the shifted matrix $A-I$ has eigenvalues $0$, $-0.99$ and $-1+10^{-k}$. Convergence is slow because the ratio of the two largest magnitudes, $0.99/(1-10^{-k}) \approx 0.99$, is close to $1$.
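This behavior is easy to reproduce. The toy script below (the diagonal test matrix and iteration counts are my own choices) runs plain power iteration on $A$ and on the shifted matrix $A - I$ for $k = 5$:

```python
import numpy as np

def power_method(A, iters):
    """Plain power iteration with a fixed all-ones starting vector;
    returns the Rayleigh quotient after the given number of sweeps."""
    x = np.ones(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)
    return x @ A @ x

k = 5
A = np.diag([1.0, 1e-2, 10.0**-k])
# Unshifted: eigenvalue ratio 10^{-2}, so ten sweeps pin down lambda_max = 1.
lam_max = power_method(A, 10)
# Shifted: ratio 0.99/(1 - 10^{-k}) ~ 0.99, so ten sweeps barely move the
# estimate of the smallest eigenvalue 10^{-k} (shift undone by adding 1 back).
lam_min = power_method(A - np.eye(3), 10) + 1.0
```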
How can I modify this algorithm for better numerical stability?
The algorithm suffers from subtractive cancellation in the final operation that is supposed to correct for the initial shift. If we are limited to working precision, then there is nothing that can be done to recover from that.
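A two-line illustration of the cancellation (the numbers are invented for the sake of the example): even if the shifted iteration delivers its eigenvalue to full double precision, adding the shift back destroys most of the significant digits of a tiny result.

```python
# Suppose lambda_min(A) = 1e-12 and the shifted iteration returns mu, the
# dominant eigenvalue of A - lambda_max*I, accurate to working precision,
# i.e. with an absolute error on the order of 1e-16.
lam_max = 1.0
mu = -1.0 + 1e-12       # true value -1 + 1e-12, stored with ~1e-16 error
lam_min = lam_max + mu  # that ~1e-16 error is now a ~1e-4 *relative* error
```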
If the algorithm cannot be improved, can someone suggest another efficient algorithm?
If we want faster convergence, then there are at least three distinct choices
- The inverse iteration, i.e., the power method applied to $A^{-1}$.
- The inverse subspace iteration with Ritz acceleration, which is a generalization of the power method. There is a good discussion of this algorithm in the standard textbook "Matrix Computations" by Gene Golub and Charles Van Loan. They call it orthogonal iteration.
- The trace minimization algorithm (TraceMin) by Sameh and Wisniewski; see their original paper.
In all three cases, we do not have to explicitly form the matrix $A^{-1}$, but we need the ability to solve linear systems of the form $Ax=f$.
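For the second option, a rough dense sketch of inverse orthogonal (subspace) iteration with Ritz extraction might look as follows; the function name and parameters are mine, and again a serious implementation would factor $A$ once rather than call a dense solver inside the loop:

```python
import numpy as np

def inverse_subspace_iteration(A, p=2, iters=50):
    """Track the p smallest eigenvalues of a symmetric positive definite A.
    Each sweep solves the block system A W = V, re-orthonormalizes W, and
    finally extracts Ritz values from the projected matrix V^T A V."""
    rng = np.random.default_rng(0)
    V, _ = np.linalg.qr(rng.standard_normal((A.shape[0], p)))
    for _ in range(iters):
        W = np.linalg.solve(A, V)   # block of linear solves, no explicit A^{-1}
        V, _ = np.linalg.qr(W)      # re-orthonormalize the block
    return np.linalg.eigvalsh(V.T @ A @ V)  # Ritz values, ascending
```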
If no better algorithm exists, any lower bound for the smallest eigenvalue?
This is by far the hardest of the four questions. There is good convergence theory for the algorithms cited above, and you can get estimates from this theory and your intermediate results. It is an open problem to determine a good lower bound for the smallest eigenvalue of a general symmetric positive definite matrix using a reasonable amount of time, storage, and energy. If it could be done, then the practical implications would be enormous.
Best Answer
There is a really good exposition of the Arnoldi method given by Prof. Gilbert Strang in his video lectures on MIT OpenCourseWare.
Here is the link to the lecture where he describes the Arnoldi method:
http://ocw.mit.edu/courses/mathematics/18-086-mathematical-methods-for-engineers-ii-spring-2006/video-lectures/lecture-18-krylov-methods-multigrid-continued/