The starting vectors in most eigenvalue algorithms are initialized as a random vector with entries $X \sim N(0,1)$. If you look in Trefethen and Bau, you will note that most of the algorithms in numerical linear algebra are initialized randomly; the book says "arbitrary", but what it means is random. Some code follows after an explanation. The Lanczos algorithm belongs to a family of algorithms called Krylov methods, which are a specific type of iterative eigenvalue algorithm, as opposed to direct methods.
To explain how Lanczos works, I will first explain how Arnoldi works. Both eigenvalue methods start from a similarity transformation reducing $A$ to Hessenberg form, that is,
$$A = QHQ^{*} $$
or
$$AQ = QH $$
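As a quick numerical check of this full reduction (a sketch using MATLAB's built-in hess, with a random example matrix as my own assumption):

A = randn(6);          % example matrix, my own assumption
[Q, H] = hess(A);      % unitary Q and upper Hessenberg H with A = Q*H*Q'
norm(A - Q*H*Q')       % ~ machine precision
norm(tril(H, -2))      % zero: no entries below the first subdiagonal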
Instead of computing the full factorization, we compute $Q_{n}$, the $m \times n$ matrix whose columns are the first $n$ columns of $Q$.
Additionally, let $\tilde{H}_{n}$ be the $(n+1) \times n$ upper-left section of $H$. The book calls it a Hessenberg matrix, but since it is not square my professor called it "Hessenberg-ish". We end up with
$$ A Q_{n} = Q_{n+1} \tilde{H}_{n} $$
The idea is simple: we are actually building Krylov subspaces
$$ K_{n} = \langle b, Ab, \cdots, A^{n-1}b \rangle = \langle q_{1}, q_{2}, \cdots, q_{n} \rangle $$
We form these through the QR decomposition
$$ K_{n} = Q_{n}R_{n}$$
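For intuition only, one can form $K_{n}$ explicitly and extract the orthonormal basis via a thin QR (a sketch with assumed example data; note that forming $K_{n}$ explicitly is numerically poor, which is exactly why Arnoldi builds $Q_{n}$ directly):

A = randn(8); b = randn(8,1);    % example data, my own assumption
nk = 4;                          % Krylov subspace dimension
K = zeros(8, nk);
K(:,1) = b;
for j = 2:nk
    K(:,j) = A * K(:,j-1);       % next Krylov vector A^(j-1)*b
end
[Qn, Rn] = qr(K, 0);             % thin QR: K_n = Q_n * R_n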
Now, $\tilde{H}_{n}$ has one extra row. To remove it, note that $Q_{n}^{*}Q_{n+1}$ is the $n \times (n+1)$ identity (the identity with one additional zero column), so $Q_{n}^{*}Q_{n+1}\tilde{H}_{n} = H_{n}$ is the $n \times n$ Hessenberg matrix obtained by deleting the last row of $\tilde{H}_{n}$. Equivalently,
$$ H_{n} = Q_{n}^{*}AQ_{n} $$
As you can see, we are iteratively building projections of $A$ onto these Krylov subspaces.
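Here is a minimal Arnoldi sketch (my own illustration, not Trefethen's code) that builds $Q_{n+1}$ and the $(n+1) \times n$ matrix $\tilde{H}_{n}$:

function [Q, Htilde] = myArnoldi(A, b, n)
% Minimal Arnoldi iteration: builds Q (m x (n+1)) with orthonormal columns
% and Htilde ((n+1) x n) such that A*Q(:,1:n) = Q*Htilde.
    m = size(A, 1);
    Q = zeros(m, n+1);
    Htilde = zeros(n+1, n);
    Q(:,1) = b / norm(b);                 % q_1 from the starting vector b
    for j = 1:n
        v = A * Q(:,j);                   % expand the Krylov subspace
        for i = 1:j                       % modified Gram-Schmidt orthogonalization
            Htilde(i,j) = Q(:,i)' * v;
            v = v - Htilde(i,j) * Q(:,i);
        end
        Htilde(j+1,j) = norm(v);
        Q(:,j+1) = v / Htilde(j+1,j);     % next orthonormal basis vector q_{j+1}
    end
end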
The Lanczos iteration is simply the Arnoldi iteration in the case where the matrix is symmetric (or Hermitian), in which case $H_{n}$ becomes tridiagonal.
function [T varargout] = myLanczos(mat,varargin)
%--------------------------------------------------------------------------
% Syntax: T = myLanczos(mat);
% T = myLanczos(mat,k);
% T = myLanczos(mat,v);
% T = myLanczos(mat,k,v);
% [T Q] = myLanczos(mat);
% [T Q] = myLanczos(mat,k);
% [T Q] = myLanczos(mat,v);
% [T Q] = myLanczos(mat,k,v);
%
% Inputs: mat is a symmetric or Hermitian N x N matrix
%
% k is the number of Lanczos iterations to perform. The
% default value is N
%
% v is an N x 1 vector (internally forced to have unit norm)
% specifying a particular starting vector to use. The default
% vector is chosen uniformly at random from the unit sphere
%
% Outputs: T is a k x k symmetric tridiagonal matrix. When k < N, T is
% approximately similar to mat up to rank k. When k = N, T is
% similar (in the linear algebra sense) to mat. That is, when
% k = N, T has identical eigenvalues to mat
%
% Q is the N x k similarity transformation such that
% mat = Q * T * Q'. Note that equality holds only when k = N.
% For k < N, mat ~ Q * T * Q'
%
% Description: This function uses the Lanczos algorithm with full
% reorthogonalization to compute k x k symmetric tridiagonal
% matrix T that approximates mat up to rank k with respect to
% transformation Q. That is, mat = Q * T * Q'. Note that
% equality holds only for k = N. For k < N, mat ~ Q * T * Q'
%
% NOTE: In particular, when k = N, T has identical
% eigenvalues to mat
%
% Author: Brian Moore
% [email protected]
%
% Date: July 5, 2013
%--------------------------------------------------------------------------
% Knobs
symTol = 1e-8;
% Check input matrix size
[m,n] = size(mat);
if (m ~= n)
error('Input matrix must be square');
end
% Make sure input matrix is symmetric
if max(max(abs(mat - mat'))) > symTol
error('Input matrix is not symmetric to working precision');
end
% Parse user inputs
if (nargin == 2)
if (length(varargin{1}) == n)
k = n;
v = varargin{1};
else
k = varargin{1};
v = randn(n,1);
end
elseif (nargin == 3)
k = varargin{1};
v = varargin{2};
else
k = n;
v = randn(n,1);
end
% Initialize variables
Q = nan(n,k);
q = v / norm(v);
Q(:,1) = q;
d = nan(k,1);
od = nan(k-1,1);
% Perform Lanczos iterations
for i = 1:k
z = mat * q;
d(i) = q' * z;
%----------------------------------------------
% Uncomment only one of the following 3 options
%----------------------------------------------
% Option 1: Full re-orthogonalization
%z = z - Q(:,1:i) * (Q(:,1:i)' * z);
% Option 2: Full re-orthogonalization (x2)
z = z - Q(:,1:i) * (Q(:,1:i)' * z);
z = z - Q(:,1:i) * (Q(:,1:i)' * z);
% Option 3: No re-orthogonalization
%z = z - d(i) * q;
%if (i > 1)
% z = z - od(i-1) * Q(:,i-1);
%end
%----------------------------------------------
if (i ~= k)
od(i) = norm(z);
q = z / od(i);
Q(:,i + 1) = q;
end
end
% Construct T
T = diag(d) + diag(od,-1) + diag(od,1);
% Return user-requested information
if (nargout == 2)
varargout{1} = Q;
end
Note that if v is not specified, it is chosen randomly by these lines:
else
k = n;
v = randn(n,1);
end
% Initialize variables
Q = nan(n,k);
q = v / norm(v);
From the normalized random starting vector it constructs the first orthonormal basis vector $q_{1}$, and each subsequent basis vector is built from it.
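As a quick sanity check of the claim that T has the same eigenvalues as mat when k = N (a sketch assuming myLanczos above is on the path, with a random symmetric test matrix of my own choosing):

N = 10;
B = randn(N); mat = (B + B')/2;       % random symmetric test matrix
[T, Q] = myLanczos(mat);              % full N-step Lanczos (k defaults to N)
norm(sort(eig(T)) - sort(eig(mat)))   % ~ machine precision
norm(mat - Q*T*Q')                    % similarity holds exactly when k = N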
Best Answer
At first I describe the connection of the QR algorithm with the power method and the inverse power method. As I understand it, this is the main topic you are interested in. Later, when I have a little more time, I will extend this to subspace iteration; I think that is what the second part of your question is about.
To keep the discussion simple, let $A\in\mathbb{C}^{n\times n}$ be a nonsingular complex square matrix whose eigenvalues have pairwise distinct absolute values.
The QR algorithm without shift is defined by the iteration \begin{align*} &\text{Start}&A_1 &:= A\\ &\text{QR-decomposition}& Q_i R_i &:= A_i &&@i=1,\ldots\\ &\text{rearranged new iterate}&A_{i+1} &:= R_i Q_i \end{align*} Representing $R_i$ as $R_i = Q_i^H A_i$ and substituting this into the formula for $A_{i+1}$ gives $A_{i+1} = Q_i^H A_i Q_i$. Thus, the matrix $A_{i+1}$ is similar to $A_i$ and has the same eigenvalues.

Defining the combined orthogonal transformation $\bar Q_i := Q_1\cdot\ldots\cdot Q_i$ for $i=1,\ldots$ we obtain \begin{align*} A_{i+1} &= \bar Q_i^H A \bar Q_i&&@ i=1,\ldots \end{align*} or $A_i = \bar Q^H_{i-1} A \bar Q_{i-1}$ for $i=2,\ldots$. We substitute $A_i$ in the above QR-decomposition with this formula and obtain \begin{align*} Q_i R_i &= A_i = \bar Q_{i-1}^H A \bar Q_{i-1}\\ \bar Q_i R_i &= A \bar Q_{i-1}, \end{align*} using the definition $\bar Q_i = \bar Q_{i-1} Q_i$ in the second equality.

Taking the first column on both sides of the last equation one gets \begin{align*} \bar Q_i(:,1)\cdot R_i(1,1) = A\cdot \bar Q_{i-1}(:,1) \end{align*} (with Octave notation for the matrix elements), which shows that $R_i(1,1)$ and $\bar Q_i(:,1)$ are just the approximations for the eigenvalue and the eigenvector, respectively, in the power iteration for the matrix $A$. The elements $R_i(1,1)$ with $i=1,\ldots$ converge to the largest of the eigenvalues whose eigenvectors belong to the invariant subspace of $A$ spanned by the vectors $A^k Q_1(:,1)$ with $k=0,\ldots,n-1$.
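To see this power-method behaviour numerically, here is a small sketch of the unshifted QR iteration (my own illustration; the symmetric positive definite test matrix is an assumption made so the eigenvalues are real and positive):

n = 5;
B = randn(n); A = B'*B + eye(n);   % test matrix, my own assumption
Ai = A;                            % A_1 := A
Qbar = eye(n);
for i = 1:100
    [Qi, Ri] = qr(Ai);             % QR-decomposition  Q_i R_i := A_i
    Ai = Ri * Qi;                  % rearranged new iterate A_{i+1} := R_i Q_i
    Qbar = Qbar * Qi;              % accumulated transformation Qbar_i
end
abs(Ri(1,1))                       % ~ max(abs(eig(A))), the power-method eigenvalue
Qbar(:,1)                          % ~ dominant eigenvector (up to sign)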
Analogously to the above procedure we use the formula for the rearranged new iterate of the QR algorithm to get \begin{align*} \bar Q_i^H A \bar Q_i &= A_{i+1} = R_i Q_i\\ \bar Q_i^H A &= R_i \bar Q_{i-1}^H \end{align*} Taking the last row of the last equation gives \begin{align*} \bar Q_i^H(n,:) \cdot A &= R_i(n,n)\cdot \bar Q^H_{i-1}(n,:). \end{align*} Theoretically, one can calculate the sequences $\bar Q_i^H(n,:)$ and $R_i(n,n)$ for $i=2,\ldots$ from $\bar Q_1^H(n,:)=Q_1^H(n,:)$ by the inverse power iteration \begin{align*} &\text{Solve}& x\cdot A &= \bar Q_{i-1}^H(n,:)&&\text{for }x\in\mathbb{C}^{1,n}\\ &\text{Normalize} & \bar Q_i^H(n,:) &:= \frac{x}{|x|}\\ &\text{Set}& R_i(n,n) &:= \frac{1}{|x|} \end{align*} where $R_i(n,n)$ are the approximations for the smallest of the eigenvalues of $A$ with corresponding left-eigenvector in the subspace spanned by the vectors $Q_1^H(n,:)\cdot A^k$ with $k=0,\ldots,n-1$.
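A matching numerical sketch of this row-form inverse power iteration (again with an assumed symmetric positive definite test matrix):

n = 5;
B = randn(n); A = B'*B + eye(n);    % test matrix, my own assumption
x = randn(1, n); x = x / norm(x);   % row vector playing the role of Q_1^H(n,:)
for i = 1:100
    x = x / A;                      % solve x_new * A = x_old (right division)
    R_nn = 1 / norm(x);             % current approximation of the smallest eigenvalue
    x = x * R_nn;                   % normalize: x := x / |x|
end
R_nn                                % ~ min(abs(eig(A)))
x                                   % ~ corresponding left-eigenvector (up to sign)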