The starting vectors in most eigenvalue algorithms are initialized as a random vector with entries $X \sim N(0,1)$. If you look in Trefethen and Bau, you will note that most of the algorithms in numerical linear algebra are initialized randomly; the book says "arbitrary", but what it means is random. Some code follows after an explanation. The Lanczos algorithm belongs to a family of algorithms called Krylov methods, which are a specific type of iterative eigenvalue algorithm, as opposed to direct methods.
To explain how Lanczos works, I will first explain how Arnoldi works. Both eigenvalue methods start from a similarity transformation reducing $A$ to Hessenberg form, that is,
$$A = QHQ^{*} $$
or
$$AQ = QH $$
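As a quick numerical check of this full reduction (a sketch using MATLAB's built-in hess, with a random example matrix as my own assumption):

A = randn(6);          % example matrix, my own assumption
[Q, H] = hess(A);      % unitary Q and upper Hessenberg H with A = Q*H*Q'
norm(A - Q*H*Q')       % ~ machine precision
norm(tril(H, -2))      % zero: no entries below the first subdiagonal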
Instead of computing the full factorization, we compute $Q_{n}$, the $m \times n$ matrix whose columns are the first $n$ columns of $Q$.
Additionally, let $\tilde{H}_{n}$ be the $(n+1) \times n$ upper-left section of $H$. The book calls it a Hessenberg matrix, but since it is not square my professor called it "Hessenberg-ish". We end up with
$$ A Q_{n} = Q_{n+1} \tilde{H}_{n} $$
The idea is simple: we are actually building Krylov subspaces
$$ K_{n} = \langle b, Ab, \cdots, A^{n-1}b \rangle = \langle q_{1}, q_{2}, \cdots, q_{n} \rangle $$
We form these through the QR decomposition
$$ K_{n} = Q_{n}R_{n}$$
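For intuition only, one can form $K_{n}$ explicitly and extract the orthonormal basis via a thin QR (a sketch with assumed example data; note that forming $K_{n}$ explicitly is numerically poor, which is exactly why Arnoldi builds $Q_{n}$ directly):

A = randn(8); b = randn(8,1);    % example data, my own assumption
nk = 4;                          % Krylov subspace dimension
K = zeros(8, nk);
K(:,1) = b;
for j = 2:nk
    K(:,j) = A * K(:,j-1);       % next Krylov vector A^(j-1)*b
end
[Qn, Rn] = qr(K, 0);             % thin QR: K_n = Q_n * R_n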
Now, $\tilde{H}_{n}$ has one extra row. To remove it, note that $Q_{n}^{*}Q_{n+1}$ is the $n \times (n+1)$ identity (the identity with one additional zero column), so $Q_{n}^{*}Q_{n+1}\tilde{H}_{n} = H_{n}$ is the $n \times n$ Hessenberg matrix obtained by deleting the last row of $\tilde{H}_{n}$. Equivalently,
$$ H_{n} = Q_{n}^{*}AQ_{n} $$
As you can see, we are iteratively building projections of $A$ onto these Krylov subspaces.
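Here is a minimal Arnoldi sketch (my own illustration, not Trefethen's code) that builds $Q_{n+1}$ and the $(n+1) \times n$ matrix $\tilde{H}_{n}$:

function [Q, Htilde] = myArnoldi(A, b, n)
% Minimal Arnoldi iteration: builds Q (m x (n+1)) with orthonormal columns
% and Htilde ((n+1) x n) such that A*Q(:,1:n) = Q*Htilde.
    m = size(A, 1);
    Q = zeros(m, n+1);
    Htilde = zeros(n+1, n);
    Q(:,1) = b / norm(b);                 % q_1 from the starting vector b
    for j = 1:n
        v = A * Q(:,j);                   % expand the Krylov subspace
        for i = 1:j                       % modified Gram-Schmidt orthogonalization
            Htilde(i,j) = Q(:,i)' * v;
            v = v - Htilde(i,j) * Q(:,i);
        end
        Htilde(j+1,j) = norm(v);
        Q(:,j+1) = v / Htilde(j+1,j);     % next orthonormal basis vector q_{j+1}
    end
end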
The Lanczos iteration is simply the Arnoldi iteration in the case where the matrix is symmetric (or Hermitian), in which case $H_{n}$ becomes tridiagonal.
function [T varargout] = myLanczos(mat,varargin)
%--------------------------------------------------------------------------
% Syntax: T = myLanczos(mat);
% T = myLanczos(mat,k);
% T = myLanczos(mat,v);
% T = myLanczos(mat,k,v);
% [T Q] = myLanczos(mat);
% [T Q] = myLanczos(mat,k);
% [T Q] = myLanczos(mat,v);
% [T Q] = myLanczos(mat,k,v);
%
% Inputs: mat is a symmetric or Hermitian N x N matrix
%
% k is the number of Lanczos iterations to perform. The
% default value is N
%
% v is an N x 1 vector (internally forced to have unit norm)
% specifying a particular starting vector to use. The default
% vector is chosen uniformly at random from the unit sphere
%
% Outputs: T is a k x k symmetric tridiagonal matrix. When k < N, T is
% approximately similar to mat up to rank k. When k = N, T is
% similar (in the linear algebra sense) to mat. That is, when
% k = N, T has identical eigenvalues to mat
%
% Q is the N x k similarity transformation such that
% mat = Q * T * Q'. Note that equality holds only when k = N.
% For k < N, mat ~ Q * T * Q'
%
% Description: This function uses the Lanczos algorithm with full
% reorthogonalization to compute k x k symmetric tridiagonal
% matrix T that approximates mat up to rank k with respect to
% transformation Q. That is, mat = Q * T * Q'. Note that
% equality holds only for k = N. For k < N, mat ~ Q * T * Q'
%
% NOTE: In particular, when k = N, T has identical
% eigenvalues to mat
%
% Author: Brian Moore
% [email protected]
%
% Date: July 5, 2013
%--------------------------------------------------------------------------
% Knobs
symTol = 1e-8;
% Check input matrix size
[m,n] = size(mat);
if (m ~= n)
error('Input matrix must be square');
end
% Make sure input matrix is symmetric
if max(max(abs(mat - mat'))) > symTol
error('Input matrix is not symmetric to working precision');
end
% Parse user inputs
if (nargin == 2)
if (length(varargin{1}) == n)
k = n;
v = varargin{1};
else
k = varargin{1};
v = randn(n,1);
end
elseif (nargin == 3)
k = varargin{1};
v = varargin{2};
else
k = n;
v = randn(n,1);
end
% Initialize variables
Q = nan(n,k);
q = v / norm(v);
Q(:,1) = q;
d = nan(k,1);
od = nan(k-1,1);
% Perform Lanczos iterations
for i = 1:k
z = mat * q;
d(i) = q' * z;
%----------------------------------------------
% Uncomment only one of the following 3 options
%----------------------------------------------
% Option 1: Full re-orthogonalization
%z = z - Q(:,1:i) * (Q(:,1:i)' * z);
% Option 2: Full re-orthogonalization (x2)
z = z - Q(:,1:i) * (Q(:,1:i)' * z);
z = z - Q(:,1:i) * (Q(:,1:i)' * z);
% Option 3: No re-orthogonalization
%z = z - d(i) * q;
%if (i > 1)
% z = z - od(i-1) * Q(:,i-1);
%end
%----------------------------------------------
if (i ~= k)
od(i) = norm(z);
q = z / od(i);
Q(:,i + 1) = q;
end
end
% Construct T
T = diag(d) + diag(od,-1) + diag(od,1);
% Return user-requested information
if (nargout == 2)
varargout{1} = Q;
end
Note that if v is not specified, it is chosen randomly by these lines:
else
k = n;
v = randn(n,1);
end
% Initialize variables
Q = nan(n,k);
q = v / norm(v);
From the normalized random starting vector it constructs the first orthonormal basis vector $q_{1}$, and each subsequent basis vector is built from it.
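As a quick sanity check of the claim that T has the same eigenvalues as mat when k = N (a sketch assuming myLanczos above is on the path, with a random symmetric test matrix of my own choosing):

N = 10;
B = randn(N); mat = (B + B')/2;       % random symmetric test matrix
[T, Q] = myLanczos(mat);              % full N-step Lanczos (k defaults to N)
norm(sort(eig(T)) - sort(eig(mat)))   % ~ machine precision
norm(mat - Q*T*Q')                    % similarity holds exactly when k = N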
Best Answer
At first I describe the connection of the QR algorithm with the power method and the inverse power method. As I understand it, this is the main topic you are interested in. Later, when I have a little more time, I will extend this to subspace iteration; I think that is what the second part of your question is about.
To keep the discussion simple, let $A\in\mathbb{C}^{n\times n}$ be a nonsingular complex square matrix whose eigenvalues have pairwise distinct absolute values.
The QR algorithm without shift is defined by the iteration \begin{align*} &\text{Start}&A_1 &:= A\\ &\text{QR-decomposition}& Q_i R_i &:= A_i &&@i=1,\ldots\\ &\text{rearranged new iterate}&A_{i+1} &:= R_i Q_i \end{align*} Representing $R_i$ as $R_i = Q_i^H A_i$ and substituting this into the formula for $A_{i+1}$ gives $A_{i+1} = Q_i^H A_i Q_i$. Thus, the matrix $A_{i+1}$ is similar to $A_i$ and has the same eigenvalues.

Defining the combined orthogonal transformation $\bar Q_i := Q_1\cdot\ldots\cdot Q_i$ for $i=1,\ldots$ we obtain \begin{align*} A_{i+1} &= \bar Q_i^H A \bar Q_i&&@ i=1,\ldots \end{align*} or $A_i = \bar Q^H_{i-1} A \bar Q_{i-1}$ for $i=2,\ldots$. We substitute $A_i$ in the above QR-decomposition with this formula and obtain \begin{align*} Q_i R_i &= A_i = \bar Q_{i-1}^H A \bar Q_{i-1}\\ \bar Q_i R_i &= A \bar Q_{i-1}, \end{align*} using the definition $\bar Q_i = \bar Q_{i-1} Q_i$ in the second equality.

Taking the first column on both sides of the last equation one gets \begin{align*} \bar Q_i(:,1)\cdot R_i(1,1) = A\cdot \bar Q_{i-1}(:,1) \end{align*} (with Octave notation for the matrix elements), which shows that $R_i(1,1)$ and $\bar Q_i(:,1)$ are just the approximations for the eigenvalue and the eigenvector, respectively, in the power iteration for the matrix $A$. The elements $R_i(1,1)$ with $i=1,\ldots$ converge to the largest of the eigenvalues whose eigenvectors belong to the invariant subspace of $A$ spanned by the vectors $A^k Q_1(:,1)$ with $k=0,\ldots,n-1$.
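To see this power-method behaviour numerically, here is a small sketch of the unshifted QR iteration (my own illustration; the symmetric positive definite test matrix is an assumption made so the eigenvalues are real and positive):

n = 5;
B = randn(n); A = B'*B + eye(n);   % test matrix, my own assumption
Ai = A;                            % A_1 := A
Qbar = eye(n);
for i = 1:100
    [Qi, Ri] = qr(Ai);             % QR-decomposition  Q_i R_i := A_i
    Ai = Ri * Qi;                  % rearranged new iterate A_{i+1} := R_i Q_i
    Qbar = Qbar * Qi;              % accumulated transformation Qbar_i
end
abs(Ri(1,1))                       % ~ max(abs(eig(A))), the power-method eigenvalue
Qbar(:,1)                          % ~ dominant eigenvector (up to sign)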
Analogously to the above procedure we use the formula for the rearranged new iterate of the QR algorithm to get \begin{align*} \bar Q_i^H A \bar Q_i &= A_{i+1} = R_i Q_i\\ \bar Q_i^H A &= R_i \bar Q_{i-1}^H \end{align*} Taking the last row of the last equation gives \begin{align*} \bar Q_i^H(n,:) \cdot A &= R_i(n,n)\cdot \bar Q^H_{i-1}(n,:). \end{align*} Theoretically, one can calculate the sequences $\bar Q_i^H(n,:)$ and $R_i(n,n)$ for $i=2,\ldots$ from $\bar Q_1^H(n,:)=Q_1^H(n,:)$ by the inverse power iteration \begin{align*} &\text{Solve}& x\cdot A &= \bar Q_{i-1}^H(n,:)&&\text{for }x\in\mathbb{C}^{1,n}\\ &\text{Normalize} & \bar Q_i^H(n,:) &:= \frac{x}{|x|}\\ &\text{Set}& R_i(n,n) &:= \frac{1}{|x|} \end{align*} where $R_i(n,n)$ are the approximations for the smallest of the eigenvalues of $A$ with corresponding left-eigenvector in the subspace spanned by the vectors $Q_1^H(n,:)\cdot A^k$ with $k=0,\ldots,n-1$.
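A matching numerical sketch of this row-form inverse power iteration (again with an assumed symmetric positive definite test matrix):

n = 5;
B = randn(n); A = B'*B + eye(n);    % test matrix, my own assumption
x = randn(1, n); x = x / norm(x);   % row vector playing the role of Q_1^H(n,:)
for i = 1:100
    x = x / A;                      % solve x_new * A = x_old (right division)
    R_nn = 1 / norm(x);             % current approximation of the smallest eigenvalue
    x = x * R_nn;                   % normalize: x := x / |x|
end
R_nn                                % ~ min(abs(eig(A)))
x                                   % ~ corresponding left-eigenvector (up to sign)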