[Math] PageRank (power iteration method) convergence rate

linear algebramarkov chainsnumerical linear algebra

I could not get my head around the idea that the second eigenvalue is the convergence rate.

Since the matrix in this application is a Markov matrix (rows/columns sum to one), the largest eigenvalue is 1 and the absolute value of the rest are less than one. Therefore, as the power increases all eigenvalues less than one vanish and the corresponding eigenvectors disappear.

So how can the second eigenvalue alone can represent the convergence rate of the algorithm?

Best Answer

The idea is this, you start with a vector $x$ and write it out in the eigenvalue basis of the PageRank matrix. Suppose that $1=\lambda_{1}> |\lambda_2|\geq ...\geq |\lambda_{n}|$. We write

$$x=a_1 v_{1} + a_2 v_{2} +... + a_{n}v_{n}$$

Now multiplying by $A$ (both sides $k$ times) we have:

\begin{align*} A^{k}x &= \lambda_{1}^{k} a_{1}v_{1} + \lambda_{2}^{k}a_{2}v_{2} + ... + \lambda_{n}^{k}a_{n}v_{n} \\ &= a_{1}v_{1} + \lambda_{2}^{k}a_{2}v_{2} + ... + \lambda_{n}^{k}a_{n}v_{n} \\ \end{align*}

Note that the tail of that sum is going to zero since \begin{align*} |\lambda_{2}^{k}a_{2}v_{2} + ... + \lambda_{n}^{k}a_{n}v_{n}| &\leq |\lambda_{2}|^{k}\cdot|a_{n}v_{2}| + |\lambda_{3}|^{k}\cdot|a_{3}v_{3}|+...+|\lambda_{n}|^{k}\cdot|a_{n}v_{n}| \\ &\leq |\lambda_{2}|^{k}\cdot (|a_{2}v_2| + ... + |a_{n}v_{n}|) \end{align*} Everything in parenthesis on the end there is a constant but the $|\lambda_{2}|^{k}$ is going to zero since $|\lambda_{2}|<1$. We cannot replace $\lambda_{2}$ with another eigenvalue because it is the largest (in absolute value) that is smaller than 1.

What's left after the tail goes to zero is a constant times the largest eigenvector (the thing PageRank is trying to compute). A good article that explains how PageRank works (and why the iterative method works) is: http://www.ams.org/samplings/feature-column/fcarc-pagerank It is a well-written piece that is easy to read.