[Math] The relationship between eigenvectors and computing PageRank

eigenvalues-eigenvectors, fixed-point-theorems, markov-chains, matrices, random-walk

I have read several papers about PageRank. I didn't get stuck on the idea of PageRank, because it is simple, but I did get stuck on computing it. Those papers use mathematical topics and terms that were not used in Page and Brin's paper that introduced the PageRank formula, "The Anatomy of a Large-Scale Hypertextual Web Search Engine". For example, their paper mentions nothing about Markov chains, fixed-point theorems, the power method, random walks, stochastic matrices, strongly connected graphs, etc. They were clear and simple about how they solved the PageRank system; here is a quote from their paper:

"PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web."

Actually, I didn't understand this quotation from their paper, and here are my questions:

  • What is the meaning of this quotation?
  • What is the relationship between eigenvectors and computing PageRank?
  • How can eigenvector solve a linear system?
  • Why not use the traditional methods for solving a linear system?

My simple understanding of how to compute PageRank is this: start by giving every page a PR value equal to 1 – or give some pages PR values that sum to one – and then divide each page's value among all the pages it links to. After the first step, each page has an updated PR value received from the pages linking to it; use those values in the next step to get updated PR values for all pages again, and repeat until the difference between the PR values of the previous step and the current step is approximately zero. Intuitively, it seems clear that repeating this quickly converges to some values, because PR is a probability between 0 and 1. But the problem with my account is that it is not a scientific, proven argument – it is just intuition and guesswork – and unfortunately it doesn't contain a single word about eigenvectors.
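For concreteness, the iterative procedure described above can be sketched in plain Python. The three-page link structure used here is a hypothetical toy example chosen for illustration:

```python
# Naive iterative PageRank as described above: start from a uniform
# distribution, repeatedly pass each page's PR to the pages it links to,
# and stop when successive PR vectors barely change.

# Hypothetical toy web: page -> list of pages it links to
links = {0: [1], 1: [0, 2], 2: [0, 1, 2]}
n = len(links)
pr = [1.0 / n] * n  # initial PR values summing to 1

for _ in range(1000):
    new_pr = [0.0] * n
    for page, outlinks in links.items():
        share = pr[page] / len(outlinks)  # split this page's PR among its outlinks
        for target in outlinks:
            new_pr[target] += share
    diff = sum(abs(a - b) for a, b in zip(new_pr, pr))
    pr = new_pr
    if diff < 1e-10:  # PR values have stopped changing
        break

print(pr)  # converged PR values, still summing to 1
```

As the answer below makes precise, this loop is exactly power iteration on a stochastic matrix, which is why it converges.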

Best Answer

Remark: You can consider the following answer as a comment, rather than an actual answer.

Let me give you a simple example of how PageRank (in its initial form) works. Suppose that we have a web consisting of $3$ webpages, with links between them. For our example, let's say that we have the adjacency matrix

$$L = \begin{bmatrix} 0 & 1 & 0\\ 1& 0& 1 \\ 1 & 1 & 1 \end{bmatrix}.$$ Notice that $L_{ij} = \begin{cases} 1, &i\rightarrow j \\ 0, & \text{otherwise}. \end{cases}$

We create the stochastic matrix $P$ from $L$ by dividing the elements of each row by the number of outlinks of the corresponding page. Thus, in our example, we have: $$P =\begin{bmatrix} 0 & 1 & 0\\ \frac 12 & 0 & \frac 12 \\ \frac 13 & \frac 13 & \frac 13 \end{bmatrix}.$$
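This row normalization is a one-liner in NumPy; here is a sketch using the adjacency matrix from the example:

```python
import numpy as np

# Adjacency matrix L from above: L[i][j] = 1 iff page i links to page j.
L = np.array([[0, 1, 0],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)

# Row-normalize: divide each row by its number of outlinks to obtain the
# row-stochastic matrix P. (keepdims=True keeps the row sums as a column
# so broadcasting divides each row by its own sum.)
P = L / L.sum(axis=1, keepdims=True)

print(P)  # each row of P now sums to 1, so P is a valid transition matrix
```

Note that this only works because every page here has at least one outlink; a page with no outlinks (a "dangling node") would produce a zero row, which the full PageRank model has to patch separately.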

Matrix $P$ is a stochastic matrix, which in our case is also irreducible and aperiodic (definitions of irreducibility and aperiodicity can be found here). Thus, there is a normalized stationary distribution $\pi$ satisfying the condition $\pi P =\pi$, which means $\pi$ is a left eigenvector associated with the dominant eigenvalue $\lambda_1=1$ of the matrix $P$. This dominant eigenvector is the PageRank vector. By the Perron-Frobenius theorem for non-negative, irreducible matrices, we know that $\lambda_1 =1$ is a simple eigenvalue and the associated eigenvector is the only one with all-positive elements.


Now, how can we approximate $\pi$? One way to do so is through power iteration. How does power iteration work? It computes the dominant eigenvector of a matrix (under some assumptions), starting from (almost) any vector $x^{(0)}$ and applying the successive iterations $x^{(k+1)T} = x^{(k)T} P$. When $$\|x^{(k+1)T} - x^{(k)T}\|_1 \le \tau =10^{-\ell},$$ we can say that we have approximated our dominant eigenvector $\pi$ with very good accuracy (taking $\ell =8,9,\ldots$).
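Power iteration with this $1$-norm stopping criterion takes only a few lines; a sketch in NumPy, again with the $P$ from the example:

```python
import numpy as np

P = np.array([[0, 1, 0],
              [0.5, 0, 0.5],
              [1/3, 1/3, 1/3]])

def power_iteration(P, tol=1e-9, max_iter=1000):
    # Start from the uniform distribution (almost any start vector works).
    x = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        x_next = x @ P  # one iteration: x^{(k+1)T} = x^{(k)T} P
        if np.abs(x_next - x).sum() <= tol:  # 1-norm stopping criterion
            return x_next
        x = x_next
    return x

pi = power_iteration(P)
print(pi)  # approximately [0.3, 0.4, 0.3]
```

Because $x$ starts as a probability distribution and $P$ is row-stochastic, every iterate remains a probability distribution, so no renormalization step is needed here.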


Certainly, there are many other ways to compute the dominant eigenvector. So, why the power method? The answer is simple: it is the fastest algorithm when we have to deal with very large-scale webs, where the adjacency matrix is extremely sparse, since each iteration requires only a sparse matrix-vector product.


If you are interested in PageRank, I can recommend a very good book (also available as an e-book):

A. Langville & C. Meyer, "Google's PageRank and Beyond: The Science of Search Engine Rankings", Princeton University Press.
