[Math] Dominant eigenvector by looking at rows of matrix raised to a power

eigenvalues-eigenvectors, linear-algebra, matrices

I'm not strong in linear algebra. I encountered this phenomenon and, being curious, I want to know a bit more about it. I'm playing with 3×3 real-valued matrices in a graphics application I'm developing. I can use the power method to find the dominant eigenvector. However, I found that I can sometimes raise a given matrix $A$ to a power $A^k$ and read off the dominant eigenvector from any row. As far as I've tried it, it converges much faster than the power method, but it does not always work. So I'm curious whether anything more precise can be said about it. Are there properties of the matrix $A$ that make this work?

Edit: Here's an example:

$$
M = \begin{bmatrix}
3 & 2 & 6 \\
2 & 2 & 5 \\
6 & 5 & 4
\end{bmatrix}
$$
Using my approach of raising the matrix to a power:
$$
\frac{1}{\min(M^{20})} M^{20} = \begin{bmatrix}
1.50758857 & 1.22783898 & 1.87589122 \\
1.22783898 & 1.00000000 & 1.52779904 \\
1.87589122 & 1.52779904 & 2.33416991
\end{bmatrix} \quad (*)
$$

Using the power method starting with $\left[ 1, 1, 1 \right]$ and 20 iterations gives:

$$
\begin{bmatrix} 0.55800727 \\ 0.45446290 \\ 0.69432799 \end{bmatrix}
$$

SciPy's answer is the above result multiplied by $-1$:

$$
\begin{bmatrix} -0.55800727 \\ -0.45446290 \\ -0.69432799 \end{bmatrix}
$$

If we multiply the rows of $(*)$ by 0.37013233, 0.45446290, and 0.29746249 respectively, we get (up to rounding) the same vector as with the power method.
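
For reference, here is a small GNU Octave sketch that reproduces the numbers above (only the matrix $M$, 20 iterations, and the quoted scaling factors are used; nothing else is assumed):

  M = [3,2,6;2,2,5;6,5,4];
  P = M^20;
  S = P / min(P(:))                 % reproduces (*)

  v = [1;1;1];                      % power method, 20 iterations
  for i = 1:20
    v = M*v;
    v = v / norm(v);
  endfor
  v                                 % approx [0.558; 0.454; 0.694]

  f = [0.37013233; 0.45446290; 0.29746249];
  S .* f                            % row i of (*) scaled by f(i); every row approximates v'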

However, I was wrong about convergence: it is not the case that it converges faster, as you can see above; I had to raise $M$ to the 20th power to get this accuracy. I guess I had a bug in my code. I was trying to optimize for speed and left out the normalization step in the power method (only applying it after 10 iterations), which probably changed the convergence and led me to believe that raising the matrix converges faster. There's also the comment below that you need to take the initial vector into account.

I found the Wikipedia article about stochastic matrices. They are guaranteed to have an eigenvalue of $1$, and as you raise the matrix to a power, in the limit each row converges to a dominant eigenvector. However, I think it's highly improbable that my matrices were accidentally stochastic.
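
As an illustration of that Wikipedia statement, here is a quick Octave check with an arbitrary made-up row-stochastic matrix (not one of my matrices):

  P = [0.9, 0.1; 0.4, 0.6];   % rows sum to 1, so P is row-stochastic
  eig(P)                      % one eigenvalue is exactly 1
  P^50                        % every row converges to the stationary distribution,
                              % i.e. the dominant left eigenvector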

Best Answer

The example matrix is: $$ M = \left( \begin{matrix} 3 & 2 & 6 \\ 2 & 2 & 5 \\ 6 & 5 & 4 \end{matrix} \right) $$

It has these eigenvalues and eigenvectors: $$ V = \left( \begin{matrix} 0.518736 & 0.647720 & 0.558007 \\ 0.462052 & -0.761559 & 0.454463 \\ -0.719320 & -0.022082 & 0.694328 \end{matrix} \right) \,\, \Lambda = \left( \begin{matrix} -3.53862 & 0 & 0 \\ 0 & 0.44394 & 0 \\ 0 & 0 & 12.09467 \end{matrix} \right) $$ so that $$ V^{-1} M V = \Lambda $$ The dominant eigenvalue $\lambda_1 = \Lambda_{33}$ is the last one and indeed the iteration given in the example seems to converge towards $v_{\lambda_1} = V e_3 = v_3$.

Note: I used GNU Octave for this calculation.

  M = [3,2,6;2,2,5;6,5,4]
  [V,D] = eig(M)        % eigenvectors as columns of V, eigenvalues on the diagonal of D
  inv(V) * M * V        % reproduces Lambda
  e1 = eye(3)(:,1)      % first standard basis vector

Further $$ \frac{M^{20}}{\min(M^{20})} = \left( \begin{matrix} \frac{1}{0.37013} v_{\lambda_1} & \frac{1}{0.45446} v_{\lambda_1} & \frac{1}{0.29746} v_{\lambda_1} \end{matrix} \right) $$ as you gave in your update.

  % Return the smallest element of the matrix A.
  function [ret] = mymin(A)
    s = size(A);
    m = A(1,1);
    for i=1:s(1)
      for j=1:s(2)
        a = A(i,j);
        if (a < m)
          m = a;
        endif
      endfor
    endfor
    ret = m;
  endfunction

  M^20/mymin(M^20)

So why do we see a multiple of $v_{\lambda_1}$?

I still believe the calculation of $M^N$, especially the $k$-th column $$ M^N e_k $$

is similar to a power iteration for $M$ with start vector $e_k$: $$ r_N = \frac{M^N e_k}{||M^N e_k||} \to v_{\lambda_1} \quad (\#) $$ the factor then relates to $||M^N e_k||$. Using $(\#)$ on term $(*)$ gives $$ \frac{M^N e_k}{\min(M^N)} \to \frac{||M^N e_k||}{\min(M^N)} v_{\lambda_1} = \frac{1}{\min(M^N) \, / \, ||M^N e_k||} v_{\lambda_1} $$

Doing the calculation: $$ \frac{\min(M^{20})}{||M^{20} e_1||_2} = \frac{(M^{20})_{22}}{||M^{20} e_1||_2} = \frac{9.2657e+20}{2.5034e+21} = 0.37013 \quad (\#\#) $$ voila. Using start vectors $e_2$ and $e_3$ with $(\#\#)$ gives the other two factors.
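
The same check in Octave, reusing mymin from above (only the example matrix and the Euclidean norm are assumed):

  M = [3,2,6;2,2,5;6,5,4];
  P = M^20;
  % min(M^20) / ||M^20 e_k||_2 for k = 1, 2, 3
  [ mymin(P)/norm(P(:,1)), mymin(P)/norm(P(:,2)), mymin(P)/norm(P(:,3)) ]
  % gives approximately 0.37013  0.45446  0.29746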

Update: It was asked for which matrices this behaviour occurs.

IMHO it works for those matrices $M$ for which $(\#)$ converges, and according to the Wikipedia article on power iteration (and the proof given there) that requires:

  1. $M$ has an eigenvalue that is strictly greater in magnitude than its other eigenvalues. This is the case for the example: $|\lambda_1| > |\Lambda_{11}| > |\Lambda_{22}|$.
  2. The starting vector $e_k$ has a nonzero component in the direction of an eigenvector associated with the dominant eigenvalue. True as well for the example, where the eigenvectors, especially the dominant $v_{\lambda_1}$, have no zero coordinates, so $e_k^T v_{\lambda_1} \ne 0$.
  3. $M$ should be diagonalizable. This is the case for the example because a symmetric matrix is diagonalizable. (A numerical check of all three conditions is sketched below.)
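
Here is a rough Octave sketch of that check for the example matrix (condition 2 is tested via the coordinates of the dominant eigenvector, condition 3 via the rank of the eigenvector matrix; this is only an illustration, not a robust test):

  M = [3,2,6;2,2,5;6,5,4];
  [V, D] = eig(M);
  lam = diag(D);

  % 1. a strictly dominant eigenvalue in magnitude
  [mags, idx] = sort(abs(lam), 'descend');
  condition1 = mags(1) > mags(2)

  % 2. every start vector e_k has a nonzero component along the dominant
  %    eigenvector, i.e. that eigenvector has no zero coordinates
  vd = V(:, idx(1));
  condition2 = all(abs(vd) > eps)

  % 3. M is diagonalizable (automatic for symmetric M); numerically,
  %    the eigenvector matrix should have full rank
  condition3 = rank(V) == columns(M)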

Update: To show that $(*)$ converges iff $(\#)$ converges, it would be helpful to show that $$ m_1 \min(A) \le ||A e_k|| \le m_2 \min(A) $$ The first constant is $m_1 = 1$. The second constant does not exist in general, since the left side is nonnegative while the minimum matrix element might be negative.

So $(\#) \le (*)$ and "$\Rightarrow$" holds.

The other direction might still be true, but I don't have proof for it.

Update: If $M$ fulfills the above three conditions, the power iteration $(\#)$ converges and we have: $$ ||M^N e_k|| \approx |w_{dk}| \, \left|\lambda_1\right|^N \quad (\#\#\#) $$ and $$ M^N e_k \approx w_{dk} \, \lambda_1^N v_{\lambda_1} \quad (\$) $$ where $V^{-1} = (w_{ij})$ is the inverse of the eigenvector matrix and $d$ is the column of the dominant eigenvector $v_{\lambda_1}$ in $V$.

This follows from the proof in the Wikipedia article on power iteration (see above). It is roughly: $$ \begin{align} M^N e_k &= M^N (c_1 v_1 + \cdots + c_n v_n) \\ &= c_1 \lambda_1^N v_1 + \cdots + c_n \lambda_n^N v_n \\ &= c_d \lambda_d^N v_d + \lambda_d^N \sum_{j\ne d} c_j \underbrace{\left(\frac{\lambda_j}{\lambda_d}\right)^N}_{\to 0} v_j \\ &\to c_d \lambda_d^N v_d \end{align} $$ $$ e_k = c_1 v_1 + \cdots + c_n v_n = V c \iff c = V^{-1} e_k \iff c_i = w_{ik} $$ $$ ||M^N e_k|| \to ||c_d \lambda_d^N v_{\lambda_d}|| = |w_{dk}| \, |\lambda_d|^N $$ where the last equality uses $||v_{\lambda_d}|| = 1$.
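
These approximations are easy to check numerically for the example, with $d = 3$, $k = 1$, $N = 20$ and $W = V^{-1} = (w_{ij})$:

  M = [3,2,6;2,2,5;6,5,4];
  [V, D] = eig(M);
  W = inv(V);                  % (w_ij)
  d = 3;  k = 1;  N = 20;      % dominant eigenvector is column 3 of V
  lam = D(d,d);
  ek = eye(3)(:,k);

  norm(M^N * ek)               % left side of (###), approx 2.5034e+21
  abs(W(d,k)) * abs(lam)^N     % right side of (###)

  M^N * ek                     % left side of ($)
  W(d,k) * lam^N * V(:,d)      % right side of ($)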

Equation $(\$)$ gives $$ M^N \approx \left( \begin{matrix} w_{d1} \lambda_1^N v_{\lambda_{1}} & w_{d2} \lambda_1^N v_{\lambda_{1}} & w_{d3} \lambda_1^N v_{\lambda_{1}} \end{matrix} \right) $$ which allows us to calculate $\min(M^N)$ directly: $$ \min(M^N) = \min_{i,j} w_{dj} \lambda_1^N v_{id} = \lambda_1^N \min_{i,j} v_{id} w_{dj} \quad (\$\$) $$ which implies that $$ \frac{M^N}{\min(M^N)} \to \left( \begin{matrix} \frac{1}{(\min_{i,j} v_{id} w_{dj}) \, / \, w_{d1}} v_{\lambda_{1}} & \frac{1}{(\min_{i,j} v_{id} w_{dj}) \, / \, w_{d2}} v_{\lambda_{1}} & \frac{1}{(\min_{i,j} v_{id} w_{dj}) \, / \, w_{d3}} v_{\lambda_{1}} \end{matrix} \right) \quad (\$\$\$) $$ So "$\Leftarrow$" holds as well, and we have that $(*)$ converges iff $(\#)$ converges.

For the example matrix this gives: The dominant eigenvector resides at column $d = 3$ in $V$. The inverse of $V$ is: $$ V^{-1} = \left( \begin{matrix} 0.518736 & 0.462052 & -0.719320 \\ 0.647720 & -0.761559 & -0.022082 \\ 0.558007 & 0.454463 & 0.694328 \end{matrix} \right) $$ note that $V^{-1} = V^T$ here, i.e. $V$ is orthogonal, because $M$ is real and symmetric. Then we have $$ (v_{i3}) = \left( \begin{matrix} 0.55801 \\ 0.45446 \\ 0.69433 \\ \end{matrix} \right) \quad (w_{3j}) = \left( \begin{matrix} 0.55801 & 0.45446 & 0.69433 \end{matrix} \right) $$ and taking all products $$ (v_{i3}) \, (w_{3j}) = \left( \begin{matrix} 0.31137 & 0.25359 & 0.38744 \\ 0.25359 & 0.20654 & 0.31555 \\ 0.38744 & 0.31555 & 0.48209 \end{matrix} \right) $$ the minimum is $0.20654$. This gives $$ \left( \frac{\min_{ij} v_{i3} \, w_{3j}}{w_{3k}} \right) = \left( \begin{matrix} 0.37013 & 0.45446 & 0.29746 \end{matrix} \right) $$
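
The same factor calculation in Octave (only the eigendecomposition and formula $(\$\$)$ are used):

  M = [3,2,6;2,2,5;6,5,4];
  [V, D] = eig(M);
  W = inv(V);                  % equals V' here, since V is orthogonal
  d = 3;                       % column of the dominant eigenvector in V

  outer = V(:,d) * W(d,:)      % all products v_{id} * w_{dj}
  m = min(outer(:))            % 0.20654, attained at row 2, column 2
  factors = m ./ W(d,:)        % approx 0.37013  0.45446  0.29746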

Update: An interesting property of $(\$\$)$ is that the minimization does not depend on the iteration step $N$. Or in other words, independent of $N$, the minimum is always picked from the same element position of the matrix, here the one at row $2$, column $2$.
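
A small Octave check of this observation (starting at $N = 2$, since for $N = 1$ the minimum value $2$ of $M$ itself is attained at more than one position):

  M = [3,2,6;2,2,5;6,5,4];
  for N = 2:20
    P = M^N;
    [~, idx] = min(P(:));              % linear index of the smallest element
    [i, j] = ind2sub(size(P), idx);
    printf("N = %2d : min of M^N at (%d, %d)\n", N, i, j);   % prints (2, 2) throughout
  endfor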