The example matrix is:
$$
M =
\left(
\begin{matrix}
3 & 2 & 6 \\
2 & 2 & 5 \\
6 & 5 & 4
\end{matrix}
\right)
$$
It has these eigenvalues and eigenvectors:
$$
V =
\left(
\begin{matrix}
0.518736 & 0.647720 & 0.558007 \\
0.462052 & -0.761559 & 0.454463 \\
-0.719320 & -0.022082 & 0.694328
\end{matrix}
\right)
\,\,
\Lambda =
\left(
\begin{matrix}
-3.53862 & 0 & 0 \\
0 & 0.44394 & 0 \\
0 & 0 & 12.09467
\end{matrix}
\right)
$$
so that
$$
V^{-1} M V = \Lambda
$$
The dominant eigenvalue $\lambda_1 = \Lambda_{33} = 12.09467$ is the last one, and indeed the iteration given in the example seems to converge towards $v_{\lambda_1} = V e_3 = v_3$.
Note: I used GNU Octave for this calculation:
M = [3,2,6;2,2,5;6,5,4]
[V,D] = eig(M)     % columns of V are the eigenvectors, D is diagonal
inv(V) * M * V     % reproduces D, i.e. Lambda
e1 = eye(3)(:,1)   % first standard basis vector
Further
$$
\frac{M^{20}}{\min(M^{20})} \approx
\left(
\begin{matrix}
\frac{1}{0.37013} v_{\lambda_1} &
\frac{1}{0.45446} v_{\lambda_1} &
\frac{1}{0.29746} v_{\lambda_1}
\end{matrix}
\right)
$$
as you gave in your update.
% return the minimal element of matrix A
% (equivalent to the built-in min(A(:)))
function [ret] = mymin(A)
  s = size(A);
  m = A(1,1);
  for i = 1:s(1)
    for j = 1:s(2)
      a = A(i,j);
      if (a < m)
        m = a;
      endif
    endfor
  endfor
  ret = m;
endfunction
M^20 / mymin(M^20)   % reproduces the matrix above
So why do we see a multiple of $v_{\lambda_1}$?
I still believe the calculation of $M^N$, especially the $k$-th column
$$
M^N e_k
$$
is similar to a power iteration for $M$ with start vector $e_k$:
$$
r_N = \frac{M^N e_k}{||M^N e_k||} \to v_{\lambda_1} \quad (\#)
$$
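In Octave, $(\#)$ reads as a minimal sketch (the iteration count $20$ and the $2$-norm are my choices here):
% power iteration (#) for M with start vector e_k, here k = 1
r = eye(3)(:,1);
for n = 1:20
  r = M * r;
  r = r / norm(r);   % normalize in every step
endfor
r                    % approximates v_3 = V(:,3), up to sign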
The factor then relates to $||M^N e_k||$. Using $(\#)$ on term $(*)$ gives
$$
\frac{M^N e_k}{\min(M^N)} \to
\frac{||M^N e_k||}{\min(M^N)} v_{\lambda_1}
= \frac{1}{\min(M^N) \, / \, ||M^N e_k||} v_{\lambda_1}
$$
Doing the calculation:
$$
\frac{\min(M^{20})}{||M^{20} e_1||_2}
= \frac{(M^{20})_{22}}{||M^{20} e_1||_2}
= \frac{9.2657e+20}{2.5034e+21}
= 0.37013 \quad (\#\#)
$$
Voilà. Using start vectors $e_2$ and $e_3$ in $(\#\#)$ gives the other two factors.
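A quick check of $(\#\#)$ in Octave, reusing mymin from above:
% factors for all three start vectors e_1, e_2, e_3
for k = 1:3
  mymin(M^20) / norm(M^20 * eye(3)(:,k))
endfor
% prints 0.37013, 0.45446, 0.29746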
Update:
It was asked for which matrices this behaviour occurs.
IMHO it works for those matrices $M$ for which $(\#)$ converges, and according to the Wikipedia article on power iteration (and the proof given there) that requires the following (all three conditions are checked numerically after the list):
- $M$ has an eigenvalue that is strictly greater in magnitude than its other eigenvalues. This is the case for the example: $|\lambda_1| > |\Lambda_{11}| > |\Lambda_{22}|$, and
- the starting vector $e_k$ has a nonzero component in the direction of an eigenvector associated with the dominant eigenvalue. True as well for the example, where the eigenvectors, especially the dominant $v_{\lambda_1}$, have no zero coordinates, so $e_k^T v_{\lambda_1} \ne 0$.
- $M$ should be diagonalizable. This is the case for the example because a real symmetric matrix is diagonalizable.
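All three conditions can be checked numerically in Octave (a rough check only; rank is a numerical proxy for diagonalizability):
[V,D] = eig(M)
abs(diag(D))   % 3.5386, 0.4439, 12.0947: exactly one eigenvalue of largest magnitude
V(:,3)         % dominant eigenvector: no zero coordinates, so e_k' * V(:,3) != 0
rank(V)        % 3: the eigenvectors form a basis, M is diagonalizable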
Update:
To show that $(*)$ converges iff $(\#)$ converges, it would be helpful to show that
$$
m_1 \min(A) \le ||A e_k|| \le m_2 \min(A)
$$
The first constant is $m_1 = 1$. The second constant does not exist in general: the norm $||A e_k||$ is nonnegative, while the minimal matrix element $\min(A)$ may be negative.
So $(\#) \le (*)$ and "$\Rightarrow$" holds.
The other direction might still be true, but I don't have proof for it.
Update:
If $M$ fulfills the above three conditions, the power iteration $(\#)$ converges and we have:
$$
||M^N e_k|| \approx |w_{dk}| \, \left|\lambda_1\right|^N \quad (\#\#\#)
$$
and
$$
M^N e_k \approx w_{dk} \, \lambda_1^N v_{\lambda_1} \quad (\$)
$$
where $V^{-1} = (w_{ij})$ is the inverse of the eigenvector matrix and $d$ is the index of the column of $V$ that holds the dominant eigenvector $v_{\lambda_1}$.
This follows from the proof in the Wikipedia article on power iteration (see above).
It is roughly:
$$
\begin{align}
M^N e_k
&= M^N (c_1 v_1 + \cdots + c_n v_n) \\
&= (c_1 \lambda_1^N v_1 + \cdots + c_n \lambda_n^N v_n) \\
&= c_d \lambda_d^N v_d + \lambda_d^N\sum_{j\ne d} c_j
\underbrace{\left(\frac{\lambda_j}{\lambda_d}\right)^N}_{\to 0} v_j \\
&\to c_d \lambda_d^N v_d
\end{align}
$$
$$
e_k = c_1 v_1 + \cdots + c_n v_n =
V c \iff
c = V^{-1} e_k \iff c_i = w_{ik}
$$
$$
||M^N e_k|| \to ||c_d \lambda_d^N v_d|| = |c_d| \, |\lambda_d|^N = |w_{dk}| \, |\lambda_d|^N \qquad (\text{using } ||v_d|| = 1)
$$
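This can be verified in Octave for the example, with $d = 3$ and $k = 1$:
c = inv(V) * eye(3)(:,1);   % coefficients of e_1 in the eigenbasis, c_i = w_{i1}
norm(M^20 * eye(3)(:,1)) / (abs(c(3)) * D(3,3)^20)   % approximately 1, as (###) predicts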
Equation $(\$)$ gives
$$
M^N \approx
\left(
\begin{matrix}
w_{d1} \lambda_1^N v_{\lambda_{1}} &
w_{d2} \lambda_1^N v_{\lambda_{1}} &
w_{d3} \lambda_1^N v_{\lambda_{1}}
\end{matrix}
\right)
$$
This allows us to calculate $\min(M^N)$ directly (pulling $\lambda_1^N$ out of the minimum uses $\lambda_1 > 0$, which holds for the example):
$$
\min(M^N)
= \min_{i,j} w_{dj} \lambda_1^N v_{id}
= \lambda_1^N \min_{i,j} v_{id} w_{dj} \quad (\$\$)
$$
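A quick Octave check of $(\$\$)$ for $N = 20$, reusing V, D and mymin from above:
P = V(:,3) * inv(V)(3,:);   % outer product (v_{i3} w_{3j})
D(3,3)^20 * min(P(:))       % approximately mymin(M^20) = 9.2657e+20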
This implies that
$$
\frac{M^N}{\min(M^N)} \to
\left(
\begin{matrix}
\frac{1}{(\min_{i,j} v_{id} w_{dj}) \, / \, w_{d1}} v_{\lambda_{1}} &
\frac{1}{(\min_{i,j} v_{id} w_{dj}) \, / \, w_{d2}} v_{\lambda_{1}} &
\frac{1}{(\min_{i,j} v_{id} w_{dj}) \, / \, w_{d3}} v_{\lambda_{1}}
\end{matrix}
\right) \quad (\$\$\$)
$$
So "$\Leftarrow$" holds as well and we have
that $(*)$ converges iff $(\#)$ converges.
For the example matrix this gives:
The dominant eigenvector resides at column $d = 3$ in $V$. The inverse of $V$ is:
$$
V^{-1} =
\left(
\begin{matrix}
0.518736 & 0.462052 & -0.719320 \\
0.647720 & -0.761559 & -0.022082 \\
0.558007 & 0.454463 & 0.694328
\end{matrix}
\right)
$$
Note that $V^{-1} = V^T$ here, i.e. $V$ is orthogonal; this holds because $M$ is real and symmetric. Then we have
$$
(v_{i3}) =
\left(
\begin{matrix}
0.55801 \\
0.45446 \\
0.69433 \\
\end{matrix}
\right)
\quad
(w_{3j}) =
\left(
\begin{matrix}
0.55801 & 0.45446 & 0.69433
\end{matrix}
\right)
$$
and for all combinations
$$
(v_{i3}) \, (w_{3j})
=
\left(
\begin{matrix}
0.31137 & 0.25359 & 0.38744 \\
0.25359 & 0.20654 & 0.31555 \\
0.38744 & 0.31555 & 0.48209
\end{matrix}
\right)
$$
The minimum is $0.20654$. This gives
$$
\left(
\frac{\min_{ij} v_{i3} \, w_{3j}}{w_{3k}}
\right)
=
\left(
\begin{matrix}
0.37013 & 0.45446 & 0.29746
\end{matrix}
\right)
$$
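The same numbers in Octave:
w3 = inv(V)(3,:);    % row d = 3 of V^{-1}
P = V(:,3) * w3;     % outer product (v_{i3} w_{3j})
min(P(:)) ./ w3      % 0.37013  0.45446  0.29746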
Update:
An interesting property of $(\$\$)$ is that the position of the minimum does not depend on the iteration step $N$: independent of $N$, the minimum is always picked from the same element of the matrix, here the one at row $2$, column $2$ (checked numerically below).
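Checked in Octave:
% position of the minimal element of M^N for several N
for N = [2 5 10 20]
  A = M^N;
  [m, idx] = min(A(:));
  [i, j] = ind2sub(size(A), idx);
  printf("N = %2d: min at (%d,%d)\n", N, i, j)   % stays at (2,2) here
endfor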
Best Answer
The eigenvectors to the eigenvalues $20, -5, 10$ are the columns of $$ V=\pmatrix{ -2 & 1 & 1 \\ 1 & 2 & 2 \\ 2 & 4 & -1 } $$ and one can see that $x^{[0]}=\pmatrix{0\\0\\1}$ is one-fifth of the difference of the last two columns, $x^{[0]}=0.2(v_2-v_3)$. As it has no part in the first eigenspace, the power iteration will be constrained to the space spanned by the last two eigenvectors and converge towards the eigenvector of the largest eigenvalue in that space.
If you want to almost guarantee convergence towards the eigenspace of the largest eigenvalue, take a random vector as initial vector.
In the eigenbasis, the first coordinate is zero. You get $$ A^k x^{[0]} = 0 \cdot v_1 + 0.2 \cdot (-5)^k v_2 - 0.2 \cdot 10^k v_3 $$ which does not contain $v_1$, and thus the eigenvalue $20$, at any point. Now if you add numerical noise, that zero coefficient becomes a non-zero $\varepsilon \approx 2^{-50}$ and gets magnified with coefficient $(20/10)^k = 2^k$ in the normalized iterated vectors $$ 10^{-k} \cdot A^k x^{[0]} = 2^k \varepsilon \cdot v_1 + 0.2 \cdot (-0.5)^k v_2 - 0.2 \cdot v_3 $$ so that it builds up over about $50$ iterations (mantissa length $53$) to be equal in magnitude to the other coefficients, and eventually dominates and drives out all other coefficients after about $50$ further iterations, since for $k > 50$ $$ 2^{50} \cdot 20^{-k} \cdot A^k x^{[0]} = 2^{50} \varepsilon \cdot v_1 + 0.2 \cdot 0.5^{50} \cdot (-0.25)^{k-50} \cdot v_2 - 0.2 \cdot 0.5^{k-50} \cdot v_3. $$
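A short Octave experiment illustrating this drift (a sketch: $A$ is reconstructed from the eigenvalues and eigenvectors above, and $120$ iterations is an arbitrary choice beyond the roughly $100$ predicted; the exact onset depends on the machine arithmetic):
V = [-2 1 1; 1 2 2; 2 4 -1];
A = V * diag([20 -5 10]) * inv(V);
x = [0; 0; 1];
for k = 1:120
  x = A * x;
  x = x / norm(x);   % normalized power iteration
endfor
x' * A * x           % Rayleigh quotient: about 20, not 10, due to rounding noise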