[Math] What are non-orthogonal eigenvectors

Tags: linear-algebra, matrices, numerical-linear-algebra, optimization, vector-spaces

Given a symmetric matrix $A$, the maximum of the trace $\operatorname{Tr}(Z^T A Z)$ subject to $Z^T Z = I$, where $Z \in \mathbb{R}^{n \times d}$, is attained when the columns of $Z$ are eigenvectors of $A$, in which case $\operatorname{Tr}(Z^T A Z) = \lambda_1 + \lambda_2 + \cdots + \lambda_d$, the sum of the $d$ largest eigenvalues. I know that the eigenvectors being the solution is a consequence of the Courant minimax principle.
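
For concreteness, here is a minimal NumPy sketch checking this identity numerically (the random symmetric $A$ and the sizes $n$, $d$ are arbitrary illustrative choices, not from any particular application):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary sizes and a random symmetric matrix, purely for illustration.
n, d = 6, 3
B = rng.standard_normal((n, n))
A = (B + B.T) / 2

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(A)

# Z = the d eigenvectors with the largest eigenvalues.
Z = eigvecs[:, -d:]

print(np.trace(Z.T @ A @ Z))   # sum of the d largest eigenvalues ...
print(eigvals[-d:].sum())      # ... matches, up to floating-point error
```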

q1) Now, I have faintly heard about non-orthogonal eigenvectors, and am very curious to know why they are called so. Is it because they form a basis of an eigenspace of $A$ while still being non-orthogonal? What is their relation, if any, to the Courant minimax / Courant–Fischer characterization?

q2) How do they fit into the trace maximization formulation given above? In particular, if the orthogonality constraint is dropped and no other constraint is added, doesn't the problem become unbounded or ill-posed? If so, what are these non-orthogonal eigenvectors optimizing?

q3) If the matrix $A$ is not symmetric, or is non-normal, then what are its non-orthogonal eigenvectors solving for? Or is this instead connected to an SVD in that case?

q4) When I Google for applications of non-orthogonal eigenvectors, I find nothing. What are their uses?

Please keep your answers within the matrix-algebra setting as much as possible, unless the question strictly requires other fields of mathematics.

Best Answer

q1) In the case of repeated eigenvalues, as you say, any basis of the repeated eigenspace could be chosen, even one that is not orthogonal. You could also be hearing about non-orthogonal eigenvectors in the context of nonsymmetric matrices, which generally do not have orthogonal eigenvectors. Having orthogonal eigenvectors is one of the nicest properties of symmetric matrices.
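
For example, here is a minimal NumPy sketch (the $2\times 2$ matrix is an arbitrary choice, nothing canonical) showing a nonsymmetric matrix whose eigenvectors are not orthogonal:

```python
import numpy as np

# A small nonsymmetric (but diagonalizable) matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])

eigvals, V = np.linalg.eig(A)   # columns of V are unit-norm eigenvectors

# The inner product of the two unit eigenvectors is nonzero
# (magnitude 1/sqrt(2) here), so they are not orthogonal.
print(abs(V[:, 0] @ V[:, 1]))
```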

q2) If you remove the condition $Z^TZ=I$, you can make the trace as large as you want: choose any $Z$ for which the trace is positive and scale $Z \rightarrow cZ$ by larger and larger constants $c$, which multiplies the objective by $c^2$. If instead the condition is relaxed so that the columns of $Z$ are merely unit vectors, the maximum occurs when all columns of $Z$ are identically equal to the dominant eigenvector (the eigenvector with the largest eigenvalue). If you additionally require the columns of $Z$ to be orthonormal, that is equivalent to the original constraint $Z^TZ=I$.
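
A quick numerical sketch of both points (the matrix is a random PSD choice so the dominant eigenvalue is guaranteed positive; sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 3
B = rng.standard_normal((n, n))
A = B @ B.T                     # symmetric PSD, so lam1 = eigvals[-1] > 0

eigvals, eigvecs = np.linalg.eigh(A)
v1 = eigvecs[:, -1]             # dominant eigenvector

# No constraint at all: scaling any Z with positive trace multiplies the
# objective by c**2, so the problem is unbounded.
for c in (1.0, 10.0, 100.0):
    Z = c * v1[:, None]
    print(np.trace(Z.T @ A @ Z))        # lam1 * c**2, grows without bound

# Unit-norm (but not orthogonal) columns: stacking d copies of v1 gives
# trace d * lam1, beating the orthonormal optimum lam1 + ... + lam_d.
Z = np.tile(v1[:, None], (1, d))
print(np.trace(Z.T @ A @ Z), eigvals[-d:].sum())
```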

q3) I'm not sure what this is asking; they are just the eigenvectors. Perhaps you are asking whether there is a variational characterization of the eigenvalues of nonsymmetric matrices? I don't know of one, but there might be. The singular values and vectors always have such a variational characterization, and it is related, but for nonsymmetric matrices the singular value decomposition is a different thing from the eigenvalue decomposition.
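
Here is a small sketch contrasting the two decompositions on a nonsymmetric matrix (again an arbitrary illustrative choice):

```python
import numpy as np

A = np.array([[2.0, 1.0],       # nonsymmetric example, for illustration only
              [0.0, 1.0]])

eigvals, V = np.linalg.eig(A)   # eigendecomposition: A V = V diag(eigvals)
U, s, Vt = np.linalg.svd(A)     # SVD: A = U diag(s) Vt

print(np.sort(eigvals))         # eigenvalues: 1, 2
print(np.sort(s))               # singular values: ~0.874, ~2.288 -- different

# The singular values do have a variational characterization:
# s[0] = max over unit vectors x, y of y^T A x, attained at the top
# singular vector pair.
print(s[0], U[:, 0] @ A @ Vt[0])
```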

q4) The eigenvalue decomposition, symmetric or not, is useful in more ways than I can list. It can be used when you want to apply the matrix again and again without doing a matrix product each time. The eigenvectors characterize the sensitivity of the matrix's action to different inputs: the dominant eigenvector is the most sensitive direction, and eigenvectors with smaller eigenvalues are correspondingly less sensitive. In basically any application where matrices appear, the eigenvalues and eigenvectors represent something important or interesting. In the non-orthogonal case, the minimum angle between eigenvectors characterizes how close the matrix of eigenvectors is to being singular, i.e., how ill-conditioned the eigenbasis is, which matters in many numerical methods.
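
As one concrete illustration of the first and last points (a sketch, reusing the same arbitrary toy matrix as above):

```python
import numpy as np

A = np.array([[2.0, 1.0],       # the same nonsymmetric toy matrix
              [0.0, 1.0]])

eigvals, V = np.linalg.eig(A)

# Repeated application: A^k = V diag(lam**k) V^{-1}, so computing powers
# costs one diagonal power plus two matrix products instead of k products.
k = 10
Ak = V @ np.diag(eigvals**k) @ np.linalg.inv(V)
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True

# cond(V) grows as the angle between eigenvectors shrinks; a huge cond(V)
# means the eigenbasis is nearly degenerate, which degrades many
# eigenvector-based numerical methods.
print(np.linalg.cond(V))
```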