[Math] Finding a similarity transform for a matrix that minimizes the (2-norm) condition number

linear algebramatrices

I'm working with matrices that have large condition numbers, and I was wondering if there's a way to find a similarity transform $B = PAP^{-1}$ such that $B$ has a smaller 2-norm condition number than $A$ (or better yet: find a matrix with the minimum condition number).

I don't think this is the case, but I can't seem to find any references showing that condition number is a similarity invariant.

For example (in Python):

>>> import numpy as np
>>>
>>> A1 = np.matrix([[1.2, -1],[3, 1]])
>>> H2 = np.matrix([[1, 0.5],[0.5, 1.0/3]])
>>> A2 = H2.I * A1 * H2
>>> A3 = H2 * A1 * H2.I
>>> for M in [A1, A2, A3]:
...     print np.linalg.cond(M,2)
...
2.57329845919
541.643656413
207.757091448

The matrices $A_1, A_2, A_3$ are all similar (using the Hilbert matrix as a similarity transformation) but have different condition numbers, so 2-norm condition number definitely isn't a similarity invariant, but I don't know how to go about reducing it.

Best Answer

Note that $||A||_2=\sigma_1$ where ${\sigma_1}$ is a greatest singular value of $A$. Moreover $|\lambda_1|\leq \sigma_1$ where $\lambda_1$ is an eigenvalue of $A$ with greatest modulus. Thus, if $B$ is similar to $A$ (over $\mathbb{C}$), then $||B||_2\geq |\lambda_1|$.

Assume that $A$ is diagonalizable (over $\mathbb{C}$): $A=PDP^{-1}$. Then $||D||_2=|\lambda_1|$, that is the minimum of the $2$-norm of a matrix that is similar to $A$.

EDIT. The general case. Proposition. Let $A\in M_n(\mathbb{C})$ and $\epsilon >0$. Then there is $B\in M_n(\mathbb{C})$ that is similar to $A$ and $\lambda_1\leq ||B||_2<\lambda_1+\epsilon$.

Proof. According to Jordan theory, there is a sequence $(B_k)_k$ s.t. $B_k$ is similar to $A$ and $\lim_k B_k=diag(\lambda_i)$. When $k\rightarrow \infty$, $||B_k||_2\rightarrow ||D||_2=|\lambda_1|$ and we are done.

Conjecture. (With the previous notations.) There is $B$ s.t. $||B||_2=|\lambda_1|$ iff $\lambda_1$ is a semisimple eigenvalue and is the unique eigenvalue with maximal modulus.

EDIT. Instead of removing the green chevron, you should think for a moment...

You can easily conclude about the condition number. $cond_2(A)=\sigma_1/\sigma_n$ and $\sigma_1\geq |\lambda_1|\geq |\lambda_n|\geq \sigma_n$. Then $cond_2(A)\geq |\lambda_1/\lambda_n|=cond_2(B)$ and we are done when $A$ is diagonalizable. Otherwise, by the reasoning above, for every $\epsilon >0$, there is $B$ s.t. $cond_2(B)<|\lambda_1/\lambda_n|+\epsilon$.